Find duplicate files in a directory

posted in code with 0 comments

When I photographed heavily/professionally, I was rigorous in how I handled my imported raw files, and master processed (PSD/XCF) files. I was much less rigorous in how I sorted and stored my processed JPG files, to the point that I’ve found several directories with anywhere between hundreds and thousands of images, some or many of them straight duplicates.

For the hell of it, and also because I haven’t touched C# since early 2013, I drew up a simple console application in C# to search for duplicate files in a given directory. I made a good start on it in Bash, but…fuck. Bash is slow and interacting with arrays in Bash leaves me wanting to murder somebody.

Order of the program:

  1. Check directory was provided. Check directory exists. Check it has more than one file.
  2. Get list of files in directory.
  3. Generate MD5 checksums for each given file.
  4. For each checksum:
    i. Check each file after this in the list to see if it has the same sum.
    ii. If a duplicate is found, check if it is on the recorded dupe list.
    iii. If it isn’t on the dupe list, add it.
  5. Run through the file list once for each dupe checksum. Print all file names with the same checksum.
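
Steps 4 and 5 compare every checksum against every later one. As a point of comparison, here is a minimal sketch of the same idea in JavaScript for Node.js that groups files by checksum in a single pass. It is not the program from this post: fileSum and findDupes are hypothetical names, and it reads each file fully into memory, which assumes the images are reasonably small.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

// Hypothetical helper: MD5 hex digest of one file's contents.
function fileSum(file) {
    return crypto.createHash('md5').update(fs.readFileSync(file)).digest('hex');
}

// Hypothetical helper: map each checksum to the files that share it,
// keeping only the checksums that were seen more than once.
function findDupes(dir) {
    const groups = new Map();

    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
        // Skip subdirectories and anything else that is not a regular file.
        if (!entry.isFile()) continue;

        const file = path.join(dir, entry.name);
        const sum = fileSum(file);

        if (!groups.has(sum)) groups.set(sum, []);
        groups.get(sum).push(file);
    }

    return new Map([...groups].filter(([, files]) => files.length > 1));
}
```

Calling findDupes on a directory returns a Map from checksum to the list of matching file paths, which can be printed in the same checksum-then-file-names layout as the output further down.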

I need to find more little projects like this in C#; it was fun to dust off what I knew.

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Security.Cryptography;

public class findDupes {
    static void Main(string[] args) {
        CheckBeforeProceeding(args);

        string[] files = Directory.GetFiles(args[0]);
        List<string> filesums = new List<string>();

        foreach (string file in files)
            filesums.Add(GetFileSum(file));

        List<string> dupes = SearchForDupes(filesums);
        PrintDupes(filesums, dupes, files);
    }

    static void PrintDupes(List<string> sums, List<string> dupes, string[] files) {
        // Print output.
        foreach (string dupe in dupes) {
            Console.WriteLine("{0}\n----------", dupe);

            for (int i = 0; i <= (files.Length - 1); i++)
                if (sums[i] == dupe)
                    Console.WriteLine(files[i]);

            Console.WriteLine();
        }
    }

    static List<string> SearchForDupes(List<string> sums) {
        // Search for duplicate files within the given list of sums.
        List<string> dupes = new List<string>();

        // The inner loop must reach the last sum, or the final file is never compared.
        for (int i = 0; i <= (sums.Count - 2); i++)
            for (int j = (i + 1); j <= (sums.Count - 1); j++)
                if (sums[i] == sums[j])
                    if (!dupes.Contains(sums[i]))
                        dupes.Add(sums[i]);

        return dupes;
    }

    static void CheckBeforeProceeding(string[] args) {
        // Check things are good with the target dir before proceeding.
        if (args.Length == 0) {
            Console.WriteLine("Error: No directory provided");
            Environment.Exit(1);
        }

        if (!Directory.Exists(args[0])) {
            Console.WriteLine("Error: '{0}' is not a valid directory", args[0]);
            Environment.Exit(2);
        } 

        if (Directory.GetFiles(args[0]).Length == 0) {
            Console.WriteLine("Error: '{0}' does not contain any files", args[0]);
            Environment.Exit(3);
        }

        if (Directory.GetFiles(args[0]).Length == 1) {
            Console.WriteLine("Error: '{0}' only contains 1 file", args[0]);
            Environment.Exit(3);
        }
    }

    static string GetFileSum(string file) {
        // Function scalped from http://stackoverflow.com/a/10520086/1433400
        using (var sum = MD5.Create())
            using (var stream = File.OpenRead(file))
                return BitConverter.ToString(sum.ComputeHash(stream)).Replace("-","").ToLower();
    }
}

Here is some example output:

[mark][new_instagram] $ ~/dupe_find.exe .
d89b812d61bb41d037b1e6704d146d11
----------
./1.jpg
./17.jpg

4c6cc2794010fe3b018038b0e1162744
----------
./132.jpg
./312.jpg

...

Neater output from the program is left as an exercise to the reader.

by Mark -

I feel strangely proud about my first recursive function

posted in code with 0 comments

I need to move the bottom-most (childless) divs in a given set as part of a parallax effect, so I recurse down through each div’s children until I hit bottom.

function left(amount, obj) {
    $(obj).children().each(function() {
        if ($(this).children().length > 0) {
            left(amount, this);
        } else {
            $(this).css('left', parseInt($(this).css('left'), 10) - amount + 'px');
            wrap(this);
        }
    });
}

function wrap(obj) {
    var x = $(obj).offset().left;
    var y = $(obj).offset().top;
    var w = $(obj).width();
    var h = $(obj).height();

    if (y + h < 0) {
        $(obj).css('top', $(window).height() + 'px');
    } else if (x + w < 0) {
        $(obj).css('left', $(window).width() + 'px'); 
    } else if (x > $(window).width()) {
        $(obj).css('left', 0 - w + 'px');
    } else if (y > $(window).height()) {
        $(obj).css('top', 0 - h + 'px');
    }
}
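
The same descent can be tried without jQuery or a page to run it on. This is only a sketch: it assumes nodes are plain objects with an optional children array, eachLeaf is a hypothetical name, and unlike left() above it also treats the starting node as a leaf if it has no children.

```javascript
// Hypothetical stand-in for the DOM: a node is a plain object, and a node
// with no (or an empty) "children" array counts as a leaf.
function eachLeaf(node, fn) {
    if (node.children && node.children.length > 0) {
        // Not at the bottom yet: recurse into every child.
        node.children.forEach(function (child) {
            eachLeaf(child, fn);
        });
    } else {
        // Bottom reached: act on the leaf itself.
        fn(node);
    }
}

const tree = {
    children: [
        { left: 100 },
        { children: [{ left: 40 }, { left: 10, children: [] }] }
    ]
};

// Shift every leaf 25 units to the left, as left(25, obj) would.
eachLeaf(tree, function (leaf) {
    leaf.left -= 25;
});
```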

/toot

by Mark -

HOWTO: Wall of photographs in jQuery/HTML/CSS

posted in code with 0 comments

Gallery in action

The other half of the Funcan theme was a flexible photo gallery. Funcan was due at the start of March, and the gallery now at the end, and while I take a break for food I want to jot down some quick notes on progress so far: I kicked around a few different looks, given the simple brief that

  1. The gallery’s appearance must fit with the Funcan theme.
  2. The gallery should be able to be inserted into any given container, and Just Work™.

and truth be told, most of them looked awful, either trite, slow, or overwrought, so I decided to steal from those who have done better (I know, shut up). I am not the first person who has wanted to steal Google or Flickr’s wall of photographs. Some quick investigation showed me that the algorithm used is fairly simple. Implementations can be and are complex, but in theory anyone can understand how the wall of photographs works in under five minutes.

There are a bunch of other well-indexed blog posts that cover the algorithm in greater or lesser depth. I’ve chosen not to link any because the authors invariably talk about either how elegant their code is, or how smart they are for their understanding of the deeper math involved.

My gripes aside, I do appreciate the posts, but I have never been able to grep as much detail as they push. Not in one go, at least; I need bite-sized chunks, so I sat down with pen and paper, and worked out that for each given row of images:

  1. First make each image the same height, while you preserve proportions. This can be done via CSS.
  2. Find the total length of the row (add the width of each image together).
  3. Divide the desired width (in this case the .gallery div) by the length of the row to get the ratio.
  4. Multiply the height of each individual image by this ratio.

You can find a live demonstration on my sandbox or fork the code from Github. Enjoy!
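
To make the arithmetic concrete, here is a minimal sketch with made-up numbers and no DOM involved. fitRow is a hypothetical helper, the images are plain {width, height} objects that already share one height, and the ratio is taken as the target width divided by the row width, which is how updateGallery in the code below computes it.

```javascript
// Hypothetical helper: scale a row of images, given as {width, height}
// objects that already share one height, so the row fills targetWidth.
function fitRow(images, targetWidth) {
    // Total length of the row: the widths of all images added together.
    const rowWidth = images.reduce((sum, img) => sum + img.width, 0);
    // Ratio between the width we want and the width we have.
    const ratio = targetWidth / rowWidth;

    // Scaling both sides by the same ratio preserves each image's proportions.
    return images.map((img) => ({
        width: img.width * ratio,
        height: img.height * ratio
    }));
}

// Three images, 200px tall, 800px wide in total, fitted to a 1000px gallery.
// The ratio is 1000 / 800 = 1.25, so every image ends up 250px tall.
const row = fitRow([
    { width: 300, height: 200 },
    { width: 150, height: 200 },
    { width: 350, height: 200 }
], 1000);
```

In the browser only the height needs to be set, because an image whose width is left automatic keeps its proportions. The sketch scales both values just to show that the widths add back up to the target.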

// Gallery, row, and img classes.
// UPDATE GALLERY.CSS IF YOU CHANGE THESE!!!!
var customClass = '.funcan';
var galleryClass = customClass + '-gallery';
var rowClass = customClass + '-row';
var lightboxClass = customClass + '-lightbox';
// Lightbox div elements.
var lightboxElements = [
    lightboxClass + '-close', 
    lightboxClass + '-nav', 
    lightboxClass + '-txt', 
    lightboxClass + '-img'
];
// Lightbox navigation/text div elements.
var lightboxNavigation = ['-left', '-right'];
// Row lengths will inclusively range between min and max.
var rowSizeMin = 4;
var rowSizeMax = 6;
// For mobile views. Fixed width.
var rowSizeTiny = 3;
// Break point before switching to mobile view.
var mobileSize = 880;
// Include image alt text as anchor title?
var altAsTitle = false;
// Debug. Replace anchor hyperlinks with "javascript:void(0)".
var voidHref = true;
// Two-dimensional array of all gallery images on this page.
var galleryImages = [];
// Current lightbox gallery.
var clg = 0;
// Current lightbox image.
var cli = 0;

function addAnchor(obj, addTitle) {
    // Turn static image into a clickable hyperlink. 
    $(obj).children('img').each(function() {
        if (!$(this).parent().is('a')) {
            // Hyperlink and title for anchor.
            var href  = (voidHref === true) ? 'href="javascript:void(0)" ' : 'href="' + $(this).attr('src') + '" ';
            var title = (addTitle === true) ? ' title="' + $(this).attr('alt') + '" ' : ' ';
            $(this).wrap('<a ' + href  + title + ' ></a>');
        }
    });
}

function addGalleryID(obj) {
    // Give each gallery a unique ID.
    var element = 0;

    $(obj).each(function() {
        $(this).attr('id', element++);
    }); 
}

function addRow(obj) {
    // Append a row to gallery.
    // <div class="funcan-row"></div>
    var row = rowClass.substring(1);
    $(obj).append('<div class="' + row + '"></div>');
}

function rangedRandom(min, max) {
    // Return a random integer between min and max values, inclusive.
    // The + 1 is what makes max itself reachable.
    return Math.floor(Math.random() * (max - min + 1) + min);
}

function addGalleryRows(obj) {
    // Create an array of images, loop through, adding images to rows.
    // 1. Add a row.
    // 2. Insert images until it has between rowSizeMin and rowSizeMax.
    // 3. Insert new row and repeat.
    // 4. Don't insert a new row if only 1 or 2 images are left.
    var imgArr = $(obj).children('img').toArray();
    addRow(obj);

    $(imgArr).each(function (i) {
        // Smaller row size on smaller screens.
        var rowLength = ($(window).width() > mobileSize) ? rangedRandom(rowSizeMin, rowSizeMax) : rowSizeTiny;
        $(rowClass).last().append(imgArr[i]); 

        // Add a new row if the length exceeds our quasi-random size.
        // Do not add a new row if we are at the end of the array and only 1 or 2 images remain.
        // A single-image row is ugly and a thing to be avoided.
        if ($(rowClass).last().children().length >= rowLength && (imgArr.length - 1 - i) >= 2) {
            addRow(obj);
        }
    });
}

function vertCenter(obj) {
    // Vertically centers the given object on screen.
    $(obj).css('margin-top', $(window).height() * 0.5 - $(obj).height()  * 0.5);
}

function getRowWidth(obj) {
    // Sum the width of a given gallery row by measuring the outerWidth 
    // of each image.
    // $('.row').width() returns an incorrect value.
    var sum = 0;

    $(obj).children('img').each(function () {
        sum += $(this).outerWidth();
    });

    return sum;
}

function updateGallery(obj) {
    // Resizes each row of images such as to evenly space their width and height. 
    var n = 0;
    $(obj).children(rowClass).each(function () {
        // Get total width of row through width of component images.
        var sum = getRowWidth(this);
        // Ratio between gallery width, and row width.
        var ratio = parseFloat($(obj).width() / sum);

        $(this).children('img').each(function() {
            // Change the height of the image by the ratio.
            var changedHeight = Math.round($(this).height() * ratio);
            $(this).css('height', changedHeight + 'px');
            $(this).attr('class', n++);
            $(this).fadeIn(2000);
        });

        // Rounding errors leave a small margin on the right side of the gallery.
        // Each row should ideally be (parent.width() - 1px).
        // Otherwise each row will be 1-2px too wide, which causes wrapping.
        var diff = getRowWidth(this) - ($(obj).width() - 1);
        // Resize the last image in line to make it all fit.
        // A smaller row is made slightly larger and vice versa.
        $(this).children('img').last().css('width', $(this).children('img').last().width() - diff + 'px');

        // Add anchor element.
        addAnchor(this, altAsTitle);
    });
}

function addLightbox(obj) {
    // Lightbox should be prepended to <body> in order to avoid conflicts with other CSS.
    var divOpen  = '<div class="';
    var divClose = '"></div>';

    // Attach lightbox to obj.
    $(obj).prepend(divOpen + lightboxClass.substring(1) + '">');

    $(lightboxElements).each(function(i, e) {
        // Attach all child elements to the lightbox. 
        $(lightboxClass).append(divOpen + e.substring(1) + divClose);
    });

    // Close button.
    $(lightboxElements[0]).append('<a href="javascript:void(0)">X</a>');

    $(lightboxNavigation).each(function(i, e) {
        // Navigation elements.
        $(lightboxElements[1]).append(divOpen + lightboxElements[1].substring(1) + e + divClose);
        $(lightboxElements[2]).append(divOpen + lightboxElements[2].substring(1) + e + divClose);
    });

    $(lightboxNavigation).each(function(i, e) {
        // Paragraph elements for text.
        var arrow = (i === 0) ? '<' : '>';
        $(lightboxElements[2] + e).append('<p>');
        $(lightboxElements[1] + e).append('<a href="javascript:void(0)">' + arrow + '</a>');
    });

    // Image div.
    $(lightboxElements[3]).append('<img src=" " alt=" " />');
}

function positionLightbox() {
    $(lightboxClass).css('height', $(window).height() + 'px');
    $(lightboxElements[0]).css('margin-left', $(window).width() - $(lightboxElements[0]).width() + 'px');
    $(lightboxElements[2]).css('margin-top', $(window).height() * 0.9 + 'px');
    $(lightboxElements[2] + ' p').css('line-height', $(window).height() * 0.1 + 'px');
}

function shrinkLightboxImage() {
    // Firefox and Internet Explorer ignore max-width and max-height.
    // Unless the page has explicit dimensions set.
    var img = $(lightboxElements[3] + ' img');

    if (img.width() >= $(window).width() || img.height() >= $(window).height()) {
        var wd = img.width()  - $(window).width();
        var hd = img.height() - $(window).height();

        if (wd >= hd) {
            img.css('max-width', $(window).width() * 0.97);
            img.css('height', 'auto');
        } else {
            img.css('max-height', $(window).height() * 0.97);
            img.css('width', 'auto');
        }
    }
}

function setLightboxImage(imgSrc) {
    var img = $(lightboxElements[3] + ' img');
    img.attr('src', imgSrc);

    img.load(function() {
        // Have to wait for image to load before I center it.
        // Get 0 width/height otherwise.
        shrinkLightboxImage();
        vertCenter(this);
    });
}

function setLightboxText(txt) {
    // Set alt text display.
    var box = $(lightboxElements[2] + lightboxNavigation[0]);
    box.empty();
    box.append('<p>' + txt + '</p>');
}

function setLightboxCount(current, total) {
    // Set current / total count.
    var box = $(lightboxElements[2] + lightboxNavigation[1] + ' p');
    box.text(current + '/' + total);
}

function updateLightbox(obj) {
    // Pass image to this.
    setLightboxImage(obj.src);
    setLightboxText($(obj).attr('alt'));
    setLightboxCount(parseInt($(obj).attr('class')) + 1, galleryImages[clg].length);
    positionLightbox();
}

function decrementLightboxImage() {
    cli -= (cli <= 0) ? 0 : 1; 
    updateLightbox(galleryImages[clg][cli]);
}

function incrementLightboxImage() {
    cli += (cli >= galleryImages[clg].length - 1) ? 0 : 1; 
    updateLightbox(galleryImages[clg][cli]);
}

$(window).load(function() {
    // Gallery load events.
    if ($(galleryClass).length > 0) {
        addGalleryID(galleryClass);

        $(galleryClass).each(function() { 
            var tmp = [];

            $(this).children('img').each(function() {
                tmp.push(this);
            });

            galleryImages.push(tmp);
            addGalleryRows(this);
            updateGallery(this);
        });

        // Lightbox load events.
        addLightbox('body');
        updateLightbox(galleryImages[clg][cli]);
    }
});

$(window).load(function() {
    // Mouse click and keypress events.
    $(galleryClass + ' img').click(function() {
        clg = parseInt($(this).closest(galleryClass).attr('id'));
        cli = parseInt($(this).attr('class'));
        updateLightbox(galleryImages[clg][cli]);
        $(lightboxClass).toggle(); 
    });

    $(lightboxElements[0] + ' a').click(function() {
        $(lightboxClass).toggle(); 
    });

    $(lightboxElements[1] + lightboxNavigation[0]).click(function() {
        decrementLightboxImage();
    });

    $(lightboxElements[1] + lightboxNavigation[1]).click(function() {
        incrementLightboxImage();
    });

    $('body').keyup(function(key) {
        switch (key.keyCode) {
            case 27: $(lightboxClass).toggle(); break;
            case 37: decrementLightboxImage(); break;
            case 39: incrementLightboxImage(); break;
            default: break;
        }
    });
});

by Mark -

Funcan

posted in code with 0 comments

Although his site is offline at this exact moment in time, 091 Labs’s Duncan Thomas asked me to write the new theme for his site. “Funcan” is a dark, responsive theme inspired by terminal output on a Linux system. There are strong contrasts between the dark background and bright text, modern aesthetics in the header and layout, and a fluid and responsive design. You can play with the theme temporarily on Peppermint, fork the repo on Github, or just gaze slackly in astonishment at the screenshots:

Funcan theme on the desktop
Funcan theme on iOS

by Mark -

Linux course post-mortem

posted in code with 1 comment

There was a dearth of classes going on at 091 Labs over the summer, so I stepped up to the plate at the start of August and advertised a Linux programming class, spread over four weeks: Learn the fundamentals of programming through the Bourne Again Shell, and all its queerness and oddities of syntax. The first class was held on August 19 in the hackerspace with the three successive classes each held on the following Monday. The crowd was small, and only shrank, but compared to some classes I’ve attended and run, they were pretty attentive.

I, me, myself, I too learned a lot from this class, both in and out of Bash: Why certain things were handled by Bash in a certain manner (subshells come to mind; I’ve been using them since day one, but never really thought about how they worked); how to perform tasks I thought difficult or weird (advanced awk and sed operations); how to parse the material for the class in an understandable manner; and odd as it is to admit, how to make eye contact and just interact in a relaxed and confident manner, given how long I’ve been avoiding just this.

The Bash part of the learning was fun, and I can comfortably say that I came out of each week with a better understanding than I did going in. The case structure, variables, oddities, efficiently testing items, parsing integers, math operations, and even some binary operations too were all items I picked up from basically scratch.

I’ve also built up a good body of material that I can just turn around and give to someone who wants to learn the Bourne shell, as simply as just sharing the Google Drive folder with them. The slides are concise if sparse, and cover the fundamental operations without delving into anything Linux-specific: You can follow through the slides and examples through Terminal on OS X, or Cygwin on Windows.

On the classes themselves, I learned some things: A Facebook event is where workshops go to die; Facebook users are as conditioned to ignore event invitations as they are blinking advertisements. The 091labs.com blog post, word of mouth through members and friends, and repeated postings on the 091 Labs Twitter and Facebook accounts were what brought people in the door.

I did some things right, and others wrong:

  • I should have had printed handouts with exercises and solutions available.
  • A split-screen terminal (Terminator) was excellent: I could simultaneously show creation and execution without having to either tab away or change desktop.
  • Offering Cygwin and SSH access was a fantastic idea. It saved me from having to mess with people installing Linux for the first time.
  • Problems with Windows’ line-endings cost me time and frustration. Everything seems coded okay, but then you run it and whoops. Ditto this and people deciding to try vim despite my stern warnings away from it.
  • RE the above: Next time I need to mandate a good text editor for Windows and OSX, such as Sublime Text. If you use Notepad or Wordpad or TextEdit, or any kind of rich text editor for that matter, you deserve to be shunned.
  • Relying on slides for the first week of solutions limited me and only served to fluff out the slideshow. Coding and uploading solutions separately allowed me to append notes at length.

The final workshop is tomorrow, and instead of a segue into C#, I will probably stick with more real-world examples of scripts, give more exercises, and invite my students to offer up problems for me to handle programmatically.

In the end it was fun, the money brought in helped the hackerspace, and I can port this material to other programming languages (C#, probably) with minimal effort. B++, would run again.

by Mark -

Round down a decimal of arbitrary precision to n.

posted in code with 0 comments

For Darren. :D

using System;

public class DecimalRound {
	static void Main(string[] args) {
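		// args[0] is the zero-based index of the last decimal place to keep.
		// args[1] is the decimal to round.
		if (args.Length < 2)
			Fail("Argument list", true);
		int sentinel = 0;
		// 1. Split the number. decimalStr[0] is the leading number.
		string[] decimalStr = args[1].Split('.');
		if (decimalStr.Length != 2)
			Fail(args[1], true);
		// 2. decimalStr[1] is the decimal.
		int[] decimalInt = new int[decimalStr[1].Length];
		// 3. Parse sentinel value. It must point at an existing decimal place.
		if (!int.TryParse(args[0], out sentinel) || sentinel < 0 || sentinel >= decimalInt.Length)
			Fail(args[0], true);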
 
		// Convert the decimals characters to ints, after validation.
		int n = 0;
		foreach (char a in decimalStr[1]) {
			if (ValidChar(a))
				decimalInt[n] = CharToInt(a);
			else
				Fail(args[1], true);

			n++;
		}

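		// Only the first dropped digit decides: 0-4, round down. 5-9, round up.
		bool carry = (sentinel + 1 < decimalInt.Length) && (decimalInt[sentinel + 1] >= 5);

		// Rounding up adds one to the last kept digit and carries leftwards past any 9s.
		for (int i = sentinel; (i >= 0) && carry; i--) {
			decimalInt[i]++;
			carry = (decimalInt[i] == 10);
			if (carry)
				decimalInt[i] = 0;
		}

		// A carry out of the first decimal place bumps the leading number.
		char[] leading = decimalStr[0].ToCharArray();
		int k = leading.Length - 1;
		for (; (k >= 0) && carry && ValidChar(leading[k]); k--) {
			carry = (leading[k] == '9');
			leading[k] = carry ? '0' : (char)(leading[k] + 1);
		}
		string whole = new string(leading);
		if (carry)
			whole = whole.Insert(k + 1, "1");

		// Output. 
		Console.Write(whole + ".");
		for (int i = 0; i <= sentinel; i++)
			Console.Write(decimalInt[i]);
		Console.WriteLine();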
	}

	static bool ValidChar(char a) {
		// Validates that it is 0-9, and not any other character value.
		switch (a) {
			case '0': return true;
			case '1': return true;
			case '2': return true;
			case '3': return true;
			case '4': return true;
			case '5': return true;
			case '6': return true;
			case '7': return true;
			case '8': return true;
			case '9': return true;
		}

		return false;
	}

	static int CharToInt(char a) {
		// Converts, if valid. 
		return Convert.ToInt32(a - '0');
	}

	static void Fail(string a, bool b) {
		// Spit out error message if you pass invalid characters.
		// b = true if this is fatal.
		Console.WriteLine("{0} is not valid.", a);

		if (b)
			Environment.Exit(1);
	}
}

by Mark -

[MUD] Muddy mudness

posted in code, me with 0 comments

I’ve thrown a lot of idle thoughts at the concept of the MUD project, and it keeps coming back to me that I would be literally out of my depth. I have, to date, scarcely finished one barely-working shooter, let alone delved into the intricacies of a graphical multiplayer role-playing game.

I am going to throw myself into an easier intermediate project, a Linux arena shooter game. The core principles of the arena game will carry over to the MUD:

  • Peer-to-peer and client-server multiplayer connectivity.
  • Persistent characters, and player progression.
  • AI pathing.
  • And more…

My first steps are to finish the design document and figure out the Monogame networking API. :)

by Mark -

School update

posted in college with 0 comments

I wrote code for the first time since last summer, and with coding come some fairly hard choices about what to do with myself after the summer. I’ve burned a lot of bridges over the last year, burned them very well indeed. College, over the past year, has been a complete loss.

I’ve been on time and in class for less than ten days out of ~130 of the two semesters gone, and I have zero idea of what the content of the year gone has been, although from those days that I have been in, it doesn’t seem ridiculously advanced; call any difficulty to be that of quantity, not quality.

Now, the choices:

  1. Write the last year off as a loss to depression, scrape up ~€1300 between now and Hallowe’en. I have some web work to finish, and could probably canvass more, so this isn’t as ridiculous a proposition as it seems.
  2. Cut my losses, admit academic defeat, and seek work in Galway. I’m passably good at Adobe Photoshop, HTML, CSS, and design, and I plan to move back to Galway for the summer anyway come May.

There isn’t an easy choice either way. Sligo is soul-crushingly boring, completely lacks in any social outlet for an introvert who doesn’t like pubs, and has marginally fewer amenities than, say, an arbitrary square kilometre of the Mojave Desert; the course material at college is shallow, full of gaping holes (see: my rants about the math from last September) and irrelevant crud, but it is a foundation I still absolutely need if I am going to do any work with code.

On the other hand, Sligo is a ridiculously beautiful county with plenty of outdoor activities.

There’s an excellent resource in Galway in the form of 091 Labs and its eclectic cadre of unwashed nerds, but that isn’t a substitute for Real Learning™.

In short, Sligo doesn’t have a lot that I want, but it has everything I need.

by Mark -

[Bash] Batch renaming files

posted in code with 0 comments

Despite (or perhaps because of) the fact I’ve been using Linux for more than a decade, I’ve done a great job of avoiding sed. I have done my best to make up for it in the recent past by learning how to at least find and replace, either within a file or in a filename. Here is how you batch rename files using sed in a Bash shell.

To rename all *.txt files in a directory (in this case, to replace any spaces with _):

for file in *.txt; do mv "$file" "$(echo "$file" | sed -e 's/ /_/g')"; done

To unpack:

for file in *.txt

For each text (.txt) file in this directory.

do mv "$file"

Move the file from its current name to

"$(echo "$file" | sed -e 's/ /_/g')"

this. In this second part, we echo the name of the file and pipe the name into sed. The syntax for find and replace in sed is: sed -e 's/foo/bar/g' where foo is the old string and bar is the new string. You echo the file name, and simply replace the “ ” with “_”. The inner "$file" is quoted so that runs of spaces reach sed intact instead of being collapsed by the shell.

I sincerely hope this is useful to somebody, somewhere.

by Mark -