Better s3cmd put

s3_put.sh at work

I’ve been using s3cmd to back up the contents of an external disk to Amazon’s S3 service, and honestly, s3cmd is kinda crap. It’s the most accessible tool for S3, and it’s free, but it’s kinda crap. My biggest problem arises when I start, stop, and then resume an upload: already-uploaded files get uploaded again, which means I have to either start from scratch or manually (and painfully) exclude already-archived files. sync should skip existing files, but in practice it simply does not, and put randomly skips files that I can upload fine individually.
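
For reference, these are roughly the invocations I mean; the bucket name and disk paths below are placeholders, not my real ones:

s3cmd sync /media/external/ s3://my-backups/
s3cmd put --recursive /media/external/Photos s3://my-backups/

The first re-uploaded already-existing files on every resume, and the second skipped files at random.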

This script takes a bucket and one or more folders, descends recursively into each given folder, and individually checks whether each file already exists in the bucket before uploading it. It isn’t the best solution if you want to put files into a deep folder on the bucket, because it was written for my case: I want all of the passed folders to wind up in the root of the bucket.
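
A hypothetical invocation, assuming the script is saved as s3_put.sh (the bucket and folder names here are made up):

./s3_put.sh s3://my-backups Photos Music

Photos and Music each wind up directly under s3://my-backups/, with every file checked and uploaded one at a time.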

Update: I’ve had this running for about three days now, and given all of the problems I had in the past, my wrapper works amazingly well. s3cmd put skipped every other folder, and s3cmd sync stoically restarted and uploaded existing files every time I ran it.

#!/bin/bash
# Call as: s3_put.sh <s3://bucket> <directory> [directory ...]
# Email mark@bhalash.com for help.

print_time() {
    # Print a timestamp without a trailing newline, so the next message
    # appears on the same line.
    echo -n "$(date +%H:%M:%S) "
}

check_deps() {
    # Check for missing dependencies and exit if any aren't found.
    deps=("/usr/bin/s3cmd")

    dep_missing=0
    for dep in "${deps[@]}"; do
        if [ ! -x "$dep" ]; then
            print_time
            echo "ERROR: '$dep' not found"
            dep_missing=1
        fi
    done

    if [[ $dep_missing == 1 ]]; then
        exit 1
    fi
}

check_bucket_exists() {
    # Test for the existence of the given bucket, and exit if it isn't found.
    if ! s3cmd ls | grep -q "$1"; then
        print_time
        echo "ERROR: Bucket '$1' not found"
        exit 1
    fi
}

s3put() {
    # Test for the existence of $file in the remote bucket, and upload it if
    # it doesn't exist. s3cmd's own sync/put kept uploading already-existing
    # files.
    temp="/tmp/s3_$(date +%Y%m%d%H%M%S%N)"

    cd "$1" || return 1
    cur_dir=$(basename "$PWD")
    # s3cmd ls prefixes each object URL with date and size columns; the sed
    # strips everything up to the last run of spaces, leaving the URL.
    s3cmd ls -r "$2/$cur_dir" | sed -e 's/^.*  //g' > "$temp"

    find . -type f | sed -e 's/^\.\///g' | while read -r file; do
        if ! grep -F "$file" "$temp" > /dev/null 2>&1; then
            print_time
            # s3cmd generates excessively verbose output, so I parse it.
            s3cmd --disable-multipart put "$file" "$2/$cur_dir/$file" 2>&1 \
            | grep -v "WARNING" | sed -e 's/ \[.*\]//g'
        else
            print_time
            echo "File '$file' is already stored as '$2/$cur_dir/$file'"
        fi
    done

    rm -f "$temp"
}

check_deps

if [[ $# -lt 2 ]]; then
    print_time
    echo "ERROR: Need a bucket and at least one directory"
    exit 1
fi

check_bucket_exists "$1"
bucket=$1
shift

for dir in "$@"; do
    if [ -d "$dir" ]; then
        # Run in a subshell so the cd inside s3put doesn't leak into the
        # next iteration.
        (s3put "$dir" "$bucket")
    fi
done

exit 0
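
The sed in s3put leans on the shape of s3cmd ls -r output, which, roughly and from memory (the names below are made up), prints a date, a size, and then the object URL, separated by runs of spaces:

2012-05-04 18:22   1048576   s3://my-backups/Photos/IMG_0001.jpg
2012-05-04 18:23    524288   s3://my-backups/Photos/IMG_0002.jpg

Stripping everything up to the last run of two spaces leaves just the s3:// URL, which is what the grep in the upload loop matches each local file against.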


Is the hackerspace open?

This started as my own tongue-in-cheek take on 091 Labs’s Lo-lo project, an open/closed state indicator for the hackerspace. It has the dangerous potential to become something moderately more obsessive, because the four lines of jQuery at its core don’t fucking work in Internet Explorer (shocker). The page and the CSS validate perfectly. The JavaScript I am less sure about, although I am reassured by the fact that no browser throws up any errors.

Incidentally, Firefox has the smoothest font rendering.


Ubuntu died

I did an in-place do-release-upgrade yesterday. The laptop’s power cable wasn’t fully plugged in, and the battery ran out when the upgrade process was about 75% complete. Since then, things have been…very interesting, thanks to the lack of 64-bit recovery media in the hackerspace.