The Clonus3 Horror

Parts: the Clonus Horror was an abomination. Clonus3: My Backup Script isn’t too bad.

So, I’ve moved all the media for 8bitb.us (user-facing and admin) to Amazon’s S3 service, and it’s made a huge difference. The poor Apache instance hosting 8bitb.us only has a couple of child processes, so the fewer requests per page Apache has to handle, the better.

To handle syncing media to S3, I decided to modify a little S3 backup script I wrote. We use it at work to back up 800GB of media and database dumps, but keeping a private backup is different than serving public media. Our backups store the entire path of the file, but http://media.8bitb.us/home/alex/django-projects/8bitbus/media/css/global.css is a really ugly URL. So I added support for relative paths. Add a few canned ACLs into the mix, and you’ve got one script that makes private backups or syncs public media.
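Here's roughly what the relative-path idea looks like as a sketch. This is not Clonus3's actual code: `s3_key` and `upload_plan` are made-up names, I'm assuming absolute POSIX paths, and I'm assuming the choice boils down to S3's `private` and `public-read` canned ACLs.

```python
import posixpath

# "private" and "public-read" are canned ACL names defined by S3.
# Picking between exactly these two is an assumption for this sketch.
PRIVATE_ACL = "private"
PUBLIC_ACL = "public-read"


def s3_key(local_path, base_dir=None):
    """Build an S3 key from an absolute POSIX path (hypothetical helper).

    Without base_dir, keep the whole path (backup style).
    With base_dir, keep only the part below it (public media style).
    """
    path = posixpath.normpath(local_path)
    if base_dir is None:
        # Backup mode: the full path, minus the leading slash.
        return path.lstrip("/")
    # Media mode: the path relative to the media root.
    return posixpath.relpath(path, posixpath.normpath(base_dir))


def upload_plan(local_path, base_dir=None, public=False):
    """Return the (key, canned ACL) pair to use for one file."""
    acl = PUBLIC_ACL if public else PRIVATE_ACL
    return s3_key(local_path, base_dir), acl
```

With `base_dir` set to the media directory, the stylesheet above gets the key `css/global.css` instead of dragging my whole home directory into the URL. A file outside `base_dir` would come back with `..` in its key, so a real script would want to reject those.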

Clonus3 is fast, but it wasn’t always this way. To retrieve metadata from an S3 object, you have to issue a HEAD request for every object, individually. This is syrup-slow, and Clonus3 could be about half the size if listing a bucket retrieved said metadata, but I devolve. Jeff Triplett recommended I cache the HEAD calls, so I did. A backup of my laptop, if nothing’s changed, takes 9 minutes 25 seconds without the caching and 8.37 seconds with. To be safe, Clonus3 (by default - you can turn this off if you’re not paranoid) lists the bucket first and uncaches any entries with modified ETags. With this option off, a backup of my laptop takes 1.15 seconds - the bytes move so fast, they have blisters. A caveat: Apple doesn’t appear to have included Berkeley DB libraries in OS X, so you may have to jump through fiery hoops to get the caching working on a Mac.
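The caching trick, boiled down to a minimal sketch. Again, this is not the real Clonus3 code: `HeadCache`, `head_fn` and the `(key, etag)` listing format are stand-ins I made up, and `head_fn` is assumed to be whatever function issues the actual HEAD request and returns a dict with an `"etag"` entry.

```python
class HeadCache:
    """Cache per-object metadata so each HEAD request happens once."""

    def __init__(self, head_fn, store=None):
        # head_fn is hypothetical: key -> metadata dict containing "etag".
        self.head_fn = head_fn
        # Any dict-like object works; a plain dict lives in memory only.
        self.store = {} if store is None else store
        self.head_calls = 0

    def metadata(self, key):
        """Return metadata for key, issuing a HEAD only on a cache miss."""
        if key not in self.store:
            self.head_calls += 1
            self.store[key] = self.head_fn(key)
        return self.store[key]

    def invalidate_stale(self, listing):
        """Drop cached entries whose ETag differs from the bucket listing.

        listing is an iterable of (key, etag) pairs from one list call.
        Entries missing from the listing are dropped too (an assumption).
        """
        listed = dict(listing)
        for key in list(self.store):
            if listed.get(key) != self.store[key]["etag"]:
                del self.store[key]
```

The paranoid mode is a call to `invalidate_stale` with a fresh bucket listing before the sync starts: one list request instead of a HEAD per object. Skip that call and you get the blistering-fast version, at the risk of trusting stale metadata. For the cache to survive between runs, `store` would need to be something persistent, such as a `shelve` file.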

Soon, I’ll add support for setting the Expires, Content-Encoding and Content-Type headers based on regexps. In the far future, maybe Clonus3 could gzip text files on upload. Then maybe I’ll add a restore function. And locking, so you can sync two computers over S3, and … HURK [Editor’s note: Here, the author dies of feature creep.]