How to delete an Amazon S3 bucket with lots of files in it

This may be common knowledge for those of you who live in S3 land, but it took me a while to figure out. If you're on Mac OS X or have access to a system with Ruby installed, it's easy to delete a bucket and all of its keys.

1. Download s3sync and s3cmd.
2. Configure s3sync's environment (like it says to do in the README).
3. Open a terminal window and run "s3cmd.rb deleteall bucketname".
4. Then, if you want to delete the bucket itself, run "s3cmd.rb deletebucket bucketname".
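
If you would rather script steps 3 and 4 than type them, here is a minimal Python sketch. It assumes s3cmd.rb is executable and on your PATH and that s3sync's environment is already configured as in step 2. The function names and the dry_run flag are my own inventions for illustration; only the two s3cmd.rb subcommands come from the steps above.

```python
import subprocess


def deletion_commands(bucket, remove_bucket=False, s3cmd="s3cmd.rb"):
    """Build the s3cmd.rb invocations from steps 3 and 4 as argv lists."""
    commands = [[s3cmd, "deleteall", bucket]]
    if remove_bucket:
        # The bucket can only be removed once all of its keys are gone,
        # so this command always comes after deleteall.
        commands.append([s3cmd, "deletebucket", bucket])
    return commands


def run_deletion(bucket, remove_bucket=False, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    commands = deletion_commands(bucket, remove_bucket)
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True stops at the first failure, so deletebucket is
            # never attempted if deleteall exits with an error.
            subprocess.run(argv, check=True)
    return commands


if __name__ == "__main__":
    # Dry run by default: only shows what would be executed.
    run_deletion("bucketname", remove_bucket=True)
```

Running it as-is only prints the two commands. Pass dry_run=False once you are sure the bucket name is right, since there is no undo.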

On my MacBook Pro on an EVDO connection, it seems to be deleting about 4 keys per second, which works out to just over 4 minutes for 1,000 keys and about 7 hours for 100,000 keys. ECHENG.COM alone has around 68,000 files in it, so 45 days of full backups comes to about 3 million files. At that rate, it will take roughly 9 days of constant deleting to remove all of the keys, and that's only for ECHENG.COM. Wetpixel is pretty big, too.
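
For anyone who wants to redo this arithmetic with their own numbers, here is a small Python sketch. It uses only the figures quoted above (4 keys per second, 68,000 files per backup, 45 days of backups) and assumes the deletion rate stays constant, which it may not on a flaky connection. The function and variable names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def seconds_to_delete(key_count, keys_per_second):
    """Time to delete key_count keys at a constant rate."""
    if keys_per_second <= 0:
        raise ValueError("keys_per_second must be positive")
    return key_count / keys_per_second


def format_duration(seconds):
    """Render a duration in the largest sensible unit."""
    if seconds < 3600:
        return f"{seconds / 60:.1f} minutes"
    if seconds < SECONDS_PER_DAY:
        return f"{seconds / 3600:.1f} hours"
    return f"{seconds / SECONDS_PER_DAY:.1f} days"


# Figures taken from the paragraph above.
FILES_PER_BACKUP = 68_000
BACKUP_DAYS = 45
KEYS_PER_SECOND = 4

total_keys = FILES_PER_BACKUP * BACKUP_DAYS

for label, keys in [
    ("1,000 keys", 1_000),
    ("100,000 keys", 100_000),
    ("all backups", total_keys),
]:
    duration = format_duration(seconds_to_delete(keys, KEYS_PER_SECOND))
    print(f"{label}: {duration}")
```

Change KEYS_PER_SECOND to match whatever rate you see scrolling by in your own terminal and compare the printed totals against the estimates above.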

I'm screwed. :(

*UPDATE* On my Mac Pro on DSL, it's deleting 10-15 keys per second. That is much better.


Success! It's slow, but it looks like it's working.

THE HISTORY, IF YOU CARE

I asked my server guys to use S3 as a backup repository for wetpixel.com and echeng.com, and they filled it up with a month and a half of FULL daily backups instead of doing something incremental. My sites are relatively large, and this resulted in 260GB of data spanning hundreds of thousands of files ("keys" in S3). I have been paying $40-60/month for S3 usage, and I wasn't too happy after I discovered that the most recent backup was 6 months old (luckily, I'm told that they have been backing me up to their own drives instead).

I tried to delete the useless buckets, but discovered that you must delete all of the keys in a bucket before you can delete it, and keys have to be deleted one at a time. There have been requests in the S3 forums for a feature that allows bucket deletion without prior removal of files, but the S3 folks replied saying that it was a "low priority."

I'm on the Mac, and normally use S3 Browser.app and Transmit.app to access my S3 buckets. I tried deleting a bucket using Transmit, but it failed repeatedly with an error after it deleted some number of keys (it consistently fails, but can take 15 or 20 minutes before the error). S3 Browser is an app that fails to inspire confidence -- it loads some number of keys at a time in bursts (1000, I think) and appends them to a single, scrolling list. I didn't want to find out what might happen if I let it pull down the entire list of files.

I started looking for other options, and eventually found s3sync and s3cmd. I then discovered that Ruby, which both tools are written in, comes installed by default on Leopard, which was convenient. And now, I'm watching a scrolling list of files as they are removed from my bucket, which is very satisfying.