Amazon S3 (Simple Storage Service) is a cloud-based object storage service. In this post I will show you how to make backups to Amazon S3 from Linux systems.

First you need to register on Amazon S3. You pay only for what you use: http://aws.amazon.com/s3/pricing/. A great thing: the first five gigabytes are free :)

On Amazon, a « Bucket » is a container where you can store your files. It’s a kind of top-level logical folder (with fine-grained ACLs, etc.). Now let’s create a « Bucket »:

[Screenshot: Create bucket]

Let’s give a name to your bucket and choose a « region ». A « region » is the datacenter where your data will be physically stored. Prices vary depending on the region (but Amazon is still a US company: http://www.aidanfinn.com/?p=11187).

[Screenshot: Bucket name and region]

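By the way, once the s3cmd tool presented later in this post is configured, a bucket can also be created directly from the command line instead of through the web console (the bucket name and location below are just examples):

# s3cmd mb --bucket-location=EU s3://testblog.hordez.fr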
OK, the bucket is created. Now let’s go to the « Security credentials » panel:

[Screenshot: Security credentials]

A popup advises you to get started with « Identity and Access Management (IAM) » permissions, which is an Amazon best practice. Choose this option.

[Screenshot: IAM]

Create a new user…

[Screenshot: New user]

…and give it a username. Don’t forget to tick the « Generate an access key for each User » option:

[Screenshot: Create user]

A popup will appear with the Access Key ID and Secret Access Key. Keep them in a safe place:

[Screenshot: Access credentials]

Now click on the « Permissions » tab and « Attach User Policy »:

[Screenshot: Attach User Policy]

Select « Custom Policy »:

[Screenshot: Custom Policy]

Here is an example of a custom policy allowing all possible actions (s3:*) on the bucket resource « arn:aws:s3:::testblog.hordez.fr » and, recursively, on everything inside it (testblog.hordez.fr/*). The first statement also allows listing all your buckets (s3:ListAllMyBuckets), which command line tools typically need:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:ListAllMyBuckets"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::testblog.hordez.fr",
        "arn:aws:s3:::testblog.hordez.fr/*"
      ]
    }
  ]
}

Learn more about Amazon IAM policies: http://docs.aws.amazon.com/IAM/latest/UserGuide/PoliciesOverview.html

OK, now your Amazon S3 bucket and IAM policy are ready. But how can you use them from a script or a command line tool?

S3cmd (http://s3tools.org/s3cmd) is a command line tool for uploading, retrieving and managing data in Amazon S3. It is best suited for power users who don’t fear the command line, and it is ideal for scripts and automated backups triggered from cron. It is an open source project available under the GNU General Public License v2 (GPLv2) and is free for both commercial and private use.

(Another way you can try is S3FS: http://code.google.com/p/s3fs/)
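For reference, mounting a bucket with S3FS looks roughly like this (a sketch with a hypothetical mount point; the credentials file uses the bucketName:accessKeyId:secretAccessKey format):

# echo 'testblog.hordez.fr:ABCD1234DEMO:klasd902i3ld90sdaaklasd90023kl9sd' > /etc/passwd-s3fs
# chmod 600 /etc/passwd-s3fs
# s3fs testblog.hordez.fr /mnt/s3 -o passwd_file=/etc/passwd-s3fs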

S3cmd is easily available (Debian/Ubuntu repository: http://s3tools.org/debian-ubuntu-repository-for-s3cmd). On CentOS it’s available in the official repositories:

# yum install s3cmd
# s3cmd --configure
Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3
Access Key: ABCD1234DEMO
Secret Key: klasd902i3ld90sdaaklasd90023kl9sd

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password: mybestpassword
Path to GPG program [/usr/bin/gpg]:

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP and can't be used if you're behind a proxy
Use HTTPS protocol [No]:

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't conect to S3 directly
HTTP Proxy server name:

New settings:
Access Key: ABCD1234DEMO
Secret Key: klasd902i3ld90sdaaklasd90023kl9sd
Encryption password: mybestpassword
Path to GPG program: /usr/bin/gpg
Use HTTPS protocol: False
HTTP Proxy server name:
HTTP Proxy server port: 0

Test access with supplied credentials? [Y/n] y
Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)

Now verifying that encryption works...
Success. Encryption and decryption worked fine :-)

Save settings? [y/N] y
Configuration saved to '/home/antoine/.s3cfg'
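Note that the generated configuration file contains your secret key in clear text, so make sure its permissions are restrictive:

# chmod 600 /home/antoine/.s3cfg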

That’s it! For example, you can sync a folder to your remote bucket:

# s3cmd sync /home/antoine/Documents/ s3://testblog.hordez.fr/daily-backup/$(date +"%Y-%m-%d")/
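As mentioned earlier, s3cmd is ideal for automated backups triggered from cron. For instance, you could add a line like this to your crontab (crontab -e) to run the sync every night at 2:30; note that % characters must be escaped in a crontab, and the schedule and paths are just an example:

30 2 * * * /usr/bin/s3cmd sync /home/antoine/Documents/ s3://testblog.hordez.fr/daily-backup/$(date +\%Y-\%m-\%d)/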

You can also delete a folder recursively:

# s3cmd del --recursive s3://testblog.hordez.fr/daily-backup/$(date +"%Y-%m-%d")/
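Combining the two, a small script can implement a simple retention policy, for example keeping only the last 30 days of backups (a minimal sketch assuming the date-named folder layout used above and GNU date):

#!/bin/bash
# Sync today's backup, then delete the backup from 30 days ago.
BUCKET="s3://testblog.hordez.fr/daily-backup"
TODAY=$(date +"%Y-%m-%d")
OLD=$(date -d "30 days ago" +"%Y-%m-%d")

s3cmd sync /home/antoine/Documents/ "$BUCKET/$TODAY/"
s3cmd del --recursive "$BUCKET/$OLD/"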

The possibilities are limitless:

s3cmd --help
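For instance, a few everyday commands (bucket and file names are examples):

# s3cmd ls s3://testblog.hordez.fr/
# s3cmd put backup.tar.gz s3://testblog.hordez.fr/
# s3cmd get s3://testblog.hordez.fr/backup.tar.gz
# s3cmd du s3://testblog.hordez.fr/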

That’s all, folks!