.. _s3_tut:

======================================
An Introduction to boto's S3 interface
======================================

This tutorial focuses on the boto interface to the Simple Storage Service
from Amazon Web Services.  This tutorial assumes that you have already
downloaded and installed boto.

Creating a Connection
---------------------
The first step in accessing S3 is to create a connection to the service.
There are two ways to do this in boto.  The first is:

>>> from boto.s3.connection import S3Connection
>>> conn = S3Connection('<aws access key>', '<aws secret key>')

At this point the variable conn will point to an S3Connection object.  In
this example, the AWS access key and AWS secret key are passed in to the
constructor explicitly.  Alternatively, you can set the environment variables:

AWS_ACCESS_KEY_ID - Your AWS Access Key ID
AWS_SECRET_ACCESS_KEY - Your AWS Secret Access Key

and then call the constructor without any arguments, like this:

>>> conn = S3Connection()
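
If you rely on the environment variables, it can help to check that both are
set before creating the connection.  The following is only a sketch using the
standard library; the helper name missing_credentials is made up for this
example and is not part of boto.  Keep in mind that boto may also find
credentials elsewhere (for example in its own config file), so a missing
variable is not necessarily an error.

```python
import os

# The two variable names are the ones listed in this tutorial.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(environ=None):
    """Return the names of the required variables that are unset or empty.

    Hypothetical helper for illustration; it only inspects the mapping it
    is given and never contacts S3.
    """
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]
```

An empty list means both variables are present, so calling S3Connection()
without arguments should pick them up.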

There is also a shortcut function in the boto package, called connect_s3,
that may provide a slightly easier means of creating a connection:

>>> import boto
>>> conn = boto.connect_s3()

In either case, conn will point to an S3Connection object which we will
use throughout the remainder of this tutorial.

Creating a Bucket
-----------------

Once you have a connection established with S3, you will probably want to
create a bucket.  A bucket is a container used to store key/value pairs
in S3.  A bucket can hold an unlimited amount of data so you could potentially
have just one bucket in S3 for all of your information.  Or, you could create
separate buckets for different types of data.  You can figure all of that out
later; first, let's just create a bucket.  That can be accomplished like this:

>>> bucket = conn.create_bucket('mybucket')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "boto/", line 285, in create_bucket
    raise S3CreateError(response.status, response.reason)
boto.exception.S3CreateError: S3Error[409]: Conflict

Whoa.  What happened there?  Well, the thing you have to know about
buckets is that they are kind of like domain names.  It's one flat name
space that everyone who uses S3 shares.  So, someone has already created
a bucket called "mybucket" in S3 and that means no one else can grab that
bucket name.  So, you have to come up with a name that hasn't been taken yet.
For example, something that uses a unique string as a prefix.  Your
AWS access key ID (NOT YOUR SECRET KEY!) could work but I'll leave it to
your imagination to come up with something.  I'll just assume that you
found an acceptable name.
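
If you would rather not think one up, here is one possible way to build a
name with a unique prefix.  This is a sketch only: the function
unique_bucket_name is hypothetical (not a boto call), the random prefix comes
from the standard uuid module, and the 63-character limit and lowercasing
reflect my understanding of S3's DNS-style naming rules.

```python
import uuid

# Assumed upper bound on bucket name length (DNS-compatible names).
MAX_BUCKET_NAME_LENGTH = 63


def unique_bucket_name(base, prefix_length=12):
    """Return 'randomhex-base', lowercased and trimmed to the length limit.

    Hypothetical helper for illustration; it does not check with S3
    whether the name is actually available.
    """
    prefix = uuid.uuid4().hex[:prefix_length]
    # Leave room for the prefix and the '-' separator.
    room = MAX_BUCKET_NAME_LENGTH - len(prefix) - 1
    return "%s-%s" % (prefix, base.lower()[:room])
```

The result can be passed to conn.create_bucket in place of 'mybucket'.  A
random prefix makes a clash unlikely, but create_bucket can still fail, so
be prepared to try again with a new name.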

The create_bucket method will create the requested bucket if it does not
exist or will return the existing bucket if it does exist.

Creating a Bucket In Another Location
-------------------------------------

The example above assumes that you want to create a bucket in the
standard US region.  However, it is possible to create buckets in
other locations.  To do so, first import the Location object from the
boto.s3.connection module, like this:

>>> from boto.s3.connection import Location
>>> dir(Location)
['DEFAULT', 'EU', 'USWest', 'APSoutheast', '__doc__', '__module__']

As you can see, the Location object defines four possible locations:
DEFAULT, EU, USWest, and APSoutheast.  By default, the location is the
empty string which is interpreted as the US Classic Region, the
original S3 region.  However, by specifying another location at the
time the bucket is created, you can instruct S3 to create the bucket
in that location.  For example:

>>> conn.create_bucket('mybucket', location=Location.EU)

will create the bucket in the EU region (assuming the name is available).

Storing Data
------------

Once you have a bucket, presumably you will want to store some data
in it.  S3 doesn't care what kind of information you store in your objects
or what format you use to store it.  All you need is a key that is unique
within your bucket.

The Key object is used in boto to keep track of data stored in S3.  To store
new data in S3, start by creating a new Key object:

>>> from boto.s3.key import Key
>>> k = Key(bucket)
>>> k.key = 'foobar'
>>> k.set_contents_from_string('This is a test of S3')

The net effect of these statements is to create a new object in S3 with a
key of "foobar" and a value of "This is a test of S3".  To validate that
this worked, quit out of the interpreter and start it up again.  Then:

>>> import boto
>>> c = boto.connect_s3()
>>> b = c.create_bucket('mybucket') # substitute your bucket name here
>>> from boto.s3.key import Key
>>> k = Key(b)
>>> k.key = 'foobar'
>>> k.get_contents_as_string()
'This is a test of S3'

So, we can definitely store and retrieve strings.  A more interesting
example may be to store the contents of a local file in S3 and then retrieve
the contents to another local file.

>>> k = Key(b)
>>> k.key = 'myfile'
>>> k.set_contents_from_filename('foo.jpg')
>>> k.get_contents_to_filename('bar.jpg')

There are a couple of things to note about this.  When you send data to
S3 from a file or filename, boto will attempt to determine the correct
mime type for that file and send it as a Content-Type header.  The boto
package uses the standard mimetypes package in Python to do the mime type
guessing.  The other thing to note is that boto streams the content
to and from S3, so you should be able to send and receive large files without
any problem.
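
To get a feel for the mime type guessing, you can call the mimetypes package
yourself.  The sketch below uses only the standard library; the function
guess_content_type and its fallback value are choices made for this example
and may not match what boto does internally in every case.

```python
import mimetypes


def guess_content_type(filename, default="application/octet-stream"):
    """Guess a Content-Type from the file name's extension.

    Hypothetical helper for illustration.  mimetypes.guess_type returns a
    (type, encoding) pair; the type is None when the extension is unknown,
    in which case the assumed default is returned instead.
    """
    content_type, _encoding = mimetypes.guess_type(filename)
    return content_type or default
```

For the file used above, guess_content_type('foo.jpg') gives 'image/jpeg'.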

Listing All Available Buckets
-----------------------------
In addition to accessing specific buckets via the create_bucket method,
you can also get a list of all available buckets that you have created.

>>> rs = conn.get_all_buckets()

This returns a ResultSet object (see the SQS Tutorial for more info on
ResultSet objects).  The ResultSet can be used as a sequence or list type
object to retrieve Bucket objects.

>>> len(rs)
>>> for b in rs:
...     print b.name
...
<listing of available buckets>
>>> b = rs[0]

Setting / Getting the Access Control List for Buckets and Keys
--------------------------------------------------------------
The S3 service provides the ability to control access to buckets and keys
within S3 via the Access Control List (ACL) associated with each object in
S3.  There are two ways to set the ACL for an object:

1. Create a custom ACL that grants specific rights to specific users.  At the
   moment, the users that are specified within grants have to be registered
   users of Amazon Web Services so this isn't as useful or as general as it
   could be.

2. Use a "canned" access control policy.  There are four canned policies
   defined:
   a. private: Owner gets FULL_CONTROL.  No one else has any access rights.
   b. public-read: Owner gets FULL_CONTROL and the anonymous principal is granted READ access.
   c. public-read-write: Owner gets FULL_CONTROL and the anonymous principal is granted READ and WRITE access.
   d. authenticated-read: Owner gets FULL_CONTROL and any principal authenticated as a registered Amazon S3 user is granted READ access.

To set a canned ACL for a bucket, use the set_acl method of the Bucket object.
The argument passed to this method must be one of the four permissible
canned policies named in boto's CannedACLStrings list.
For example, to make a bucket readable by anyone:

>>> b.set_acl('public-read')

You can also set the ACL for Key objects, either by passing an additional
argument to the above method:

>>> b.set_acl('public-read', 'foobar')

where 'foobar' is the key of some object within the bucket b or you can
call the set_acl method of the Key object:

>>> k.set_acl('public-read')
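
Since a typo in the policy name is easy to make, you may want to check the
string before sending it.  Here is a small sketch of that idea.  The wrapper
set_canned_acl and the CANNED_POLICIES tuple are invented for this example
(the tuple simply repeats the four policies listed above); the only boto
call it relies on is the set_acl method shown in this section.

```python
# The four canned policies described in this tutorial.
CANNED_POLICIES = (
    "private",
    "public-read",
    "public-read-write",
    "authenticated-read",
)


def set_canned_acl(target, policy, key_name=None):
    """Validate the policy name, then call target.set_acl.

    Hypothetical wrapper for illustration.  target is expected to be a
    Bucket or Key object; key_name is only meaningful for a Bucket.
    """
    if policy not in CANNED_POLICIES:
        raise ValueError("unknown canned ACL policy: %r" % (policy,))
    if key_name is None:
        target.set_acl(policy)
    else:
        target.set_acl(policy, key_name)
    return policy
```

With this in place, set_canned_acl(b, 'public-read', 'foobar') behaves like
the second example above, while a misspelled policy raises ValueError before
any request is made.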

You can also retrieve the current ACL for a Bucket or Key object using the
get_acl method.  This method parses the AccessControlPolicy response sent
by S3 and creates a set of Python objects that represent the ACL.

>>> acp = b.get_acl()
>>> acp
<boto.acl.Policy instance at 0x2e6940>
>>> acp.acl
<boto.acl.ACL instance at 0x2e69e0>
>>> acp.acl.grants
[<boto.acl.Grant instance at 0x2e6a08>]
>>> for grant in acp.acl.grants:
...   print grant.permission, grant.display_name, grant.email_address,
...
FULL_CONTROL <boto.user.User instance at 0x2e6a30>

The Python objects representing the ACL (Policy, ACL and Grant) can be found
in the boto.acl module, as the output above shows.
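
If you want something friendlier than object reprs, you could turn the grants
into short strings.  This is just a sketch: describe_grants is a made-up
helper, and it assumes each grant exposes the permission, display_name and
email_address attributes used in the loop above.

```python
def describe_grants(grants):
    """Return one 'PERMISSION grantee' string per grant.

    Hypothetical helper for illustration; it only reads attributes and
    works on any objects shaped like the grants shown above.
    """
    lines = []
    for grant in grants:
        # Prefer a human-readable name, then fall back to the email address.
        who = (getattr(grant, "display_name", None)
               or getattr(grant, "email_address", None)
               or "<unknown grantee>")
        lines.append("%s %s" % (grant.permission, who))
    return lines
```

You would call it as describe_grants(acp.acl.grants) and print each line.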

Both the Bucket object and the Key object also provide shortcut
methods to simplify the process of granting individuals specific
access.  For example, if you want to grant an individual user READ
access to a particular object in S3 you could do the following:

>>> key = b.lookup('mykeytoshare')
>>> key.add_email_grant('READ', '')

The email address provided should be the one associated with the user's
AWS account.  There is a similar method called add_user_grant that accepts the
canonical id of the user rather than the email address.

Setting/Getting Metadata Values on Key Objects
----------------------------------------------
S3 allows arbitrary user metadata to be assigned to objects within a bucket.
To take advantage of this S3 feature, you should use the set_metadata and
get_metadata methods of the Key object to set and retrieve metadata associated
with an S3 object.  For example:

>>> k = Key(b)
>>> k.key = 'has_metadata'
>>> k.set_metadata('meta1', 'This is the first metadata value')
>>> k.set_metadata('meta2', 'This is the second metadata value')
>>> k.set_contents_from_filename('foo.txt')

This code associates two metadata key/value pairs with the Key k.  The
metadata is sent along with the data when the contents are stored, so set it
before calling set_contents_from_filename.  To retrieve those values later:

>>> k = b.get_key('has_metadata')
>>> k.get_metadata('meta1')
'This is the first metadata value'
>>> k.get_metadata('meta2')
'This is the second metadata value'
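
Under the hood, S3 carries user metadata as HTTP headers whose names start
with x-amz-meta-.  The sketch below shows that mapping for a plain dict.  The
function metadata_to_headers is invented for this example and is not how you
would normally set metadata with boto (use set_metadata for that);
lowercasing the names is a choice made here, not something the tutorial
specifies.

```python
# Header prefix S3 uses for user-defined metadata.
USER_METADATA_PREFIX = "x-amz-meta-"


def metadata_to_headers(metadata):
    """Map {'name': 'value'} metadata to {'x-amz-meta-name': 'value'}.

    Hypothetical helper for illustration; it builds a new dict and does
    not send anything to S3.
    """
    return dict(
        (USER_METADATA_PREFIX + name.lower(), value)
        for name, value in metadata.items()
    )
```

For the example above, the two values would travel as the x-amz-meta-meta1
and x-amz-meta-meta2 headers.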