==========================
Serializing Django objects
==========================

Django's serialization framework provides a mechanism for "translating" Django
objects into other formats. Usually these other formats will be text-based and
used for sending Django objects over a wire, but it's possible for a
serializer to handle any format (text-based or not).

.. seealso::

    If you just want to get some data from your tables into a serialized
    form, you could use the :djadmin:`dumpdata` management command.

Serializing data
----------------

At the highest level, serializing data is a very simple operation::

    from django.core import serializers
    data = serializers.serialize("xml", SomeModel.objects.all())

The arguments to the ``serialize`` function are the format to serialize the data
to (see `Serialization formats`_) and a :class:`~django.db.models.QuerySet` to
serialize. (Actually, the second argument can be any iterator that yields Django
objects, but it'll almost always be a QuerySet.)

You can also use a serializer object directly::

    XMLSerializer = serializers.get_serializer("xml")
    xml_serializer = XMLSerializer()
    xml_serializer.serialize(queryset)
    data = xml_serializer.getvalue()

This is useful if you want to serialize data directly to a file-like object
(which includes an :class:`~django.http.HttpResponse`)::

    # The ``with`` block closes the file once serialization has finished.
    with open("file.xml", "w") as out:
        xml_serializer.serialize(SomeModel.objects.all(), stream=out)

Subset of fields
~~~~~~~~~~~~~~~~

If you only want a subset of fields to be serialized, you can
specify a ``fields`` argument to the serializer::

    from django.core import serializers
    data = serializers.serialize('xml', SomeModel.objects.all(), fields=('name','size'))

In this example, only the ``name`` and ``size`` attributes of each model will
be serialized.

.. note::

    Depending on your model, you may find that it is not possible to
    deserialize a model that only serializes a subset of its fields. If a
    serialized object doesn't specify all the fields that are required by a
    model, the deserializer will not be able to save deserialized instances.

Inherited models
~~~~~~~~~~~~~~~~

If you have a model that is defined using an :ref:`abstract base class
<abstract-base-classes>`, you don't have to do anything special to serialize
that model. Just call the serializer on the object (or objects) that you want to
serialize, and the output will be a complete representation of the serialized
object.

However, if you have a model that uses :ref:`multi-table inheritance
<multi-table-inheritance>`, you also need to serialize all of the base classes
for the model. This is because only the fields that are locally defined on the
model will be serialized. For example, consider the following models::

    class Place(models.Model):
        name = models.CharField(max_length=50)

    class Restaurant(Place):
        serves_hot_dogs = models.BooleanField()

If you only serialize the ``Restaurant`` model::

    data = serializers.serialize('xml', Restaurant.objects.all())

the serialized output will contain only the ``serves_hot_dogs`` attribute.
The ``name`` attribute of the base class will be ignored.

In order to fully serialize your ``Restaurant`` instances, you will need to
serialize the ``Place`` models as well::

    all_objects = list(Restaurant.objects.all()) + list(Place.objects.all())
    data = serializers.serialize('xml', all_objects)

Deserializing data
------------------

Deserializing data is also a fairly simple operation::

    for obj in serializers.deserialize("xml", data):
        do_something_with(obj)

As you can see, the ``deserialize`` function takes the same format argument as
``serialize``, a string or stream of data, and returns an iterator.

However, here it gets slightly complicated. The objects returned by the
``deserialize`` iterator *aren't* simple Django objects. Instead, they are
special ``DeserializedObject`` instances that wrap a created -- but unsaved --
object and any associated relationship data.

Calling ``DeserializedObject.save()`` saves the object to the database.

Because nothing is written until you call ``save()``, deserializing is a
non-destructive operation even if the data in your serialized representation
doesn't match what's currently in the database. Usually, working with these
``DeserializedObject`` instances looks something like::

    for deserialized_object in serializers.deserialize("xml", data):
        if object_should_be_saved(deserialized_object):
            deserialized_object.save()

In other words, the usual use is to examine the deserialized objects to make
sure that they are "appropriate" for saving before doing so. Of course, if you
trust your data source you could just save the object and move on.

The Django object itself can be inspected as ``deserialized_object.object``.

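The inspect-then-save pattern can be tried without a database. The following
is a minimal sketch in plain Python 3, not Django's implementation: ``Record``,
``DeserializedStandIn``, ``toy_deserialize`` and the ``store`` dictionary are
all hypothetical stand-ins invented for illustration.

```python
class Record:
    """Stand-in for a model instance (hypothetical, not a Django model)."""

    def __init__(self, pk, name):
        self.pk = pk
        self.name = name


class DeserializedStandIn:
    """Toy analogue of ``DeserializedObject``: wraps an unsaved object."""

    def __init__(self, obj, store):
        self.object = obj     # the wrapped, not-yet-saved instance
        self._store = store   # a dict standing in for a database table

    def save(self):
        # Only an explicit save() writes to the store.
        self._store[self.object.pk] = self.object


def toy_deserialize(rows, store):
    """Yield wrappers lazily; nothing is written to ``store`` here."""
    for row in rows:
        yield DeserializedStandIn(Record(row["pk"], row["name"]), store)


store = {}
rows = [{"pk": 1, "name": "ok"}, {"pk": 2, "name": ""}]
for wrapped in toy_deserialize(rows, store):
    # Inspect the wrapped object first; save only rows that pass the check.
    if wrapped.object.name:
        wrapped.save()
```

Only the first row reaches ``store``; the second is rejected by the check and
is simply dropped, leaving the store otherwise untouched.
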
.. _serialization-formats:

Serialization formats
---------------------

Django supports a number of serialization formats, some of which require you
to install third-party Python modules:

    ==========  ==============================================================
    Identifier  Information
    ==========  ==============================================================
    ``xml``     Serializes to and from a simple XML dialect.

    ``json``    Serializes to and from JSON_ (using a version of simplejson_
                bundled with Django).

    ``yaml``    Serializes to YAML (YAML Ain't a Markup Language). This
                serializer is only available if PyYAML_ is installed.
    ==========  ==============================================================

.. _json: http://json.org/
.. _simplejson: http://undefined.org/python/#simplejson
.. _PyYAML: http://www.pyyaml.org/

Notes for specific serialization formats
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

json
^^^^

If you're using UTF-8 (or any other non-ASCII encoding) data with the JSON
serializer, you must pass ``ensure_ascii=False`` as a parameter to the
``serialize()`` call. Otherwise, the output won't be encoded correctly.

For example::

    json_serializer = serializers.get_serializer("json")()
    json_serializer.serialize(queryset, ensure_ascii=False, stream=response)

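To see what the flag changes, the sketch below uses only the standard-library
``json`` module (Python 3 is assumed), not the Django serializer. The sample
value and variable names are made up for illustration.

```python
import json

title = "Caf\u00e9"  # "Café", written with an escape to keep this file ASCII

# Default behaviour: ensure_ascii=True escapes every non-ASCII character.
escaped = json.dumps({"name": title})

# With ensure_ascii=False the character is written out unchanged.
literal = json.dumps({"name": title}, ensure_ascii=False)

print(escaped)
print(literal)
```

Both strings decode to the same data; they differ only in how the non-ASCII
character is represented in the output.
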
The Django source code includes the simplejson_ module. However, if you're
using Python 2.6 or later (which includes a built-in version of the module),
Django will use the built-in ``json`` module automatically. If you have a
system-installed version that includes the C-based speedup extension, or your
system version is more recent than the version shipped with Django (currently,
2.0.7), the system version will be used instead of the version included with
Django.

Be aware that if you're serializing using that module directly, not all Django
output can be passed unmodified to simplejson. In particular, :ref:`lazy
translation objects <lazy-translations>` need a `special encoder`_ written for
them. Something like this will work::

    from django.utils import simplejson
    from django.utils.functional import Promise
    from django.utils.encoding import force_unicode

    class LazyEncoder(simplejson.JSONEncoder):
        def default(self, obj):
            # Force lazy translation objects to unicode so they can be encoded.
            if isinstance(obj, Promise):
                return force_unicode(obj)
            return super(LazyEncoder, self).default(obj)

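The same idea can be exercised without Django. This is a rough sketch using
the standard-library ``json`` module on Python 3; ``LazyText`` is a
hypothetical stand-in for a lazy translation object and is not part of Django.

```python
import json


class LazyText:
    """Hypothetical stand-in for a lazy translation object."""

    def __init__(self, func):
        self._func = func

    def __str__(self):
        # The text is computed only when it is actually needed.
        return self._func()


class LazyEncoder(json.JSONEncoder):
    def default(self, obj):
        # default() is called only for objects json cannot encode itself.
        if isinstance(obj, LazyText):
            return str(obj)
        return super().default(obj)


payload = {"label": LazyText(lambda: "Hello")}
encoded = json.dumps(payload, cls=LazyEncoder)
```

Without ``cls=LazyEncoder``, ``json.dumps(payload)`` raises ``TypeError``
because the encoder has no rule for ``LazyText`` instances.
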
.. _special encoder: http://svn.red-bean.com/bob/simplejson/tags/simplejson-1.7/docs/index.html

.. _topics-serialization-natural-keys:

Natural keys
------------

.. versionadded:: 1.2

   The ability to use natural keys when serializing/deserializing data was
   added in the 1.2 release.

The default serialization strategy for foreign keys and many-to-many
relations is to serialize the value of the primary key(s) of the
objects in the relation. This strategy works well for most types of
object, but it can cause difficulty in some circumstances.

Consider the case of a list of objects that have a foreign key on
:class:`ContentType`. If you're going to serialize an object that
refers to a content type, you need to have a way to refer to that
content type. Content types are automatically created by Django as
part of the database synchronization process, so you don't need to
include content types in a fixture or other serialized data. As a
result, the primary key of any given content type isn't easy to
predict; it will depend on how and when :djadmin:`syncdb` was
executed to create the content types.

There is also the matter of convenience. An integer id isn't always
the most convenient way to refer to an object; sometimes, a
more natural reference would be helpful.

It is for these reasons that Django provides *natural keys*. A natural
key is a tuple of values that can be used to uniquely identify an
object instance without using the primary key value.

Deserialization of natural keys
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Consider the following two models::

    from django.db import models

    class Person(models.Model):
        first_name = models.CharField(max_length=100)
        last_name = models.CharField(max_length=100)

        birthdate = models.DateField()

        class Meta:
            unique_together = (('first_name', 'last_name'),)

    class Book(models.Model):
        name = models.CharField(max_length=100)
        author = models.ForeignKey(Person)

Ordinarily, serialized data for ``Book`` would use an integer to refer to
the author. For example, in JSON, a Book might be serialized as::

    ...
    {
        "pk": 1,
        "model": "store.book",
        "fields": {
            "name": "Mostly Harmless",
            "author": 42
        }
    }
    ...

This isn't a particularly natural way to refer to an author. It
requires that you know the primary key value for the author; it also
requires that this primary key value is stable and predictable.

However, if we add natural key handling to Person, the fixture becomes
much more humane. To add natural key handling, you define a default
Manager for Person with a ``get_by_natural_key()`` method. In the case
of a Person, a good natural key might be the pair of first and last
name::

    from django.db import models

    class PersonManager(models.Manager):
        def get_by_natural_key(self, first_name, last_name):
            return self.get(first_name=first_name, last_name=last_name)

    class Person(models.Model):
        objects = PersonManager()

        first_name = models.CharField(max_length=100)
        last_name = models.CharField(max_length=100)

        birthdate = models.DateField()

        class Meta:
            unique_together = (('first_name', 'last_name'),)

Now books can use that natural key to refer to ``Person`` objects::

    ...
    {
        "pk": 1,
        "model": "store.book",
        "fields": {
            "name": "Mostly Harmless",
            "author": ["Douglas", "Adams"]
        }
    }
    ...

When you try to load this serialized data, Django will use the
``get_by_natural_key()`` method to resolve ``["Douglas", "Adams"]``
into the primary key of an actual ``Person`` object.

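The lookup step can be pictured with a small, database-free sketch in plain
Python 3. It shows one plausible shape of the resolution, not Django's actual
code: the in-memory ``PersonManager`` and the ``resolve_reference`` helper are
hypothetical, and only the ``get_by_natural_key()`` name comes from the text
above.

```python
class Person:
    def __init__(self, pk, first_name, last_name):
        self.pk = pk
        self.first_name = first_name
        self.last_name = last_name


class PersonManager:
    """In-memory stand-in for a manager (hypothetical; no database)."""

    def __init__(self, people):
        self._people = people

    def get_by_natural_key(self, first_name, last_name):
        matches = [p for p in self._people
                   if (p.first_name, p.last_name) == (first_name, last_name)]
        if len(matches) != 1:
            raise LookupError("natural key must match exactly one object")
        return matches[0]


def resolve_reference(manager, value):
    """Return a primary key for a serialized foreign key value."""
    if isinstance(value, (list, tuple)):
        # A sequence is treated as a natural key and looked up.
        return manager.get_by_natural_key(*value).pk
    return value  # anything else is taken to be a primary key already


people = PersonManager([Person(42, "Douglas", "Adams")])
fields = {"name": "Mostly Harmless", "author": ["Douglas", "Adams"]}
author_pk = resolve_reference(people, fields["author"])
```

Here ``author_pk`` ends up as ``42``, the same value the integer form of the
fixture would have carried directly.
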
.. note::

    Whatever fields you use for a natural key must be able to uniquely
    identify an object. This will usually mean that your model will
    have a uniqueness clause (either ``unique=True`` on a single field, or
    ``unique_together`` over multiple fields) for the field or fields
    in your natural key. However, uniqueness doesn't need to be
    enforced at the database level. If you are certain that a set of
    fields will be effectively unique, you can still use those fields
    as a natural key.

Serialization of natural keys
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

So how do you get Django to emit a natural key when serializing an object?
Firstly, you need to add another method -- this time to the model itself::

    class Person(models.Model):
        objects = PersonManager()

        first_name = models.CharField(max_length=100)
        last_name = models.CharField(max_length=100)

        birthdate = models.DateField()

        def natural_key(self):
            return (self.first_name, self.last_name)

        class Meta:
            unique_together = (('first_name', 'last_name'),)

That method should always return a natural key tuple -- in this
example, ``(first name, last name)``. Then, when you call
``serializers.serialize()``, you provide a ``use_natural_keys=True``
argument::

    >>> serializers.serialize('json', [book1, book2], indent=2, use_natural_keys=True)

When ``use_natural_keys=True`` is specified, Django will use the
``natural_key()`` method to serialize any reference to objects of the
type that defines the method.

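The choice the serializer makes for each reference can be sketched in a few
lines of plain Python 3. This is an illustration under stated assumptions, not
Django's code: ``reference_value`` and ``Publisher`` are hypothetical names,
and the classes are ordinary Python objects rather than models.

```python
class Person:
    def __init__(self, pk, first_name, last_name):
        self.pk = pk
        self.first_name = first_name
        self.last_name = last_name

    def natural_key(self):
        return (self.first_name, self.last_name)


class Publisher:
    """Hypothetical related type that defines no natural_key()."""

    def __init__(self, pk):
        self.pk = pk


def reference_value(related, use_natural_keys=False):
    """Pick the value to emit for a reference to ``related``."""
    if use_natural_keys and hasattr(related, "natural_key"):
        # JSON has no tuple type, so the key is emitted as a list.
        return list(related.natural_key())
    return related.pk


author = Person(42, "Douglas", "Adams")
by_pk = reference_value(author)
by_natural_key = reference_value(author, use_natural_keys=True)
fallback = reference_value(Publisher(7), use_natural_keys=True)
```

References to types without a ``natural_key()`` method fall back to the
primary key even when natural keys are requested.
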
If you are using :djadmin:`dumpdata` to generate serialized data, you
use the :djadminopt:`--natural` command line flag to generate natural keys.

.. note::

    You don't need to define both ``natural_key()`` and
    ``get_by_natural_key()``. If you don't want Django to output
    natural keys during serialization, but you want to retain the
    ability to load natural keys, then you can opt to not implement
    the ``natural_key()`` method.

    Conversely, if (for some strange reason) you want Django to output
    natural keys during serialization, but *not* be able to load those
    key values, just don't define the ``get_by_natural_key()`` method.

Dependencies during serialization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since natural keys rely on database lookups to resolve references, it
is important that data exists before it is referenced. You can't make
a *forward reference* with natural keys; the data you are referencing
must exist before you include a natural key reference to that data.

To accommodate this limitation, calls to :djadmin:`dumpdata` that use
the :djadminopt:`--natural` option will serialize any model with a
``natural_key()`` method before serializing objects that use standard
primary keys.

However, this may not always be enough. If your natural key refers to
another object (by using a foreign key or natural key to another object
as part of a natural key), then you need to be able to ensure that
the objects on which a natural key depends occur in the serialized data
before the natural key requires them.

To control this ordering, you can define dependencies on your
``natural_key()`` methods. You do this by setting a ``dependencies``
attribute on the ``natural_key()`` method itself.

For example, consider the ``Permission`` model in ``contrib.auth``.
The following is a simplified version of the ``Permission`` model::

    class Permission(models.Model):
        name = models.CharField(max_length=50)
        content_type = models.ForeignKey(ContentType)
        codename = models.CharField(max_length=100)
        # ...
        def natural_key(self):
            return (self.codename,) + self.content_type.natural_key()

The natural key for a ``Permission`` is a combination of the codename for the
``Permission``, and the ``ContentType`` to which the ``Permission`` applies. This means
that ``ContentType`` must be serialized before ``Permission``. To define this
dependency, we add one extra line::

    class Permission(models.Model):
        # ...
        def natural_key(self):
            return (self.codename,) + self.content_type.natural_key()
        natural_key.dependencies = ['contenttypes.contenttype']

This definition ensures that ``ContentType`` models are serialized before
``Permission`` models. In turn, any object referencing ``Permission`` will
be serialized after both ``ContentType`` and ``Permission``.
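
The ordering rule amounts to a dependency-first traversal. The sketch below is
plain Python 3 and is not taken from Django: ``order_by_dependencies`` and the
dictionary that maps each model label to its dependency list are hypothetical
constructs used to illustrate the idea.

```python
def order_by_dependencies(dependencies):
    """Order labels so each one comes after everything it depends on.

    ``dependencies`` maps a model label to the labels it depends on,
    mirroring the lists assigned to ``natural_key.dependencies``.
    """
    ordered = []
    visiting = set()

    def visit(label):
        if label in ordered:
            return
        if label in visiting:
            raise ValueError("circular dependency involving %s" % label)
        visiting.add(label)
        # Place every dependency before the label that needs it.
        for dep in dependencies.get(label, []):
            visit(dep)
        visiting.discard(label)
        ordered.append(label)

    for label in dependencies:
        visit(label)
    return ordered


deps = {
    "auth.permission": ["contenttypes.contenttype"],
    "contenttypes.contenttype": [],
}
order = order_by_dependencies(deps)
```

With these inputs, ``contenttypes.contenttype`` is placed ahead of
``auth.permission``. A cycle between two labels raises ``ValueError`` rather
than producing an ordering that cannot be loaded.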