:mod:`pickle` --- Python object serialization
=============================================

.. index::
   single: persistence
   pair: persistent; objects
   pair: serializing; objects
   pair: marshalling; objects
   pair: flattening; objects
   pair: pickling; objects

.. module:: pickle
   :synopsis: Convert Python objects to streams of bytes and back.
.. sectionauthor:: Jim Kerr <>.
.. sectionauthor:: Barry Warsaw <>

The :mod:`pickle` module implements a fundamental but powerful algorithm for
serializing and de-serializing a Python object structure.  "Pickling" is the
process whereby a Python object hierarchy is converted into a byte stream, and
"unpickling" is the inverse operation, whereby a byte stream is converted back
into an object hierarchy.  Pickling (and unpickling) is alternatively known as
"serialization", "marshalling" [#]_, or "flattening"; however, to avoid
confusion, the terms used here are "pickling" and "unpickling".

This documentation describes both the :mod:`pickle` module and the
:mod:`cPickle` module.


Relationship to other Python modules
------------------------------------

The :mod:`pickle` module has an optimized cousin called the :mod:`cPickle`
module.  As its name implies, :mod:`cPickle` is written in C, so it can be up to
1000 times faster than :mod:`pickle`.  However, it does not support subclassing
of the :func:`Pickler` and :func:`Unpickler` classes, because in :mod:`cPickle`
these are functions, not classes.  Most applications have no need for this
functionality, and can benefit from the improved performance of :mod:`cPickle`.
Other than that, the interfaces of the two modules are nearly identical; the
common interface is described in this manual and differences are pointed out
where necessary.  In the following discussions, we use the term "pickle" to
collectively describe the :mod:`pickle` and :mod:`cPickle` modules.

The data streams the two modules produce are guaranteed to be interchangeable.

Python has a more primitive serialization module called :mod:`marshal`, but in
general :mod:`pickle` should always be the preferred way to serialize Python
objects.  :mod:`marshal` exists primarily to support Python's :file:`.pyc`
files.

The :mod:`pickle` module differs from :mod:`marshal` in several significant ways:

* The :mod:`pickle` module keeps track of the objects it has already serialized,
  so that later references to the same object won't be serialized again.
  :mod:`marshal` doesn't do this.

  This has implications both for recursive objects and object sharing.  Recursive
  objects are objects that contain references to themselves.  These are not
  handled by marshal, and in fact, attempting to marshal recursive objects will
  crash your Python interpreter.  Object sharing happens when there are multiple
  references to the same object in different places in the object hierarchy being
  serialized.  :mod:`pickle` stores such objects only once, and ensures that all
  other references point to the master copy.  Shared objects remain shared, which
  can be very important for mutable objects.

* :mod:`marshal` cannot be used to serialize user-defined classes and their
  instances.  :mod:`pickle` can save and restore class instances transparently;
  however, the class definition must be importable and live in the same module as
  when the object was stored.

* The :mod:`marshal` serialization format is not guaranteed to be portable
  across Python versions.  Because its primary job in life is to support
  :file:`.pyc` files, the Python implementers reserve the right to change the
  serialization format in non-backwards compatible ways should the need arise.
  The :mod:`pickle` serialization format is guaranteed to be backwards compatible
  across Python releases.

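
The object tracking described in the first point can be checked with a short
sketch.  The variable names are illustrative only, and the code is written so
that it should run unchanged on Python 2.6+ and Python 3.

```python
import pickle

# One list referenced from two places, plus a list that contains itself.
shared = [1, 2, 3]
container = {'first': shared, 'second': shared}
recursive = []
recursive.append(recursive)

restored = pickle.loads(pickle.dumps(container))
restored_recursive = pickle.loads(pickle.dumps(recursive))

# Sharing and self-reference both survive the round trip.
sharing_preserved = restored['first'] is restored['second']
recursion_preserved = restored_recursive[0] is restored_recursive
print(sharing_preserved)
print(recursion_preserved)
```

Because the two dictionary values are still one object after unpickling, a
change made through ``restored['first']`` is visible through
``restored['second']`` as well.
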
.. warning::

   The :mod:`pickle` module is not intended to be secure against erroneous or
   maliciously constructed data.  Never unpickle data received from an untrusted
   or unauthenticated source.

Note that serialization is a more primitive notion than persistence; although
:mod:`pickle` reads and writes file objects, it does not handle the issue of
naming persistent objects, nor the (even more complicated) issue of concurrent
access to persistent objects.  The :mod:`pickle` module can transform a complex
object into a byte stream and it can transform the byte stream into an object
with the same internal structure.  Perhaps the most obvious thing to do with
these byte streams is to write them to a file, but it is also conceivable to
send them across a network or store them in a database.  The module
:mod:`shelve` provides a simple interface to pickle and unpickle objects on
DBM-style database files.


Data stream format
------------------

.. index::
   single: XDR
   single: External Data Representation

The data format used by :mod:`pickle` is Python-specific.  This has the
advantage that there are no restrictions imposed by external standards such as
XDR (which can't represent pointer sharing); however, it means that non-Python
programs may not be able to reconstruct pickled Python objects.

By default, the :mod:`pickle` data format uses a printable ASCII representation.
This is slightly more voluminous than a binary representation.  The big
advantage of using printable ASCII (and of some other characteristics of
:mod:`pickle`'s representation) is that for debugging or recovery purposes it is
possible for a human to read the pickled file with a standard text editor.

There are currently 3 different protocols which can be used for pickling.

* Protocol version 0 is the original ASCII protocol and is backwards compatible
  with earlier versions of Python.

* Protocol version 1 is the old binary format which is also compatible with
  earlier versions of Python.

* Protocol version 2 was introduced in Python 2.3.  It provides much more
  efficient pickling of :term:`new-style class`\es.

Refer to :pep:`307` for more information.

If a *protocol* is not specified, protocol 0 is used. If *protocol* is specified
as a negative value or :const:`HIGHEST_PROTOCOL`, the highest protocol version
available will be used.

.. versionchanged:: 2.3
   Introduced the *protocol* parameter.

A binary format, which is slightly more efficient, can be chosen by specifying a
*protocol* version >= 1.

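
The difference between the ASCII and binary formats can be illustrated with a
small sketch.  The sample data is arbitrary, the protocol is passed explicitly
so that the result does not depend on the interpreter's default, and the exact
sizes will vary between Python versions.

```python
import pickle

data = {'numbers': list(range(100)), 'label': 'spam'}

ascii_pickle = pickle.dumps(data, 0)    # original ASCII protocol
binary_pickle = pickle.dumps(data, 2)   # binary protocol 2

# Protocol 0 output contains only 7-bit characters.
is_ascii = max(bytearray(ascii_pickle)) < 128

# The binary form is more compact, and both forms restore the same value.
binary_is_smaller = len(binary_pickle) < len(ascii_pickle)
same_value = pickle.loads(ascii_pickle) == pickle.loads(binary_pickle) == data
```
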

Usage
-----

To serialize an object hierarchy, you first create a pickler, then you call the
pickler's :meth:`dump` method.  To de-serialize a data stream, you first create
an unpickler, then you call the unpickler's :meth:`load` method.  The
:mod:`pickle` module provides the following constant:


.. data:: HIGHEST_PROTOCOL

   The highest protocol version available.  This value can be passed as a
   *protocol* value.

   .. versionadded:: 2.3

.. note::

   Be sure to always open pickle files created with protocols >= 1 in binary mode.
   For the old ASCII-based pickle protocol 0 you can use either text mode or binary
   mode as long as you stay consistent.

   A pickle file written with protocol 0 in binary mode will contain lone linefeeds
   as line terminators and therefore will look "funny" when viewed in Notepad or
   other editors which do not support this format.

The :mod:`pickle` module provides the following functions to make the pickling
process more convenient:


.. function:: dump(obj, file[, protocol])

   Write a pickled representation of *obj* to the open file object *file*.  This is
   equivalent to ``Pickler(file, protocol).dump(obj)``.

   If the *protocol* parameter is omitted, protocol 0 is used. If *protocol* is
   specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest protocol
   version will be used.

   .. versionchanged:: 2.3
      Introduced the *protocol* parameter.

   *file* must have a :meth:`write` method that accepts a single string argument.
   It can thus be a file object opened for writing, a :mod:`StringIO` object, or
   any other custom object that meets this interface.


.. function:: load(file)

   Read a string from the open file object *file* and interpret it as a pickle data
   stream, reconstructing and returning the original object hierarchy.  This is
   equivalent to ``Unpickler(file).load()``.

   *file* must have two methods, a :meth:`read` method that takes an integer
   argument, and a :meth:`readline` method that requires no arguments.  Both
   methods should return a string.  Thus *file* can be a file object opened for
   reading, a :mod:`StringIO` object, or any other custom object that meets this
   interface.

   This function automatically determines whether the data stream was written in
   binary mode or not.


.. function:: dumps(obj[, protocol])

   Return the pickled representation of the object as a string, instead of writing
   it to a file.

   If the *protocol* parameter is omitted, protocol 0 is used. If *protocol* is
   specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest protocol
   version will be used.

   .. versionchanged:: 2.3
      The *protocol* parameter was added.


.. function:: loads(string)

   Read a pickled object hierarchy from a string.  Characters in the string past
   the pickled object's representation are ignored.

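
A minimal sketch of the four convenience functions follows.  It uses an
:class:`io.BytesIO` buffer as a stand-in for a file opened in binary mode (an
assumption made to keep the example self-contained); the record contents are
arbitrary.

```python
import io
import pickle

record = {'name': 'spam', 'values': [1, 2.5, None]}

# In-memory round trip with dumps() and loads().
payload = pickle.dumps(record, 2)
copy_from_string = pickle.loads(payload)

# Data past the end of the pickled representation is ignored by loads().
copy_with_trailer = pickle.loads(payload + b'ignored trailing data')

# File-style round trip with dump() and load().
buffer = io.BytesIO()
pickle.dump(record, buffer, 2)
buffer.seek(0)
copy_from_file = pickle.load(buffer)
```
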
The :mod:`pickle` module also defines three exceptions:


.. exception:: PickleError

   A common base class for the other exceptions defined below.  This inherits from
   :exc:`Exception`.


.. exception:: PicklingError

   This exception is raised when an unpicklable object is passed to the
   :meth:`dump` method.


.. exception:: UnpicklingError

   This exception is raised when there is a problem unpickling an object. Note that
   other exceptions may also be raised during unpickling, including (but not
   necessarily limited to) :exc:`AttributeError`, :exc:`EOFError`,
   :exc:`ImportError`, and :exc:`IndexError`.

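
One way to handle these exceptions is sketched below.  The helper names
``try_dumps`` and ``try_loads`` are hypothetical, and the extra exception types
caught on the pickling side are a defensive assumption rather than a documented
guarantee.

```python
import pickle

def try_dumps(obj):
    """Return the pickle for obj, or None if obj cannot be pickled."""
    try:
        return pickle.dumps(obj, 2)
    except (pickle.PicklingError, AttributeError, TypeError):
        return None

def try_loads(data):
    """Return the unpickled object, or None if data is not a usable pickle."""
    try:
        return pickle.loads(data)
    except (pickle.UnpicklingError, AttributeError, EOFError,
            ImportError, IndexError):
        return None

unpicklable = try_dumps(lambda x: x)   # a lambda has no importable name
empty_stream = try_loads(b'')          # an empty stream ends too early
round_trip = try_loads(try_dumps([1, 2, 3]))
```

Returning ``None`` on failure keeps the sketch short, but it cannot be told
apart from a pickled ``None``; real code would more likely re-raise or log the
error.
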
The :mod:`pickle` module also exports two callables [#]_, :class:`Pickler` and
:class:`Unpickler`:


.. class:: Pickler(file[, protocol])

   This takes a file-like object to which it will write a pickle data stream.

   If the *protocol* parameter is omitted, protocol 0 is used. If *protocol* is
   specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
   protocol version will be used.

   .. versionchanged:: 2.3
      Introduced the *protocol* parameter.

   *file* must have a :meth:`write` method that accepts a single string argument.
   It can thus be an open file object, a :mod:`StringIO` object, or any other
   custom object that meets this interface.

   :class:`Pickler` objects define one (or two) public methods:


   .. method:: dump(obj)

      Write a pickled representation of *obj* to the open file object given in the
      constructor.  Either the binary or ASCII format will be used, depending on the
      value of the *protocol* argument passed to the constructor.


   .. method:: clear_memo()

      Clears the pickler's "memo".  The memo is the data structure that remembers
      which objects the pickler has already seen, so that shared or recursive objects
      are pickled by reference and not by value.  This method is useful when re-using
      picklers.

      .. note::

         Prior to Python 2.3, :meth:`clear_memo` was only available on the picklers
         created by :mod:`cPickle`.  In the :mod:`pickle` module, picklers have an
         instance variable called :attr:`memo` which is a Python dictionary.  So to clear
         the memo for a :mod:`pickle` module pickler, you could do the following::

            mypickler.memo.clear()

         Code that does not need to support older versions of Python should simply use
         :meth:`clear_memo`.

It is possible to make multiple calls to the :meth:`dump` method of the same
:class:`Pickler` instance.  These must then be matched to the same number of
calls to the :meth:`load` method of the corresponding :class:`Unpickler`
instance.  If the same object is pickled by multiple :meth:`dump` calls, the
:meth:`load` calls will all yield references to the same object. [#]_

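
The pairing of :meth:`dump` and :meth:`load` calls might look like the
following sketch.  The in-memory :class:`io.BytesIO` stream and the sample
objects are assumptions made for illustration; the point is that one pickler
and one unpickler are used for the whole stream.

```python
import io
import pickle

shared = ['shared', 'list']

stream = io.BytesIO()
pickler = pickle.Pickler(stream, 2)
pickler.dump(shared)                # first dump() stores the list itself
pickler.dump({'again': shared})     # second dump() refers back to it via the memo

stream.seek(0)
unpickler = pickle.Unpickler(stream)
first = unpickler.load()            # one load() per dump(), in the same order
second = unpickler.load()

same_object = second['again'] is first
```

Calling ``pickler.clear_memo()`` between the two :meth:`dump` calls would make
each pickle self-contained instead, at the cost of losing the shared reference.
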
:class:`Unpickler` objects are defined as:


.. class:: Unpickler(file)

   This takes a file-like object from which it will read a pickle data stream.
   This class automatically determines whether the data stream was written in
   binary mode or not, so it does not need a flag as in the :class:`Pickler`
   factory.

   *file* must have two methods, a :meth:`read` method that takes an integer
   argument, and a :meth:`readline` method that requires no arguments.  Both
   methods should return a string.  Thus *file* can be a file object opened for
   reading, a :mod:`StringIO` object, or any other custom object that meets this
   interface.

   :class:`Unpickler` objects have one (or two) public methods:


   .. method:: load()

      Read a pickled object representation from the open file object given in
      the constructor, and return the reconstituted object hierarchy specified
      therein.

      This method automatically determines whether the data stream was written
      in binary mode or not.


   .. method:: noload()

      This is just like :meth:`load` except that it doesn't actually create any
      objects.  This is useful primarily for finding what's called "persistent
      ids" that may be referenced in a pickle data stream.  See section
      :ref:`pickle-protocol` below for more details.

      **Note:** the :meth:`noload` method is currently only available on
      :class:`Unpickler` objects created with the :mod:`cPickle` module.
      :mod:`pickle` module :class:`Unpickler`\ s do not have the :meth:`noload`
      method.


What can be pickled and unpickled?
----------------------------------

The following types can be pickled:

* ``None``, ``True``, and ``False``

* integers, long integers, floating point numbers, complex numbers

* normal and Unicode strings

* tuples, lists, sets, and dictionaries containing only picklable objects

* functions defined at the top level of a module

* built-in functions defined at the top level of a module

* classes that are defined at the top level of a module

* instances of such classes whose :attr:`__dict__` or the result of calling
  :meth:`__getstate__` is picklable  (see section :ref:`pickle-protocol` for
  details)

Attempts to pickle unpicklable objects will raise the :exc:`PicklingError`
exception; when this happens, an unspecified number of bytes may have already
been written to the underlying file. Trying to pickle a highly recursive data
structure may exceed the maximum recursion depth; a :exc:`RuntimeError` will be
raised in this case. You can carefully raise this limit with
:func:`sys.setrecursionlimit`.

Note that functions (built-in and user-defined) are pickled by "fully qualified"
name reference, not by value.  This means that only the function name is
pickled, along with the name of the module the function is defined in.  Neither
the function's code, nor any of its function attributes are pickled.  Thus the
defining module must be importable in the unpickling environment, and the module
must contain the named object, otherwise an exception will be raised. [#]_

Similarly, classes are pickled by named reference, so the same restrictions in
the unpickling environment apply.  Note that none of the class's code or data is
pickled, so in the following example the class attribute ``attr`` is not
restored in the unpickling environment::

   import pickle

   class Foo:
       attr = 'a class attr'

   picklestring = pickle.dumps(Foo)

These restrictions are why picklable functions and classes must be defined at
the top level of a module.

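
Pickling by name can be observed directly.  The sketch below uses
:class:`decimal.Decimal` purely as a convenient, importable top-level class;
any such class would do.

```python
import pickle
from decimal import Decimal

# Pickling a class stores only its module name and its own name.
payload = pickle.dumps(Decimal, 0)
mentions_module = b'decimal' in payload
mentions_name = b'Decimal' in payload

# Unpickling imports the module and looks the name up again, so the result
# is the very same class object, not a copy of its code.
restored = pickle.loads(payload)
same_class = restored is Decimal
```
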
Similarly, when class instances are pickled, their class's code and data are not
pickled along with them.  Only the instance data are pickled.  This is done on
purpose, so you can fix bugs in a class or add methods to the class and still
load objects that were created with an earlier version of the class.  If you
plan to have long-lived objects that will see many versions of a class, it may
be worthwhile to put a version number in the objects so that suitable
conversions can be made by the class's :meth:`__setstate__` method.

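
One possible shape for such a versioned class is sketched below.  The
``Account`` class, its attributes, and the version-1 layout (a ``balance`` in
whole units) are all hypothetical, chosen only to show a conversion step.

```python
import pickle

class Account(object):
    VERSION = 2   # bump whenever the stored layout changes

    def __init__(self, owner, balance_cents):
        self.owner = owner
        self.balance_cents = balance_cents

    def __getstate__(self):
        state = self.__dict__.copy()
        state['version'] = self.VERSION      # record the layout version
        return state

    def __setstate__(self, state):
        state = dict(state)
        version = state.pop('version', 1)    # pickles without a version are v1
        if version < 2:
            # Hypothetical v1 layout stored 'balance' in whole units.
            state['balance_cents'] = state.pop('balance') * 100
        self.__dict__.update(state)

# A current object round-trips unchanged.
current = pickle.loads(pickle.dumps(Account('bob', 250), 2))

# Simulate the state that a version-1 pickle would have carried.
legacy = Account.__new__(Account)
legacy.__setstate__({'owner': 'ann', 'balance': 5})
```
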

.. _pickle-protocol:

The pickle protocol
-------------------

.. currentmodule:: None

This section describes the "pickling protocol" that defines the interface
between the pickler/unpickler and the objects that are being serialized.  This
protocol provides a standard way for you to define, customize, and control how
your objects are serialized and de-serialized.  The description in this section
doesn't cover specific customizations that you can employ to make the unpickling
environment slightly safer from untrusted pickle data streams; see section
:ref:`pickle-sub` for more details.


.. _pickle-inst:

Pickling and unpickling normal class instances
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. method:: object.__getinitargs__()

   When a pickled class instance is unpickled, its :meth:`__init__` method is
   normally *not* invoked.  If it is desirable that the :meth:`__init__` method
   be called on unpickling, an old-style class can define a method
   :meth:`__getinitargs__`, which should return a *tuple* containing the
   arguments to be passed to the class constructor (:meth:`__init__` for
   example).  The :meth:`__getinitargs__` method is called at pickle time; the
   tuple it returns is incorporated in the pickle for the instance.

.. method:: object.__getnewargs__()

   New-style types can provide a :meth:`__getnewargs__` method that is used for
   protocol 2.  Implementing this method is needed if the type establishes some
   internal invariants when the instance is created, or if the memory allocation
   is affected by the values passed to the :meth:`__new__` method for the type
   (as it is for tuples and strings).  Instances of a :term:`new-style class`
   ``C`` are created using ::

      obj = C.__new__(C, *args)

   where *args* is the result of calling :meth:`__getnewargs__` on the original
   object; if there is no :meth:`__getnewargs__`, an empty tuple is assumed.

.. method:: object.__getstate__()

   Classes can further influence how their instances are pickled; if the class
   defines the method :meth:`__getstate__`, it is called and the return state is
   pickled as the contents for the instance, instead of the contents of the
   instance's dictionary.  If there is no :meth:`__getstate__` method, the
   instance's :attr:`__dict__` is pickled.

.. method:: object.__setstate__(state)

   Upon unpickling, if the class also defines the method :meth:`__setstate__`,
   it is called with the unpickled state. [#]_ If there is no
   :meth:`__setstate__` method, the pickled state must be a dictionary and its
   items are assigned to the new instance's dictionary.  If a class defines both
   :meth:`__getstate__` and :meth:`__setstate__`, the state object needn't be a
   dictionary and these methods can do what they want. [#]_

   .. note::

      For :term:`new-style class`\es, if :meth:`__getstate__` returns a false
      value, the :meth:`__setstate__` method will not be called.

.. note::

   At unpickling time, some methods like :meth:`__getattr__`,
   :meth:`__getattribute__`, or :meth:`__setattr__` may be called upon the
   instance.  In case those methods rely on some internal invariant being
   true, the type should implement either :meth:`__getinitargs__` or
   :meth:`__getnewargs__` to establish such an invariant; otherwise, neither
   :meth:`__new__` nor :meth:`__init__` will be called.

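
As an illustration of :meth:`__getnewargs__`, the following sketch defines a
hypothetical ``Celsius`` type whose :meth:`__new__` requires two arguments.
Without the :meth:`__getnewargs__` override, unpickling with protocol 2 would
have no way to supply the ``label`` argument.

```python
import pickle

class Celsius(float):
    """A float that also requires a label at creation time (hypothetical)."""

    def __new__(cls, degrees, label):
        self = super(Celsius, cls).__new__(cls, degrees)
        self.label = label
        return self

    def __getnewargs__(self):
        # Arguments handed back to __new__ when unpickling with protocol 2.
        return (float(self), self.label)

reading = Celsius(21.5, 'lab')
restored = pickle.loads(pickle.dumps(reading, 2))
```
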

Pickling and unpickling extension types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. method:: object.__reduce__()

   When the :class:`Pickler` encounters an object of a type it knows nothing
   about --- such as an extension type --- it looks in two places for a hint of
   how to pickle it.  One alternative is for the object to implement a
   :meth:`__reduce__` method.  If provided, at pickling time :meth:`__reduce__`
   will be called with no arguments, and it must return either a string or a
   tuple.

   If a string is returned, it names a global variable whose contents are
   pickled as normal.  The string returned by :meth:`__reduce__` should be the
   object's local name relative to its module; the pickle module searches the
   module namespace to determine the object's module.

   When a tuple is returned, it must be between two and five elements long.
   Optional elements can either be omitted, or ``None`` can be provided as their
   value.  The contents of this tuple are pickled as normal and used to
   reconstruct the object at unpickling time.  The semantics of each element
   are:

   * A callable object that will be called to create the initial version of the
     object.  The next element of the tuple will provide arguments for this
     callable, and later elements provide additional state information that will
     subsequently be used to fully reconstruct the pickled data.

     In the unpickling environment this object must be either a class, a
     callable registered as a "safe constructor" (see below), or it must have an
     attribute :attr:`__safe_for_unpickling__` with a true value. Otherwise, an
     :exc:`UnpicklingError` will be raised in the unpickling environment.  Note
     that as usual, the callable itself is pickled by name.

   * A tuple of arguments for the callable object.

     .. versionchanged:: 2.5
        Formerly, this argument could also be ``None``.

   * Optionally, the object's state, which will be passed to the object's
     :meth:`__setstate__` method as described in section :ref:`pickle-inst`.  If
     the object has no :meth:`__setstate__` method, then, as above, the value
     must be a dictionary and it will be added to the object's :attr:`__dict__`.

   * Optionally, an iterator (and not a sequence) yielding successive list
     items.  These list items will be pickled, and appended to the object using
     either ``obj.append(item)`` or ``obj.extend(list_of_items)``.  This is
     primarily used for list subclasses, but may be used by other classes as
     long as they have :meth:`append` and :meth:`extend` methods with the
     appropriate signature.  (Whether :meth:`append` or :meth:`extend` is used
     depends on which pickle protocol version is used as well as the number of
     items to append, so both must be supported.)

   * Optionally, an iterator (not a sequence) yielding successive dictionary
     items, which should be tuples of the form ``(key, value)``.  These items
     will be pickled and stored to the object using ``obj[key] = value``. This
     is primarily used for dictionary subclasses, but may be used by other
     classes as long as they implement :meth:`__setitem__`.

.. method:: object.__reduce_ex__(protocol)

   It is sometimes useful to know the protocol version when implementing
   :meth:`__reduce__`.  This can be done by implementing a method named
   :meth:`__reduce_ex__` instead of :meth:`__reduce__`. :meth:`__reduce_ex__`,
   when it exists, is called in preference over :meth:`__reduce__` (you may
   still provide :meth:`__reduce__` for backwards compatibility).  The
   :meth:`__reduce_ex__` method will be called with a single integer argument,
   the protocol version.

   The :class:`object` class implements both :meth:`__reduce__` and
   :meth:`__reduce_ex__`; however, if a subclass overrides :meth:`__reduce__`
   but not :meth:`__reduce_ex__`, the :meth:`__reduce_ex__` implementation
   detects this and calls :meth:`__reduce__`.

An alternative to implementing a :meth:`__reduce__` method on the object to be
pickled is to register the callable with the :mod:`copy_reg` module.  This
module provides a way for programs to register "reduction functions" and
constructors for user-defined types.   Reduction functions have the same
semantics and interface as the :meth:`__reduce__` method described above, except
that they are called with a single argument, the object to be pickled.

The registered constructor is deemed a "safe constructor" for purposes of
unpickling as described above.

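
The two approaches can be compared side by side.  In the sketch below the
``Point`` and ``Interval`` classes and the ``reduce_interval`` function are
hypothetical; the import fallback is only there because :mod:`copy_reg` is
named ``copyreg`` in Python 3.

```python
import pickle

try:
    import copy_reg               # Python 2 name, as used in this document
except ImportError:
    import copyreg as copy_reg    # the same module under its Python 3 name


class Point(object):
    """Hypothetical type that describes its own reconstruction."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __reduce__(self):
        # (callable, argument tuple): unpickling calls Point(x, y).
        return (Point, (self.x, self.y))


class Interval(object):
    """Hypothetical type that is left unmodified and registered instead."""

    def __init__(self, low, high):
        self.low = low
        self.high = high


def reduce_interval(interval):
    # Same return value as __reduce__, but the object arrives as an argument.
    return (Interval, (interval.low, interval.high))


copy_reg.pickle(Interval, reduce_interval)

point = pickle.loads(pickle.dumps(Point(1, 2), 2))
interval = pickle.loads(pickle.dumps(Interval(0, 10), 2))
```

Registration is the natural choice when the type cannot be edited, for example
when it comes from an extension module.
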

Pickling and unpickling external objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: persistent_id (pickle protocol)
   single: persistent_load (pickle protocol)

For the benefit of object persistence, the :mod:`pickle` module supports the
notion of a reference to an object outside the pickled data stream.  Such
objects are referenced by a "persistent id", which is just an arbitrary string
of printable ASCII characters. The resolution of such names is not defined by
the :mod:`pickle` module; it will delegate this resolution to user defined
functions on the pickler and unpickler. [#]_

To define external persistent id resolution, you need to set the
:attr:`persistent_id` attribute of the pickler object and the
:attr:`persistent_load` attribute of the unpickler object.

To pickle objects that have an external persistent id, the pickler must have a
custom :func:`persistent_id` method that takes an object as an argument and
returns either ``None`` or the persistent id for that object.  When ``None`` is
returned, the pickler simply pickles the object as normal.  When a persistent id
string is returned, the pickler will pickle that string, along with a marker so
that the unpickler will recognize the string as a persistent id.

To unpickle external objects, the unpickler must have a custom
:func:`persistent_load` function that takes a persistent id string and returns
the referenced object.

Here's a silly example that *might* shed more light::

   import pickle
   from cStringIO import StringIO

   src = StringIO()
   p = pickle.Pickler(src)

   def persistent_id(obj):
       if hasattr(obj, 'x'):
           return 'the value %d' % obj.x
       else:
           return None

   p.persistent_id = persistent_id

   class Integer:
       def __init__(self, x):
           self.x = x
       def __str__(self):
           return 'My name is integer %d' % self.x

   i = Integer(7)
   print i
   p.dump(i)

   datastream = src.getvalue()
   print repr(datastream)
   dst = StringIO(datastream)

   up = pickle.Unpickler(dst)

   class FancyInteger(Integer):
       def __str__(self):
           return 'I am the integer %d' % self.x

   def persistent_load(persid):
       if persid.startswith('the value '):
           value = int(persid.split()[2])
           return FancyInteger(value)
       else:
           raise pickle.UnpicklingError, 'Invalid persistent id'

   up.persistent_load = persistent_load

   j = up.load()
   print j

In the :mod:`cPickle` module, the unpickler's :attr:`persistent_load` attribute
can also be set to a Python list, in which case, when the unpickler reaches a
persistent id, the persistent id string will simply be appended to this list.
This functionality exists so that a pickle data stream can be "sniffed" for
object references without actually instantiating all the objects in a pickle.
[#]_  Setting :attr:`persistent_load` to a list is usually used in conjunction
with the :meth:`noload` method on the Unpickler.

.. BAW: Both pickle and cPickle support something called inst_persistent_id()
   which appears to give unknown types a second shot at producing a persistent
   id.  Since Jim Fulton can't remember why it was added or what it's for, I'm
   leaving it undocumented.


.. _pickle-sub:

Subclassing Unpicklers
----------------------

.. index::
   single: load_global() (pickle protocol)
   single: find_global() (pickle protocol)

By default, unpickling will import any class that it finds in the pickle data.
You can control exactly what gets unpickled and what gets called by customizing
your unpickler.  Unfortunately, exactly how you do this is different depending
on whether you're using :mod:`pickle` or :mod:`cPickle`. [#]_

In the :mod:`pickle` module, you need to derive a subclass from
:class:`Unpickler`, overriding the :meth:`load_global` method.
:meth:`load_global` should read two lines from the pickle data stream where the
first line will be the name of the module containing the class and the second
line will be the name of the instance's class.  It then looks up the class,
possibly importing the module and digging out the attribute, then it appends
what it finds to the unpickler's stack.  Later on, this class will be assigned
to the :attr:`__class__` attribute of an empty class, as a way of magically
creating an instance without calling its class's :meth:`__init__`. Your job
(should you choose to accept it) would be to have :meth:`load_global` push onto
the unpickler's stack a known safe version of any class you deem safe to
unpickle.  It is up to you to produce such a class.  Or you could raise an error
if you want to disallow all unpickling of instances.  If this sounds like a
hack, you're right.  Refer to the source code to make this work.

Things are a little cleaner with :mod:`cPickle`, but not by much. To control
what gets unpickled, you can set the unpickler's :attr:`find_global` attribute
to a function or ``None``.  If it is ``None`` then any attempts to unpickle
instances will raise an :exc:`UnpicklingError`.  If it is a function, then it
should accept a module name and a class name, and return the corresponding class
object.  It is responsible for looking up the class and performing any necessary
imports, and it may raise an error to prevent instances of the class from being
unpickled.

The moral of the story is that you should be really careful about the source of
the strings your application unpickles.

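
A whitelist-based unpickler might be sketched as follows.  Note the
assumption: instead of overriding :meth:`load_global` itself, the sketch
overrides :meth:`find_class`, the lookup helper that :meth:`load_global`
delegates to in the pure-Python :mod:`pickle` implementation (and the hook
that Python 3 documents for this purpose).  The ``SAFE_GLOBALS`` set and the
``restricted_loads`` helper are hypothetical names, and this approach does not
apply to :mod:`cPickle`, which uses :attr:`find_global` instead.

```python
import io
import os
import pickle
from decimal import Decimal

# Only these (module, name) pairs may be imported while unpickling.
SAFE_GLOBALS = set([('decimal', 'Decimal')])


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if (module, name) not in SAFE_GLOBALS:
            raise pickle.UnpicklingError(
                'global %s.%s is forbidden' % (module, name))
        return pickle.Unpickler.find_class(self, module, name)


def restricted_loads(data):
    """Unpickle data, refusing any global outside SAFE_GLOBALS."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


plain = restricted_loads(pickle.dumps([1, 2, 3], 2))
allowed = restricted_loads(pickle.dumps(Decimal('1.5'), 2))

try:
    restricted_loads(pickle.dumps(os.system, 2))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Even with such a filter in place, the warning above still applies: a whitelist
narrows what a hostile pickle can reach, but it does not make unpickling
untrusted data safe in general.
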

.. _pickle-example:

Example
-------

For the simplest code, use the :func:`dump` and :func:`load` functions.  Note
that a self-referencing list is pickled and restored correctly. ::

   import pickle

   data1 = {'a': [1, 2.0, 3, 4+6j],
            'b': ('string', u'Unicode string'),
            'c': None}

   selfref_list = [1, 2, 3]
   selfref_list.append(selfref_list)

   output = open('data.pkl', 'wb')

   # Pickle dictionary using protocol 0.
   pickle.dump(data1, output)

   # Pickle the list using the highest protocol available.
   pickle.dump(selfref_list, output, -1)

   output.close()

The following example reads the resulting pickled data.  When reading a
pickle-containing file, you should open the file in binary mode because you
can't be sure if the ASCII or binary format was used. ::

   import pprint, pickle

   pkl_file = open('data.pkl', 'rb')

   data1 = pickle.load(pkl_file)
   pprint.pprint(data1)

   data2 = pickle.load(pkl_file)
   pprint.pprint(data2)

   pkl_file.close()

Here's a larger example that shows how to modify pickling behavior for a class.
The :class:`TextReader` class opens a text file, and returns the line number and
line contents each time its :meth:`readline` method is called. If a
:class:`TextReader` instance is pickled, all attributes *except* the file object
member are saved. When the instance is unpickled, the file is reopened, and
reading resumes from the last location. The :meth:`__setstate__` and
:meth:`__getstate__` methods are used to implement this behavior. ::

   #!/usr/local/bin/python

   class TextReader:
       """Print and number lines in a text file."""
       def __init__(self, file):
           self.file = file
           self.fh = open(file)
           self.lineno = 0

       def readline(self):
           self.lineno = self.lineno + 1
           line = self.fh.readline()
           if not line:
               return None
           if line.endswith("\n"):
               line = line[:-1]
           return "%d: %s" % (self.lineno, line)

       def __getstate__(self):
           odict = self.__dict__.copy() # copy the dict since we change it
           del odict['fh']              # remove filehandle entry
           return odict

       def __setstate__(self, state):
           fh = open(state['file'])     # reopen file
           count = state['lineno']      # read from file...
           while count:                 # until line count is restored
               fh.readline()
               count = count - 1
           self.__dict__.update(state)  # update attributes
           self.fh = fh                 # save the file object

A sample usage might be something like this::

   >>> import TextReader
   >>> obj = TextReader.TextReader("")
   >>> obj.readline()
   '1: #!/usr/local/bin/python'
   >>> obj.readline()
   '2: '
   >>> obj.readline()
   '3: class TextReader:'
   >>> import pickle
   >>> pickle.dump(obj, open('save.p', 'wb'))

If you want to see that :mod:`pickle` works across Python processes, start
another Python session before continuing.  What follows can happen from either
the same process or a new process. ::

   >>> import pickle
   >>> reader = pickle.load(open('save.p', 'rb'))
   >>> reader.readline()
   '4:     """Print and number lines in a text file."""'


.. seealso::

   Module :mod:`copy_reg`
      Pickle interface constructor registration for extension types.

   Module :mod:`shelve`
      Indexed databases of objects; uses :mod:`pickle`.

   Module :mod:`copy`
      Shallow and deep object copying.

   Module :mod:`marshal`
      High-performance serialization of built-in types.


:mod:`cPickle` --- A faster :mod:`pickle`
=========================================

.. module:: cPickle
   :synopsis: Faster version of pickle, but not subclassable.
.. moduleauthor:: Jim Fulton <>
.. sectionauthor:: Fred L. Drake, Jr. <>


.. index:: module: pickle

The :mod:`cPickle` module supports serialization and de-serialization of Python
objects, providing an interface and functionality nearly identical to the
:mod:`pickle` module.  There are several differences, the most important being
performance and subclassability.

First, :mod:`cPickle` can be up to 1000 times faster than :mod:`pickle` because
the former is implemented in C.  Second, in the :mod:`cPickle` module the
callables :func:`Pickler` and :func:`Unpickler` are functions, not classes.
This means that you cannot use them to derive custom pickling and unpickling
subclasses.  Most applications have no need for this functionality and should
benefit from the greatly improved performance of the :mod:`cPickle` module.

The pickle data streams produced by :mod:`pickle` and :mod:`cPickle` are
compatible, so it is possible to use :mod:`pickle` and :mod:`cPickle`
interchangeably with existing pickles. [#]_

There are additional minor differences in API between :mod:`cPickle` and
:mod:`pickle`; however, for most applications they are interchangeable.  More
documentation is provided in the :mod:`pickle` module documentation, which
includes a list of the documented differences.

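
Because the two modules share a common interface, code that uses only the
module-level functions can prefer :mod:`cPickle` and fall back to :mod:`pickle`
when the C module is unavailable.  The following is a sketch of that idiom; the
``pickle`` alias and the ``record`` sample data are choices made here for
illustration.

```python
try:
    import cPickle as pickle   # C implementation, when available
except ImportError:
    import pickle              # fallback offering the same interface

record = {'name': 'example', 'values': [1, 2.0, None]}

# Only module-level functions are used, so either module works here.
payload = pickle.dumps(record, pickle.HIGHEST_PROTOCOL)
roundtrip = pickle.loads(payload)
```

Code that needs to subclass :class:`Pickler` or :class:`Unpickler` should
import :mod:`pickle` directly instead, since the :mod:`cPickle` callables
cannot be subclassed.
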
.. rubric:: Footnotes

.. [#] Don't confuse this with the :mod:`marshal` module.

.. [#] In the :mod:`pickle` module these callables are classes, which you could
   subclass to customize the behavior.  However, in the :mod:`cPickle` module these
   callables are factory functions and so cannot be subclassed.  One common reason
   to subclass is to control what objects can actually be unpickled.  See section
   :ref:`pickle-sub` for more details.

.. [#] *Warning*: this is intended for pickling multiple objects without intervening
   modifications to the objects or their parts.  If you modify an object and then
   pickle it again using the same :class:`Pickler` instance, the object is not
   pickled again --- a reference to it is pickled and the :class:`Unpickler` will
   return the old value, not the modified one. There are two problems here: (1)
   detecting changes, and (2) marshalling a minimal set of changes.  Garbage
   collection may also become a problem here.

.. [#] The exception raised will likely be an :exc:`ImportError` or an
   :exc:`AttributeError` but it could be something else.

.. [#] These methods can also be used to implement copying class instances.

.. [#] This protocol is also used by the shallow and deep copying operations defined in
   the :mod:`copy` module.

.. [#] The actual mechanism for associating these user-defined functions is slightly
   different for :mod:`pickle` and :mod:`cPickle`.  The description given here
   works the same for both implementations.  Users of the :mod:`pickle` module
   could also use subclassing to effect the same results, overriding the
   :meth:`persistent_id` and :meth:`persistent_load` methods in the derived
   classes.

.. [#] We'll leave you with the image of Guido and Jim sitting around sniffing pickles
   in their living rooms.

.. [#] A word of caution: the mechanisms described here use internal attributes and
   methods, which are subject to change in future versions of Python.  We intend to
   someday provide a common interface for controlling this behavior, which will
   work in either :mod:`pickle` or :mod:`cPickle`.

.. [#] Since the pickle data format is actually a tiny stack-oriented programming
   language, and some freedom is taken in the encodings of certain objects, it is
   possible that the two modules produce different data streams for the same input
   objects.  However, it is guaranteed that they will always be able to read each
   other's data streams.