  1.. _tarfile-mod:
  2
  3:mod:`tarfile` --- Read and write tar archive files
  4===================================================
  5
  6.. module:: tarfile
  7   :synopsis: Read and write tar-format archive files.
  8
  9
 10.. versionadded:: 2.3
 11
.. moduleauthor:: Lars Gustäbel <lars@gustaebel.de>
.. sectionauthor:: Lars Gustäbel <lars@gustaebel.de>
 14
 15
 16The :mod:`tarfile` module makes it possible to read and write tar
 17archives, including those using gzip or bz2 compression.
 18(:file:`.zip` files can be read and written using the :mod:`zipfile` module.)
 19
 20Some facts and figures:
 21
 22* reads and writes :mod:`gzip` and :mod:`bz2` compressed archives.
 23
 24* read/write support for the POSIX.1-1988 (ustar) format.
 25
 26* read/write support for the GNU tar format including *longname* and *longlink*
 27  extensions, read-only support for the *sparse* extension.
 28
 29* read/write support for the POSIX.1-2001 (pax) format.
 30
 31  .. versionadded:: 2.6
 32
 33* handles directories, regular files, hardlinks, symbolic links, fifos,
 34  character devices and block devices and is able to acquire and restore file
 35  information like timestamp, access permissions and owner.
 36
 37
 38.. function:: open(name=None, mode='r', fileobj=None, bufsize=10240, \*\*kwargs)
 39
 40   Return a :class:`TarFile` object for the pathname *name*. For detailed
 41   information on :class:`TarFile` objects and the keyword arguments that are
 42   allowed, see :ref:`tarfile-objects`.
 43
   *mode* must be a string of the form ``'filemode[:compression]'`` and defaults
   to ``'r'``. Here is a full list of mode combinations:
 46
 47   +------------------+---------------------------------------------+
 48   | mode             | action                                      |
 49   +==================+=============================================+
 50   | ``'r' or 'r:*'`` | Open for reading with transparent           |
 51   |                  | compression (recommended).                  |
 52   +------------------+---------------------------------------------+
 53   | ``'r:'``         | Open for reading exclusively without        |
 54   |                  | compression.                                |
 55   +------------------+---------------------------------------------+
 56   | ``'r:gz'``       | Open for reading with gzip compression.     |
 57   +------------------+---------------------------------------------+
 58   | ``'r:bz2'``      | Open for reading with bzip2 compression.    |
 59   +------------------+---------------------------------------------+
 60   | ``'a' or 'a:'``  | Open for appending with no compression. The |
 61   |                  | file is created if it does not exist.       |
 62   +------------------+---------------------------------------------+
 63   | ``'w' or 'w:'``  | Open for uncompressed writing.              |
 64   +------------------+---------------------------------------------+
 65   | ``'w:gz'``       | Open for gzip compressed writing.           |
 66   +------------------+---------------------------------------------+
 67   | ``'w:bz2'``      | Open for bzip2 compressed writing.          |
 68   +------------------+---------------------------------------------+
 69
 70   Note that ``'a:gz'`` or ``'a:bz2'`` is not possible. If *mode* is not suitable
 71   to open a certain (compressed) file for reading, :exc:`ReadError` is raised. Use
 72   *mode* ``'r'`` to avoid this.  If a compression method is not supported,
 73   :exc:`CompressionError` is raised.
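
   As a hedged illustration of the mode strings above, the following sketch
   writes a small gzip compressed archive, reopens it with the transparent
   ``'r'`` mode and then shows that insisting on ``'r:'`` (no compression) for
   the same file raises :exc:`ReadError`. The helper name ``roundtrip_modes``
   and the file names are assumptions made for this example only::

      import os
      import shutil
      import tarfile
      import tempfile

      def roundtrip_modes():
          """Write with 'w:gz', read back with 'r', then try an unsuitable mode."""
          workdir = tempfile.mkdtemp()
          try:
              source = os.path.join(workdir, "hello.txt")
              with open(source, "wb") as f:
                  f.write(b"hello\n")
              archive = os.path.join(workdir, "sample.tar.gz")

              tar = tarfile.open(archive, "w:gz")
              tar.add(source, arcname="hello.txt")
              tar.close()

              # 'r' detects the gzip compression on its own.
              tar = tarfile.open(archive, "r")
              names = tar.getnames()
              tar.close()

              # 'r:' accepts uncompressed archives only, so it must fail here.
              try:
                  tarfile.open(archive, "r:")
                  wrong_mode_failed = False
              except tarfile.ReadError:
                  wrong_mode_failed = True
              return names, wrong_mode_failed
          finally:
              shutil.rmtree(workdir)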
 74
 75   If *fileobj* is specified, it is used as an alternative to a file object opened
 76   for *name*. It is supposed to be at position 0.
 77
   For special purposes, there is a second format for *mode*:
   ``'filemode|[compression]'``.  :func:`tarfile.open` will return a :class:`TarFile`
   object that processes its data as a stream of blocks.  No random seeking will
   be done on the file. If given, *fileobj* may be any object that has a
   :meth:`read` or :meth:`write` method (depending on the *mode*). *bufsize*
   specifies the blocksize and defaults to ``20 * 512`` bytes. Use this variant
   in combination with e.g. ``sys.stdin``, a socket file object or a tape
   device. However, such a :class:`TarFile` object is limited in that it does
   not support random access, see :ref:`tar-examples`.  The currently
   possible modes are:
 88
   +-------------+--------------------------------------------+
   | Mode        | Action                                     |
   +=============+============================================+
   | ``'r|*'``   | Open a *stream* of tar blocks for reading  |
   |             | with transparent compression.              |
   +-------------+--------------------------------------------+
   | ``'r|'``    | Open a *stream* of uncompressed tar blocks |
   |             | for reading.                               |
   +-------------+--------------------------------------------+
   | ``'r|gz'``  | Open a gzip compressed *stream* for        |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'r|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'w|'``    | Open an uncompressed *stream* for writing. |
   +-------------+--------------------------------------------+
   | ``'w|gz'``  | Open a gzip compressed *stream* for        |
   |             | writing.                                   |
   +-------------+--------------------------------------------+
   | ``'w|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | writing.                                   |
   +-------------+--------------------------------------------+
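
   A minimal sketch of the stream variant, assuming an in-memory
   ``io.BytesIO`` buffer stands in for a pipe or socket. The function names
   ``pack_stream`` and ``unpack_stream`` are hypothetical. Note that each
   member is read immediately while iterating, because a stream cannot seek
   back to earlier members::

      import io
      import tarfile

      def pack_stream(files):
          """Write a dict of name -> bytes as a gzip compressed tar stream."""
          buf = io.BytesIO()
          tar = tarfile.open(mode="w|gz", fileobj=buf)
          for name in sorted(files):
              info = tarfile.TarInfo(name)
              info.size = len(files[name])
              tar.addfile(info, io.BytesIO(files[name]))
          tar.close()
          return buf.getvalue()

      def unpack_stream(data):
          """Read the stream back sequentially; 'r|*' detects the compression."""
          result = {}
          tar = tarfile.open(mode="r|*", fileobj=io.BytesIO(data))
          for info in tar:
              if info.isreg():
                  result[info.name] = tar.extractfile(info).read()
          tar.close()
          return result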
112
113
114.. class:: TarFile
115
   Class for reading and writing tar archives. Do not use this class directly;
   use :func:`tarfile.open` instead. See :ref:`tarfile-objects`.
118
119
120.. function:: is_tarfile(name)
121
   Return :const:`True` if *name* is a tar archive file that the :mod:`tarfile`
   module can read.
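
   For illustration, the following sketch creates one real archive and one
   plain text file in a temporary directory and checks both. The helper name
   ``classify_files`` and the file names are assumptions for this example::

      import os
      import shutil
      import tarfile
      import tempfile

      def classify_files():
          """Return is_tarfile() results for an archive and for a text file."""
          workdir = tempfile.mkdtemp()
          try:
              text_path = os.path.join(workdir, "notes.txt")
              with open(text_path, "wb") as f:
                  f.write(b"just some text, not an archive\n")

              tar_path = os.path.join(workdir, "notes.tar")
              tar = tarfile.open(tar_path, "w")
              tar.add(text_path, arcname="notes.txt")
              tar.close()

              return tarfile.is_tarfile(tar_path), tarfile.is_tarfile(text_path)
          finally:
              shutil.rmtree(workdir)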
124
125
126.. class:: TarFileCompat(filename, mode='r', compression=TAR_PLAIN)
127
128   Class for limited access to tar archives with a :mod:`zipfile`\ -like interface.
129   Please consult the documentation of the :mod:`zipfile` module for more details.
130   *compression* must be one of the following constants:
131
132
133   .. data:: TAR_PLAIN
134
135      Constant for an uncompressed tar archive.
136
137
138   .. data:: TAR_GZIPPED
139
140      Constant for a :mod:`gzip` compressed tar archive.
141
142
143   .. deprecated:: 2.6
144      The :class:`TarFileCompat` class has been deprecated for removal in Python 3.0.
145
146
147.. exception:: TarError
148
149   Base class for all :mod:`tarfile` exceptions.
150
151
152.. exception:: ReadError
153
   Is raised when a tar archive is opened that either cannot be handled by the
   :mod:`tarfile` module or is somehow invalid.
156
157
158.. exception:: CompressionError
159
160   Is raised when a compression method is not supported or when the data cannot be
161   decoded properly.
162
163
164.. exception:: StreamError
165
166   Is raised for the limitations that are typical for stream-like :class:`TarFile`
167   objects.
168
169
170.. exception:: ExtractError
171
172   Is raised for *non-fatal* errors when using :meth:`TarFile.extract`, but only if
173   :attr:`TarFile.errorlevel`\ ``== 2``.
174
175
176.. exception:: HeaderError
177
178   Is raised by :meth:`TarInfo.frombuf` if the buffer it gets is invalid.
179
180   .. versionadded:: 2.6
181
182
183Each of the following constants defines a tar archive format that the
184:mod:`tarfile` module is able to create. See section :ref:`tar-formats` for
185details.
186
187
188.. data:: USTAR_FORMAT
189
190   POSIX.1-1988 (ustar) format.
191
192
193.. data:: GNU_FORMAT
194
195   GNU tar format.
196
197
198.. data:: PAX_FORMAT
199
200   POSIX.1-2001 (pax) format.
201
202
203.. data:: DEFAULT_FORMAT
204
205   The default format for creating archives. This is currently :const:`GNU_FORMAT`.
206
207
208The following variables are available on module level:
209
210
211.. data:: ENCODING
212
213   The default character encoding i.e. the value from either
214   :func:`sys.getfilesystemencoding` or :func:`sys.getdefaultencoding`.
215
216
217.. seealso::
218
219   Module :mod:`zipfile`
220      Documentation of the :mod:`zipfile` standard module.
221
222   `GNU tar manual, Basic Tar Format <http://www.gnu.org/software/tar/manual/html_node/Standard.html>`_
223      Documentation for tar archive files, including GNU tar extensions.
224
225
226.. _tarfile-objects:
227
228TarFile Objects
229---------------
230
231The :class:`TarFile` object provides an interface to a tar archive. A tar
232archive is a sequence of blocks. An archive member (a stored file) is made up of
233a header block followed by data blocks. It is possible to store a file in a tar
234archive several times. Each archive member is represented by a :class:`TarInfo`
235object, see :ref:`tarinfo-objects` for details.
236
237
238.. class:: TarFile(name=None, mode='r', fileobj=None, format=DEFAULT_FORMAT, tarinfo=TarInfo, dereference=False, ignore_zeros=False, encoding=ENCODING, errors=None, pax_headers=None, debug=0, errorlevel=0)
239
240   All following arguments are optional and can be accessed as instance attributes
241   as well.
242
243   *name* is the pathname of the archive. It can be omitted if *fileobj* is given.
244   In this case, the file object's :attr:`name` attribute is used if it exists.
245
246   *mode* is either ``'r'`` to read from an existing archive, ``'a'`` to append
247   data to an existing file or ``'w'`` to create a new file overwriting an existing
248   one.
249
250   If *fileobj* is given, it is used for reading or writing data. If it can be
251   determined, *mode* is overridden by *fileobj*'s mode. *fileobj* will be used
252   from position 0.
253
254   .. note::
255
      *fileobj* is not closed when :class:`TarFile` is closed.
257
258   *format* controls the archive format. It must be one of the constants
259   :const:`USTAR_FORMAT`, :const:`GNU_FORMAT` or :const:`PAX_FORMAT` that are
260   defined at module level.
261
262   .. versionadded:: 2.6
263
264   The *tarinfo* argument can be used to replace the default :class:`TarInfo` class
265   with a different one.
266
267   .. versionadded:: 2.6
268
269   If *dereference* is :const:`False`, add symbolic and hard links to the archive. If it
270   is :const:`True`, add the content of the target files to the archive. This has no
271   effect on systems that do not support symbolic links.
272
273   If *ignore_zeros* is :const:`False`, treat an empty block as the end of the archive.
274   If it is :const:`True`, skip empty (and invalid) blocks and try to get as many members
275   as possible. This is only useful for reading concatenated or damaged archives.
276
277   *debug* can be set from ``0`` (no debug messages) up to ``3`` (all debug
278   messages). The messages are written to ``sys.stderr``.
279
   If *errorlevel* is ``0``, all errors are ignored when using :meth:`TarFile.extract`.
   Nevertheless, they appear as error messages in the debug output when debugging
   is enabled.  If ``1``, all *fatal* errors are raised as :exc:`OSError` or
   :exc:`IOError` exceptions. If ``2``, all *non-fatal* errors are raised as
   :exc:`TarError` exceptions as well.
285
286   The *encoding* and *errors* arguments control the way strings are converted to
287   unicode objects and vice versa. The default settings will work for most users.
288   See section :ref:`tar-unicode` for in-depth information.
289
290   .. versionadded:: 2.6
291
292   The *pax_headers* argument is an optional dictionary of unicode strings which
293   will be added as a pax global header if *format* is :const:`PAX_FORMAT`.
294
295   .. versionadded:: 2.6
296
297
298.. method:: TarFile.open(...)
299
300   Alternative constructor. The :func:`tarfile.open` function is actually a
301   shortcut to this classmethod.
302
303
304.. method:: TarFile.getmember(name)
305
   Return a :class:`TarInfo` object for member *name*. If *name* cannot be found
   in the archive, :exc:`KeyError` is raised.
308
309   .. note::
310
311      If a member occurs more than once in the archive, its last occurrence is assumed
312      to be the most up-to-date version.
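
   The behaviour described in the note can be sketched as follows, assuming an
   in-memory archive that stores the same name twice. The helpers
   ``add_bytes`` and ``latest_version`` are hypothetical names used only
   here::

      import io
      import tarfile

      def add_bytes(tar, name, data):
          """Store *data* in *tar* under *name* using a hand-built TarInfo."""
          info = tarfile.TarInfo(name)
          info.size = len(data)
          tar.addfile(info, io.BytesIO(data))

      def latest_version():
          buf = io.BytesIO()
          tar = tarfile.open(fileobj=buf, mode="w")
          add_bytes(tar, "note.txt", b"first version")
          add_bytes(tar, "note.txt", b"second version")
          tar.close()

          buf.seek(0)
          tar = tarfile.open(fileobj=buf, mode="r")
          names = tar.getnames()              # both entries are listed
          member = tar.getmember("note.txt")  # the last occurrence is returned
          content = tar.extractfile(member).read()
          tar.close()
          return names, content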
313
314
315.. method:: TarFile.getmembers()
316
317   Return the members of the archive as a list of :class:`TarInfo` objects. The
318   list has the same order as the members in the archive.
319
320
321.. method:: TarFile.getnames()
322
323   Return the members as a list of their names. It has the same order as the list
324   returned by :meth:`getmembers`.
325
326
327.. method:: TarFile.list(verbose=True)
328
329   Print a table of contents to ``sys.stdout``. If *verbose* is :const:`False`,
330   only the names of the members are printed. If it is :const:`True`, output
331   similar to that of :program:`ls -l` is produced.
332
333
334.. method:: TarFile.next()
335
   Return the next member of the archive as a :class:`TarInfo` object, when
   :class:`TarFile` is opened for reading. Return :const:`None` if no more
   members are available.
339
340
341.. method:: TarFile.extractall(path=".", members=None)
342
343   Extract all members from the archive to the current working directory or
344   directory *path*. If optional *members* is given, it must be a subset of the
345   list returned by :meth:`getmembers`. Directory information like owner,
346   modification time and permissions are set after all members have been extracted.
347   This is done to work around two problems: A directory's modification time is
348   reset each time a file is created in it. And, if a directory's permissions do
349   not allow writing, extracting files to it will fail.
350
351   .. warning::
352
353      Never extract archives from untrusted sources without prior inspection.
354      It is possible that files are created outside of *path*, e.g. members
355      that have absolute filenames starting with ``"/"`` or filenames with two
356      dots ``".."``.
357
358   .. versionadded:: 2.5
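
   One possible form of the "prior inspection" mentioned in the warning is
   sketched below. The generator ``safe_members`` is a hypothetical helper,
   not part of the module, and it only checks member names. It does not
   examine link targets or device files, so it should be read as a starting
   point rather than a complete defence::

      import io
      import os
      import tarfile
      import tempfile

      def safe_members(tar, path="."):
          """Yield only members whose names resolve to a location inside *path*."""
          base = os.path.realpath(path)
          prefix = base.rstrip(os.sep) + os.sep
          for info in tar.getmembers():
              target = os.path.realpath(os.path.join(base, info.name))
              if target == base or target.startswith(prefix):
                  yield info

      def demo_names():
          """Build a small archive with hostile names and list what survives."""
          buf = io.BytesIO()
          tar = tarfile.open(fileobj=buf, mode="w")
          for name in ["ok.txt", "../evil.txt", "/abs.txt"]:
              info = tarfile.TarInfo(name)
              info.size = 2
              tar.addfile(info, io.BytesIO(b"hi"))
          tar.close()

          buf.seek(0)
          tar = tarfile.open(fileobj=buf, mode="r")
          # The destination is an assumed directory name; nothing is extracted.
          dest = os.path.join(tempfile.gettempdir(), "extract-here")
          names = [info.name for info in safe_members(tar, dest)]
          tar.close()
          return names

   With such a filter, a call like
   ``tar.extractall(path, members=safe_members(tar, path))`` skips the
   rejected entries.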
359
360
361.. method:: TarFile.extract(member, path="")
362
363   Extract a member from the archive to the current working directory, using its
364   full name. Its file information is extracted as accurately as possible. *member*
365   may be a filename or a :class:`TarInfo` object. You can specify a different
366   directory using *path*.
367
368   .. note::
369
370      The :meth:`extract` method does not take care of several extraction issues.
371      In most cases you should consider using the :meth:`extractall` method.
372
373   .. warning::
374
375      See the warning for :meth:`extractall`.
376
377
378.. method:: TarFile.extractfile(member)
379
380   Extract a member from the archive as a file object. *member* may be a filename
381   or a :class:`TarInfo` object. If *member* is a regular file, a file-like object
382   is returned. If *member* is a link, a file-like object is constructed from the
383   link's target. If *member* is none of the above, :const:`None` is returned.
384
385   .. note::
386
387      The file-like object is read-only.  It provides the methods
388      :meth:`read`, :meth:`readline`, :meth:`readlines`, :meth:`seek`, :meth:`tell`,
389      and :meth:`close`, and also supports iteration over its lines.
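
   A short sketch of reading a member without writing it to disk, assuming an
   archive built in memory. The function name ``read_member_lines`` and the
   member name ``log.txt`` are illustrative only::

      import io
      import tarfile

      def read_member_lines():
          data = b"one\ntwo\n"
          buf = io.BytesIO()
          tar = tarfile.open(fileobj=buf, mode="w")
          info = tarfile.TarInfo("log.txt")
          info.size = len(data)
          tar.addfile(info, io.BytesIO(data))
          tar.close()

          buf.seek(0)
          tar = tarfile.open(fileobj=buf, mode="r")
          member_file = tar.extractfile("log.txt")  # read-only file-like object
          lines = member_file.readlines()
          member_file.close()
          tar.close()
          return lines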
390
391
392.. method:: TarFile.add(name, arcname=None, recursive=True, exclude=None)
393
394   Add the file *name* to the archive. *name* may be any type of file (directory,
395   fifo, symbolic link, etc.). If given, *arcname* specifies an alternative name
396   for the file in the archive. Directories are added recursively by default. This
397   can be avoided by setting *recursive* to :const:`False`. If *exclude* is given
398   it must be a function that takes one filename argument and returns a boolean
399   value. Depending on this value the respective file is either excluded
400   (:const:`True`) or added (:const:`False`).
401
402   .. versionchanged:: 2.6
403      Added the *exclude* parameter.
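
   The effect of *arcname* and *recursive* can be sketched as follows. The
   directory layout and the helper name ``archive_names`` are assumptions for
   this example, and the *exclude* parameter is deliberately left out::

      import os
      import shutil
      import tarfile
      import tempfile

      def archive_names(recursive):
          """Add a directory under the name 'project' and list the result."""
          workdir = tempfile.mkdtemp()
          try:
              src = os.path.join(workdir, "src")
              os.mkdir(src)
              with open(os.path.join(src, "a.txt"), "wb") as f:
                  f.write(b"data")

              archive = os.path.join(workdir, "out.tar")
              tar = tarfile.open(archive, "w")
              tar.add(src, arcname="project", recursive=recursive)
              tar.close()

              tar = tarfile.open(archive, "r")
              names = tar.getnames()
              tar.close()
              return names
          finally:
              shutil.rmtree(workdir)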
404
405
406.. method:: TarFile.addfile(tarinfo, fileobj=None)
407
408   Add the :class:`TarInfo` object *tarinfo* to the archive. If *fileobj* is given,
409   ``tarinfo.size`` bytes are read from it and added to the archive.  You can
410   create :class:`TarInfo` objects using :meth:`gettarinfo`.
411
412   .. note::
413
      On Windows platforms, *fileobj* should always be opened with mode ``'rb'`` to
      avoid confusion about the file size.
416
417
418.. method:: TarFile.gettarinfo(name=None, arcname=None, fileobj=None)
419
420   Create a :class:`TarInfo` object for either the file *name* or the file object
421   *fileobj* (using :func:`os.fstat` on its file descriptor).  You can modify some
422   of the :class:`TarInfo`'s attributes before you add it using :meth:`addfile`.
423   If given, *arcname* specifies an alternative name for the file in the archive.
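
   As a sketch of the pattern described above, the following code obtains a
   :class:`TarInfo` object for a temporary file, overrides the ownership and
   timestamp fields, and stores the result with :meth:`addfile`. The name
   ``add_with_fixed_owner`` and the chosen attribute values are assumptions
   made for illustration::

      import io
      import os
      import tarfile
      import tempfile

      def add_with_fixed_owner():
          buf = io.BytesIO()
          fd, path = tempfile.mkstemp()
          try:
              with os.fdopen(fd, "wb") as f:
                  f.write(b"payload")

              tar = tarfile.open(fileobj=buf, mode="w")
              info = tar.gettarinfo(path, arcname="data.bin")
              # Replace the metadata taken from the file system.
              info.uid = info.gid = 0
              info.uname = info.gname = "root"
              info.mtime = 0
              with open(path, "rb") as f:
                  tar.addfile(info, f)
              tar.close()
          finally:
              os.remove(path)

          buf.seek(0)
          tar = tarfile.open(fileobj=buf, mode="r")
          stored = tar.getmember("data.bin")
          result = (stored.size, stored.uid, stored.uname, stored.mtime)
          tar.close()
          return result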
424
425
426.. method:: TarFile.close()
427
428   Close the :class:`TarFile`. In write mode, two finishing zero blocks are
429   appended to the archive.
430
431
432.. attribute:: TarFile.posix
433
434   Setting this to :const:`True` is equivalent to setting the :attr:`format`
435   attribute to :const:`USTAR_FORMAT`, :const:`False` is equivalent to
436   :const:`GNU_FORMAT`.
437
438   .. versionchanged:: 2.4
439      *posix* defaults to :const:`False`.
440
441   .. deprecated:: 2.6
442      Use the :attr:`format` attribute instead.
443
444
445.. attribute:: TarFile.pax_headers
446
447   A dictionary containing key-value pairs of pax global headers.
448
449   .. versionadded:: 2.6
450
451
452.. _tarinfo-objects:
453
454TarInfo Objects
455---------------
456
457A :class:`TarInfo` object represents one member in a :class:`TarFile`. Aside
458from storing all required attributes of a file (like file type, size, time,
459permissions, owner etc.), it provides some useful methods to determine its type.
460It does *not* contain the file's data itself.
461
462:class:`TarInfo` objects are returned by :class:`TarFile`'s methods
463:meth:`getmember`, :meth:`getmembers` and :meth:`gettarinfo`.
464
465
466.. class:: TarInfo(name="")
467
468   Create a :class:`TarInfo` object.
469
470
471.. method:: TarInfo.frombuf(buf)
472
473   Create and return a :class:`TarInfo` object from string buffer *buf*.
474
475   .. versionadded:: 2.6
      Raises :exc:`HeaderError` if the buffer is invalid.
477
478
479.. method:: TarInfo.fromtarfile(tarfile)
480
481   Read the next member from the :class:`TarFile` object *tarfile* and return it as
482   a :class:`TarInfo` object.
483
484   .. versionadded:: 2.6
485
486
487.. method:: TarInfo.tobuf(format=DEFAULT_FORMAT, encoding=ENCODING, errors='strict')
488
489   Create a string buffer from a :class:`TarInfo` object. For information on the
490   arguments see the constructor of the :class:`TarFile` class.
491
492   .. versionchanged:: 2.6
493      The arguments were added.
494
495A ``TarInfo`` object has the following public data attributes:
496
497
498.. attribute:: TarInfo.name
499
500   Name of the archive member.
501
502
503.. attribute:: TarInfo.size
504
505   Size in bytes.
506
507
508.. attribute:: TarInfo.mtime
509
510   Time of last modification.
511
512
513.. attribute:: TarInfo.mode
514
515   Permission bits.
516
517
518.. attribute:: TarInfo.type
519
520   File type.  *type* is usually one of these constants: :const:`REGTYPE`,
521   :const:`AREGTYPE`, :const:`LNKTYPE`, :const:`SYMTYPE`, :const:`DIRTYPE`,
522   :const:`FIFOTYPE`, :const:`CONTTYPE`, :const:`CHRTYPE`, :const:`BLKTYPE`,
523   :const:`GNUTYPE_SPARSE`.  To determine the type of a :class:`TarInfo` object
524   more conveniently, use the ``is_*()`` methods below.
525
526
527.. attribute:: TarInfo.linkname
528
   Name of the target file, which is only present in :class:`TarInfo` objects
   of type :const:`LNKTYPE` and :const:`SYMTYPE`.
531
532
533.. attribute:: TarInfo.uid
534
535   User ID of the user who originally stored this member.
536
537
538.. attribute:: TarInfo.gid
539
540   Group ID of the user who originally stored this member.
541
542
543.. attribute:: TarInfo.uname
544
545   User name.
546
547
548.. attribute:: TarInfo.gname
549
550   Group name.
551
552
553.. attribute:: TarInfo.pax_headers
554
555   A dictionary containing key-value pairs of an associated pax extended header.
556
557   .. versionadded:: 2.6
558
559A :class:`TarInfo` object also provides some convenient query methods:
560
561
562.. method:: TarInfo.isfile()
563
   Return :const:`True` if the :class:`TarInfo` object is a regular file.
565
566
567.. method:: TarInfo.isreg()
568
569   Same as :meth:`isfile`.
570
571
572.. method:: TarInfo.isdir()
573
574   Return :const:`True` if it is a directory.
575
576
577.. method:: TarInfo.issym()
578
579   Return :const:`True` if it is a symbolic link.
580
581
582.. method:: TarInfo.islnk()
583
584   Return :const:`True` if it is a hard link.
585
586
587.. method:: TarInfo.ischr()
588
589   Return :const:`True` if it is a character device.
590
591
592.. method:: TarInfo.isblk()
593
594   Return :const:`True` if it is a block device.
595
596
597.. method:: TarInfo.isfifo()
598
599   Return :const:`True` if it is a FIFO.
600
601
602.. method:: TarInfo.isdev()
603
604   Return :const:`True` if it is one of character device, block device or FIFO.
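
The query methods can be combined into a small classifier. The following is
only a sketch: the function names ``kind`` and ``build_and_classify`` and the
member names are hypothetical, and the archive is built in memory from
hand-made :class:`TarInfo` objects::

   import io
   import tarfile

   def kind(info):
       """Map a TarInfo object to a short description of its type."""
       if info.isreg():
           return "file"
       if info.isdir():
           return "directory"
       if info.issym():
           return "symlink"
       if info.islnk():
           return "hardlink"
       if info.isdev():
           return "device"
       return "other"

   def build_and_classify():
       buf = io.BytesIO()
       tar = tarfile.open(fileobj=buf, mode="w")

       directory = tarfile.TarInfo("docs")
       directory.type = tarfile.DIRTYPE
       tar.addfile(directory)

       regular = tarfile.TarInfo("docs/readme.txt")
       regular.size = 5
       tar.addfile(regular, io.BytesIO(b"hello"))

       link = tarfile.TarInfo("docs/latest")
       link.type = tarfile.SYMTYPE
       link.linkname = "readme.txt"
       tar.addfile(link)
       tar.close()

       buf.seek(0)
       tar = tarfile.open(fileobj=buf, mode="r")
       result = [(info.name, kind(info)) for info in tar]
       tar.close()
       return result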
605
606
607.. _tar-examples:
608
609Examples
610--------
611
How to extract an entire tar archive to the current working directory::

   import tarfile
   tar = tarfile.open("sample.tar.gz")
   tar.extractall()
   tar.close()

How to extract a subset of a tar archive with :meth:`TarFile.extractall` using
a generator function instead of a list::

   import os
   import tarfile

   def py_files(members):
       for tarinfo in members:
           if os.path.splitext(tarinfo.name)[1] == ".py":
               yield tarinfo

   tar = tarfile.open("sample.tar.gz")
   tar.extractall(members=py_files(tar))
   tar.close()

How to create an uncompressed tar archive from a list of filenames::

   import tarfile
   tar = tarfile.open("sample.tar", "w")
   for name in ["foo", "bar", "quux"]:
       tar.add(name)
   tar.close()

How to read a gzip compressed tar archive and display some member information::

   import tarfile
   tar = tarfile.open("sample.tar.gz", "r:gz")
   for tarinfo in tar:
       print tarinfo.name, "is", tarinfo.size, "bytes in size and is",
       if tarinfo.isreg():
           print "a regular file."
       elif tarinfo.isdir():
           print "a directory."
       else:
           print "something else."
   tar.close()
655
656
657.. _tar-formats:
658
659Supported tar formats
660---------------------
661
662There are three tar formats that can be created with the :mod:`tarfile` module:
663
664* The POSIX.1-1988 ustar format (:const:`USTAR_FORMAT`). It supports filenames
665  up to a length of at best 256 characters and linknames up to 100 characters. The
666  maximum file size is 8 gigabytes. This is an old and limited but widely
667  supported format.
668
669* The GNU tar format (:const:`GNU_FORMAT`). It supports long filenames and
670  linknames, files bigger than 8 gigabytes and sparse files. It is the de facto
671  standard on GNU/Linux systems. :mod:`tarfile` fully supports the GNU tar
672  extensions for long names, sparse file support is read-only.
673
674* The POSIX.1-2001 pax format (:const:`PAX_FORMAT`). It is the most flexible
675  format with virtually no limits. It supports long filenames and linknames, large
676  files and stores pathnames in a portable way. However, not all tar
677  implementations today are able to handle pax archives properly.
678
679  The *pax* format is an extension to the existing *ustar* format. It uses extra
680  headers for information that cannot be stored otherwise. There are two flavours
681  of pax headers: Extended headers only affect the subsequent file header, global
682  headers are valid for the complete archive and affect all following files. All
683  the data in a pax header is encoded in *UTF-8* for portability reasons.
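
The difference between the formats listed above can be sketched with a member
name that is too long for ustar. The constant ``LONG_NAME`` and the helper
names are assumptions made for this example, and the ``comment`` key in the
global header is an arbitrary choice. In the Python version documented here,
*pax_headers* is expected to contain unicode strings::

   import io
   import tarfile

   LONG_NAME = "d/" * 150 + "f.txt"   # 305 characters

   def write_member(fmt, **kwargs):
       """Write one member called LONG_NAME in the given archive format."""
       buf = io.BytesIO()
       tar = tarfile.open(fileobj=buf, mode="w", format=fmt, **kwargs)
       info = tarfile.TarInfo(LONG_NAME)
       info.size = 4
       tar.addfile(info, io.BytesIO(b"data"))
       tar.close()
       buf.seek(0)
       return buf

   def pax_roundtrip():
       """The pax format keeps the long name and the global header."""
       buf = write_member(tarfile.PAX_FORMAT, pax_headers={"comment": "demo"})
       tar = tarfile.open(fileobj=buf, mode="r")
       names = tar.getnames()
       headers = dict(tar.pax_headers)
       tar.close()
       return names, headers

   def ustar_rejects_long_name():
       """The ustar format cannot store the name and raises ValueError."""
       try:
           write_member(tarfile.USTAR_FORMAT)
       except ValueError:
           return True
       return False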
684
685There are some more variants of the tar format which can be read, but not
686created:
687
688* The ancient V7 format. This is the first tar format from Unix Seventh Edition,
689  storing only regular files and directories. Names must not be longer than 100
690  characters, there is no user/group name information. Some archives have
691  miscalculated header checksums in case of fields with non-ASCII characters.
692
693* The SunOS tar extended format. This format is a variant of the POSIX.1-2001
694  pax format, but is not compatible.
695
696.. _tar-unicode:
697
698Unicode issues
699--------------
700
701The tar format was originally conceived to make backups on tape drives with the
702main focus on preserving file system information. Nowadays tar archives are
703commonly used for file distribution and exchanging archives over networks. One
704problem of the original format (that all other formats are merely variants of)
705is that there is no concept of supporting different character encodings. For
706example, an ordinary tar archive created on a *UTF-8* system cannot be read
707correctly on a *Latin-1* system if it contains non-ASCII characters. Names (i.e.
708filenames, linknames, user/group names) containing these characters will appear
709damaged.  Unfortunately, there is no way to autodetect the encoding of an
710archive.
711
712The pax format was designed to solve this problem. It stores non-ASCII names
713using the universal character encoding *UTF-8*. When a pax archive is read,
714these *UTF-8* names are converted to the encoding of the local file system.
715
716The details of unicode conversion are controlled by the *encoding* and *errors*
717keyword arguments of the :class:`TarFile` class.
718
719The default value for *encoding* is the local character encoding. It is deduced
720from :func:`sys.getfilesystemencoding` and :func:`sys.getdefaultencoding`. In
721read mode, *encoding* is used exclusively to convert unicode names from a pax
722archive to strings in the local character encoding. In write mode, the use of
723*encoding* depends on the chosen archive format. In case of :const:`PAX_FORMAT`,
724input names that contain non-ASCII characters need to be decoded before being
725stored as *UTF-8* strings. The other formats do not make use of *encoding*
726unless unicode objects are used as input names. These are converted to 8-bit
727character strings before they are added to the archive.
728
729The *errors* argument defines how characters are treated that cannot be
730converted to or from *encoding*. Possible values are listed in section
731:ref:`codec-base-classes`. In read mode, there is an additional scheme
732``'utf-8'`` which means that bad characters are replaced by their *UTF-8*
733representation. This is the default scheme. In write mode the default value for
734*errors* is ``'strict'`` to ensure that name information is not altered
735unnoticed.
736