:mod:`multifile` --- Support for files containing distinct parts
================================================================

.. module:: multifile
   :synopsis: Support for reading files which contain distinct parts, such as some MIME data.
   :deprecated:
.. sectionauthor:: Eric S. Raymond <esr@snark.thyrsus.com>

.. deprecated:: 2.5
   The :mod:`email` package should be used in preference to the :mod:`multifile`
   module. This module is present only to maintain backward compatibility.

The :class:`MultiFile` object enables you to treat sections of a text file as
file-like input objects, with ``''`` being returned by :meth:`readline` when a
given delimiter pattern is encountered. The defaults of this class are designed
to make it useful for parsing MIME multipart messages, but by subclassing it and
overriding methods it can be easily adapted for more general use.

.. class:: MultiFile(fp[, seekable])

   Create a multi-file. You must instantiate this class with an input object
   argument for the :class:`MultiFile` instance to get lines from, such as a file
   object returned by :func:`open`.

   :class:`MultiFile` only ever looks at the input object's :meth:`readline`,
   :meth:`seek` and :meth:`tell` methods, and the latter two are only needed if you
   want random access to the individual MIME parts. To use :class:`MultiFile` on a
   non-seekable stream object, set the optional *seekable* argument to false; this
   will prevent using the input object's :meth:`seek` and :meth:`tell` methods.

It will be useful to know that in :class:`MultiFile`'s view of the world, text
is composed of three kinds of lines: data, section-dividers, and end-markers.
:class:`MultiFile` is designed to support parsing of messages that may have
multiple nested message parts, each with its own pattern for section-divider and
end-marker lines.

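
The three kinds of lines can be illustrated with a small standalone sketch. It
assumes the default MIME-style decorations (``'--'`` before a section-divider,
``'--'`` before and after an end-marker) and a single boundary string.
``classify_line`` is a hypothetical helper written for this illustration; it is
not part of the module.

```python
def classify_line(line, boundary):
    """Classify one input line relative to a single boundary string.

    Hypothetical helper for illustration only; not part of multifile.
    """
    # Trailing whitespace (including LF or CR-LF) is ignored when comparing.
    stripped = line.rstrip()
    if stripped == '--' + boundary + '--':
        return 'end-marker'
    if stripped == '--' + boundary:
        return 'section-divider'
    return 'data'


lines = ['--XYZ\n', 'first part\n', '--XYZ\r\n', 'second part\n', '--XYZ--\n']
kinds = [classify_line(line, 'XYZ') for line in lines]
```

Here ``kinds`` holds one label per input line, so a reader can see where each
part starts and where the message ends.
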
.. seealso::

   Module :mod:`email`
      Comprehensive email handling package; supersedes the :mod:`multifile` module.

.. _multifile-objects:

MultiFile Objects
-----------------

A :class:`MultiFile` instance has the following methods:

.. method:: MultiFile.readline()

   Read a line. If the line is data (not a section-divider, an end-marker, or
   real EOF), return it. If the line matches the most-recently-stacked boundary,
   return ``''`` and set ``self.last`` to 1 if the match is an end-marker, or to
   0 if it is a section-divider. If the line matches any other stacked boundary,
   raise an error. On encountering end-of-file on the underlying stream object,
   the method raises :exc:`Error` unless all boundaries have been popped.

.. method:: MultiFile.readlines()

   Return all lines remaining in this part as a list of strings.

.. method:: MultiFile.read()

   Read all lines, up to the next section. Return them as a single (multiline)
   string. Note that this method does not take a size argument.

.. method:: MultiFile.seek(pos[, whence])

   Seek. Seek indices are relative to the start of the current section. The *pos*
   and *whence* arguments are interpreted as for a file seek.

.. method:: MultiFile.tell()

   Return the file position relative to the start of the current section.

.. method:: MultiFile.next()

   Skip lines to the next section (that is, read lines until a section-divider or
   end-marker has been consumed). Return true if there is such a section, false if
   an end-marker is seen. Re-enable the most-recently-pushed boundary.

.. method:: MultiFile.is_data(str)

   Return true if *str* is data and false if it might be a section boundary. As
   written, it tests for a prefix other than ``'--'`` at the start of the line
   (which all MIME boundaries have), but it is declared so it can be overridden
   in derived classes.

   Note that this test is intended as a fast guard for the real boundary tests;
   if it always returns false it will merely slow processing, not cause it to
   fail.

.. method:: MultiFile.push(str)

   Push a boundary string. When a decorated version of this boundary is found as
   an input line, it will be interpreted as a section-divider or end-marker
   (depending on the decoration, see :rfc:`2045`). All subsequent reads will
   return the empty string to indicate end-of-file, until a call to :meth:`pop`
   removes the boundary or a :meth:`next` call re-enables it.

   It is possible to push more than one boundary. Encountering the
   most-recently-pushed boundary will return EOF; encountering any other
   boundary will raise an error.

.. method:: MultiFile.pop()

   Pop a section boundary. This boundary will no longer be interpreted as EOF.

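
The interaction between :meth:`push`, :meth:`pop` and :meth:`readline` can be
sketched with a much-simplified stand-in. ``SketchReader`` and
``BoundaryError`` below are hypothetical names invented for this illustration;
the code is not the module's implementation, omits :meth:`next`, seeking and
the ``last`` flag, and is written to run on Python 3, where :mod:`multifile`
itself is no longer available.

```python
import io


class BoundaryError(Exception):
    """Raised by the sketch when input does not fit the pushed boundaries."""


class SketchReader(object):
    """Simplified stand-in for the boundary stack; not the multifile code."""

    def __init__(self, fp):
        self.fp = fp
        self.stack = []       # the most recently pushed boundary is last
        self.at_eof = False   # True once the innermost boundary was seen

    def push(self, boundary):
        self.stack.append(boundary)
        self.at_eof = False

    def pop(self):
        self.stack.pop()
        self.at_eof = False   # simplification: reading resumes after a pop

    def readline(self):
        if self.at_eof:
            return ''         # keep reporting end-of-file until pop()
        line = self.fp.readline()
        if not line:
            if self.stack:
                raise BoundaryError('end of input with boundaries pushed')
            return ''
        marker = line.rstrip()
        for depth, boundary in enumerate(reversed(self.stack)):
            if marker in ('--' + boundary, '--' + boundary + '--'):
                if depth:     # matched an outer boundary, not the innermost
                    raise BoundaryError('boundary of an enclosing part')
                self.at_eof = True
                return ''
        return line


text = 'preamble\n--outer\nline A\n--outer--\nepilogue\n'
reader = SketchReader(io.StringIO(text))
reader.push('outer')
first = reader.readline()      # ordinary data line
second = reader.readline()     # the divider is reported as end-of-file
third = reader.readline()      # still end-of-file until the boundary is popped
reader.pop()
after_pop = reader.readline()  # reading continues past the divider
```

After the ``pop()`` call no boundary is active, so the remaining
``--outer--`` line would be returned as plain data by this sketch.
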
.. method:: MultiFile.section_divider(str)

   Turn a boundary into a section-divider line. By default, this method
   prepends ``'--'`` (which MIME section boundaries have) but it is declared so
   it can be overridden in derived classes. This method need not append LF or
   CR-LF, as comparison with the result ignores trailing whitespace.

.. method:: MultiFile.end_marker(str)

   Turn a boundary string into an end-marker line. By default, this method
   prepends ``'--'`` and appends ``'--'`` (like a MIME-multipart end-of-message
   marker) but it is declared so it can be overridden in derived classes. This
   method need not append LF or CR-LF, as comparison with the result ignores
   trailing whitespace.

Finally, :class:`MultiFile` instances have two public instance variables:

.. attribute:: MultiFile.level

   Nesting depth of the current part.

.. attribute:: MultiFile.last

   True if the last end-of-file was for an end-of-message marker.

.. _multifile-example:

:class:`MultiFile` Example
--------------------------

.. sectionauthor:: Skip Montanaro <skip@pobox.com>

::

   import mimetools
   import multifile
   import StringIO

   def extract_mime_part_matching(stream, mimetype):
       """Return the first element in a multipart MIME message on stream
       matching mimetype."""

       # Parse the top-level headers to find the message type and boundary.
       msg = mimetools.Message(stream)
       msgtype = msg.gettype()

       data = StringIO.StringIO()
       if msgtype[:10] == "multipart/":

           mfile = multifile.MultiFile(stream)
           mfile.push(msg.getparam("boundary"))
           # Each call to next() advances to the following part, if any.
           while mfile.next():
               submsg = mimetools.Message(mfile)
               try:
                   data = StringIO.StringIO()
                   mimetools.decode(mfile, data, submsg.getencoding())
               except ValueError:
                   # Unknown transfer encoding: skip this part.
                   continue
               if submsg.gettype() == mimetype:
                   break
           mfile.pop()
       # Note: if no part matches, this is the last part that decoded cleanly.
       return data.getvalue()

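
Because the deprecation note at the top of this document recommends the
:mod:`email` package, a rough equivalent using that package may be a useful
comparison. This is a minimal sketch rather than a drop-in replacement: the
function name ``extract_part_with_email`` and the sample message are made up
for illustration, the input is a string instead of a stream, and ``None`` is
returned when no part matches.

```python
import email


def extract_part_with_email(text, mimetype):
    """Return the decoded payload of the first part of type mimetype.

    Hypothetical helper for illustration; returns None if nothing matches.
    """
    msg = email.message_from_string(text)
    # walk() visits the message itself and every nested part, depth first.
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        if part.get_content_type() == mimetype:
            # decode=True undoes base64 or quoted-printable transfer encoding.
            return part.get_payload(decode=True)
    return None


sample = (
    'MIME-Version: 1.0\n'
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    '\n'
    '--XYZ\n'
    'Content-Type: text/plain\n'
    '\n'
    'hello\n'
    '--XYZ\n'
    'Content-Type: text/html\n'
    '\n'
    '<p>hello</p>\n'
    '--XYZ--\n'
)
html = extract_part_with_email(sample, 'text/html')
missing = extract_part_with_email(sample, 'image/png')
```

Unlike the :mod:`multifile` version, boundary handling and nesting are left
entirely to the parser, so no explicit ``push`` or ``pop`` calls are needed.
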