:mod:`xml.sax.handler` --- Base classes for SAX handlers
========================================================

.. module:: xml.sax.handler
   :synopsis: Base classes for SAX event handlers.
.. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>


.. versionadded:: 2.0

The SAX API defines four kinds of handlers: content handlers, DTD handlers,
error handlers, and entity resolvers. Applications normally only need to
implement those interfaces whose events they are interested in; they can
implement the interfaces in a single object or in multiple objects. Handler
implementations should inherit from the base classes provided in the module
:mod:`xml.sax.handler`, so that all methods get default implementations.


.. class:: ContentHandler

   This is the main callback interface in SAX, and the one most important to
   applications. The order of events in this interface mirrors the order of the
   information in the document.


.. class:: DTDHandler

   Handle DTD events.

   This interface specifies only those DTD events required for basic parsing
   (unparsed entities and attributes).


.. class:: EntityResolver

   Basic interface for resolving entities. If you create an object implementing
   this interface and then register the object with your parser, the parser will
   call the method in your object to resolve all external entities.


.. class:: ErrorHandler

   Interface used by the parser to present error and warning messages to the
   application.  The methods of this object control whether errors are immediately
   converted to exceptions or are handled in some other way.

In addition to these classes, :mod:`xml.sax.handler` provides symbolic constants
for the feature and property names.


.. data:: feature_namespaces

   | value: ``"http://xml.org/sax/features/namespaces"``
   | true: Perform Namespace processing.
   | false: Optionally do not perform Namespace processing (implies
     namespace-prefixes; default).
   | access: (parsing) read-only; (not parsing) read/write


.. data:: feature_namespace_prefixes

   | value: ``"http://xml.org/sax/features/namespace-prefixes"``
   | true: Report the original prefixed names and attributes used for Namespace
     declarations.
   | false: Do not report attributes used for Namespace declarations, and
     optionally do not report original prefixed names (default).
   | access: (parsing) read-only; (not parsing) read/write


.. data:: feature_string_interning

   | value: ``"http://xml.org/sax/features/string-interning"``
   | true: All element names, prefixes, attribute names, Namespace URIs, and
     local names are interned using the built-in intern function.
   | false: Names are not necessarily interned, although they may be (default).
   | access: (parsing) read-only; (not parsing) read/write


.. data:: feature_validation

   | value: ``"http://xml.org/sax/features/validation"``
   | true: Report all validation errors (implies external-general-entities and
     external-parameter-entities).
   | false: Do not report validation errors.
   | access: (parsing) read-only; (not parsing) read/write


.. data:: feature_external_ges

   | value: ``"http://xml.org/sax/features/external-general-entities"``
   | true: Include all external general (text) entities.
   | false: Do not include external general entities.
   | access: (parsing) read-only; (not parsing) read/write


.. data:: feature_external_pes

   | value: ``"http://xml.org/sax/features/external-parameter-entities"``
   | true: Include all external parameter entities, including the external DTD
     subset.
   | false: Do not include any external parameter entities, even the external
     DTD subset.
   | access: (parsing) read-only; (not parsing) read/write


.. data:: all_features

   List of all features.

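The feature constants are plain strings that are passed to a reader's
``setFeature()`` and ``getFeature()`` methods. The following is a minimal
sketch, assuming the default reader returned by ``xml.sax.make_parser()``
supports changing ``feature_namespaces``; the helper name
``enable_namespaces`` is made up for this illustration.

```python
import xml.sax
from xml.sax.handler import all_features, feature_namespaces


def enable_namespaces(parser):
    """Turn on Namespace processing and report the resulting state.

    Hypothetical helper: features can only be changed while the
    reader is not parsing.
    """
    parser.setFeature(feature_namespaces, True)
    return bool(parser.getFeature(feature_namespaces))


parser = xml.sax.make_parser()
namespaces_on = enable_namespaces(parser)
```

A reader that does not recognize or support a feature is expected to raise
an exception from ``setFeature()``, so real code may want to catch
``xml.sax.SAXNotRecognizedException`` and ``xml.sax.SAXNotSupportedException``.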

.. data:: property_lexical_handler

   | value: ``"http://xml.org/sax/properties/lexical-handler"``
   | data type: xml.sax.sax2lib.LexicalHandler (not supported in Python 2)
   | description: An optional extension handler for lexical events like
     comments.
   | access: read/write


.. data:: property_declaration_handler

   | value: ``"http://xml.org/sax/properties/declaration-handler"``
   | data type: xml.sax.sax2lib.DeclHandler (not supported in Python 2)
   | description: An optional extension handler for DTD-related events other
     than notations and unparsed entities.
   | access: read/write


.. data:: property_dom_node

   | value: ``"http://xml.org/sax/properties/dom-node"``
   | data type: org.w3c.dom.Node (not supported in Python 2)
   | description: When parsing, the current DOM node being visited if this is
     a DOM iterator; when not parsing, the root DOM node for iteration.
   | access: (parsing) read-only; (not parsing) read/write


.. data:: property_xml_string

   | value: ``"http://xml.org/sax/properties/xml-string"``
   | data type: String
   | description: The literal string of characters that was the source for the
     current event.
   | access: read-only


.. data:: all_properties

   List of all known property names.


.. _content-handler-objects:

ContentHandler Objects
----------------------

Users are expected to subclass :class:`ContentHandler` to support their
application.  The following methods are called by the parser on the appropriate
events in the input document:


.. method:: ContentHandler.setDocumentLocator(locator)

   Called by the parser to give the application a locator for locating the origin
   of document events.

   SAX parsers are strongly encouraged (though not absolutely required) to supply a
   locator: if a parser does so, it must supply the locator to the application by
   invoking this method before invoking any of the other methods in the
   :class:`ContentHandler` interface.

   The locator allows the application to determine the end position of any
   document-related event, even if the parser is not reporting an error. Typically,
   the application will use this information for reporting its own errors (such as
   character content that does not match an application's business rules). The
   information returned by the locator is probably not sufficient for use with a
   search engine.

   Note that the locator will return correct information only during the invocation
   of the events in this interface. The application should not attempt to use it at
   any other time.

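As a rough illustration of the locator protocol described above, the sketch
below stores the locator and queries it only from inside an event callback.
The class name ``LineTracker`` and the sample document are assumptions made
for this example; the exact position reported for an event depends on the
parser.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class LineTracker(ContentHandler):
    """Record the line number the locator reports for each start tag."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.locator = None      # stays None if the parser supplies no locator
        self.positions = []      # list of (element name, line number)

    def setDocumentLocator(self, locator):
        # Called by the parser before any other ContentHandler event.
        self.locator = locator

    def startElement(self, name, attrs):
        # The locator is only meaningful while an event is being delivered.
        line = None
        if self.locator is not None:
            line = self.locator.getLineNumber()
        self.positions.append((name, line))


tracker = LineTracker()
xml.sax.parseString(b"<root>\n  <item/>\n  <item/>\n</root>", tracker)
```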

.. method:: ContentHandler.startDocument()

   Receive notification of the beginning of a document.

   The SAX parser will invoke this method only once, before any other methods in
   this interface or in DTDHandler (except for :meth:`setDocumentLocator`).


.. method:: ContentHandler.endDocument()

   Receive notification of the end of a document.

   The SAX parser will invoke this method only once, and it will be the last method
   invoked during the parse. The parser shall not invoke this method until it has
   either abandoned parsing (because of an unrecoverable error) or reached the end
   of input.


.. method:: ContentHandler.startPrefixMapping(prefix, uri)

   Begin the scope of a prefix-URI Namespace mapping.

   The information from this event is not necessary for normal Namespace
   processing: the SAX XML reader will automatically replace prefixes for element
   and attribute names when the ``feature_namespaces`` feature is enabled (the
   default).

   There are cases, however, when applications need to use prefixes in character
   data or in attribute values, where they cannot safely be expanded automatically;
   the :meth:`startPrefixMapping` and :meth:`endPrefixMapping` events supply the
   information to the application to expand prefixes in those contexts itself, if
   necessary.

   .. XXX This is not really the default, is it? MvL

   Note that :meth:`startPrefixMapping` and :meth:`endPrefixMapping` events are not
   guaranteed to be properly nested relative to each other: all
   :meth:`startPrefixMapping` events will occur before the corresponding
   :meth:`startElement` event, and all :meth:`endPrefixMapping` events will occur
   after the corresponding :meth:`endElement` event, but their order is not
   guaranteed.


.. method:: ContentHandler.endPrefixMapping(prefix)

   End the scope of a prefix-URI mapping.

   See :meth:`startPrefixMapping` for details. This event will always occur after
   the corresponding :meth:`endElement` event, but the order of
   :meth:`endPrefixMapping` events is not otherwise guaranteed.

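The prefix-mapping events are mainly useful for expanding prefixes that
appear in attribute values or character data. The sketch below is one
possible way to do that, assuming a hypothetical ``type`` attribute whose
value is a prefixed name; ``PrefixTracker`` is an illustrative name, and the
example enables ``feature_namespaces`` explicitly rather than relying on a
default.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class PrefixTracker(ContentHandler):
    """Expand prefixed names found in a (hypothetical) ``type`` attribute."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.active = {}     # prefix -> stack of URIs currently in scope
        self.resolved = []   # (uri, localname) pairs built from attribute values

    def startPrefixMapping(self, prefix, uri):
        self.active.setdefault(prefix, []).append(uri)

    def endPrefixMapping(self, prefix):
        self.active[prefix].pop()

    def startElementNS(self, name, qname, attrs):
        value = attrs.get((None, "type"))
        if value and ":" in value:
            prefix, local = value.split(":", 1)
            scopes = self.active.get(prefix)
            if scopes:
                # The innermost mapping for the prefix wins.
                self.resolved.append((scopes[-1], local))


parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)
tracker = PrefixTracker()
parser.setContentHandler(tracker)
parser.parse(io.BytesIO(
    b'<root xmlns:x="urn:example:x"><item type="x:thing"/></root>'))
```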

.. method:: ContentHandler.startElement(name, attrs)

   Signals the start of an element in non-namespace mode.

   The *name* parameter contains the raw XML 1.0 name of the element type as a
   string and the *attrs* parameter holds an object of the :class:`Attributes`
   interface (see :ref:`attributes-objects`) containing the attributes of
   the element.  The object passed as *attrs* may be re-used by the parser; holding
   on to a reference to it is not a reliable way to keep a copy of the attributes.
   To keep a copy of the attributes, use the :meth:`copy` method of the *attrs*
   object.


.. method:: ContentHandler.endElement(name)

   Signals the end of an element in non-namespace mode.

   The *name* parameter contains the name of the element type, just as with the
   :meth:`startElement` event.

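A minimal sketch of a handler for non-namespace mode follows. It counts
element names and tracks nesting depth; the names ``ElementCounter`` and
``count_elements`` are invented for this example.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Count element names and remember the deepest nesting level seen."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.counts = {}
        self.depth = 0
        self.max_depth = 0

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def endElement(self, name):
        self.depth -= 1


def count_elements(xml_bytes):
    """Parse *xml_bytes* and return (counts by name, maximum depth)."""
    handler = ElementCounter()
    xml.sax.parseString(xml_bytes, handler)
    return handler.counts, handler.max_depth
```

Because the handler inherits from :class:`ContentHandler`, the events it does
not override fall back to the default implementations.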

.. method:: ContentHandler.startElementNS(name, qname, attrs)

   Signals the start of an element in namespace mode.

   The *name* parameter contains the name of the element type as a ``(uri,
   localname)`` tuple, the *qname* parameter contains the raw XML 1.0 name used in
   the source document, and the *attrs* parameter holds an instance of the
   :class:`AttributesNS` interface (see :ref:`attributes-ns-objects`)
   containing the attributes of the element.  If no namespace is associated with
   the element, the *uri* component of *name* will be ``None``.  The object passed
   as *attrs* may be re-used by the parser; holding on to a reference to it is not
   a reliable way to keep a copy of the attributes.  To keep a copy of the
   attributes, use the :meth:`copy` method of the *attrs* object.

   Parsers may set the *qname* parameter to ``None``, unless the
   ``feature_namespace_prefixes`` feature is activated.


.. method:: ContentHandler.endElementNS(name, qname)

   Signals the end of an element in namespace mode.

   The *name* parameter contains the name of the element type, just as with the
   :meth:`startElementNS` method; the same applies to the *qname* parameter.

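The namespace-mode events are only delivered when Namespace processing is
switched on. The following sketch enables ``feature_namespaces`` on a reader
created with ``xml.sax.make_parser()`` and records the ``(uri, localname)``
tuples it receives; ``NSCollector`` and ``collect_ns_names`` are illustrative
names, not part of the module.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NSCollector(ContentHandler):
    """Collect the (uri, localname) pair of every element."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElementNS(self, name, qname, attrs):
        uri, localname = name   # uri is None for elements without a namespace
        self.names.append((uri, localname))


def collect_ns_names(xml_bytes):
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = NSCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.names
```

Since *qname* may be ``None``, the sketch relies only on the *name* tuple.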

.. method:: ContentHandler.characters(content)

   Receive notification of character data.

   The Parser will call this method to report each chunk of character data. SAX
   parsers may return all contiguous character data in a single chunk, or they may
   split it into several chunks; however, all of the characters in any single event
   must come from the same external entity so that the Locator provides useful
   information.

   *content* may be a Unicode string or a byte string; the ``expat`` reader module
   always produces Unicode strings.

   .. note::

      The earlier SAX 1 interface provided by the Python XML Special Interest Group
      used a more Java-like interface for this method.  Since most parsers used from
      Python did not take advantage of the older interface, the simpler signature was
      chosen to replace it.  To convert old code to the new interface, use *content*
      instead of slicing content with the old *offset* and *length* parameters.

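Because a parser may split character data into several chunks, handlers
usually buffer the chunks and join them when the element ends. A minimal
sketch, assuming non-namespace mode and a hypothetical ``TextCollector``
class that gathers the text of one element type:

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Collect the full text content of every *target* element."""

    def __init__(self, target):
        ContentHandler.__init__(self)
        self.target = target
        self._chunks = None   # list of chunks while inside a target element
        self.texts = []

    def startElement(self, name, attrs):
        if name == self.target:
            self._chunks = []

    def characters(self, content):
        # May be called several times for one run of text.
        if self._chunks is not None:
            self._chunks.append(content)

    def endElement(self, name):
        if name == self.target and self._chunks is not None:
            self.texts.append("".join(self._chunks))
            self._chunks = None


collector = TextCollector("t")
xml.sax.parseString(b"<r><t>a &amp; b</t><t>c</t></r>", collector)
```

The sketch does not handle target elements nested inside each other; that
would need a stack of buffers instead of a single list.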

.. method:: ContentHandler.ignorableWhitespace(whitespace)

   Receive notification of ignorable whitespace in element content.

   Validating parsers must use this method to report each chunk of ignorable
   whitespace (see the W3C XML 1.0 recommendation, section 2.10); non-validating
   parsers may also use this method if they are capable of parsing and using
   content models.

   SAX parsers may return all contiguous whitespace in a single chunk, or they may
   split it into several chunks; however, all of the characters in any single event
   must come from the same external entity, so that the Locator provides useful
   information.


.. method:: ContentHandler.processingInstruction(target, data)

   Receive notification of a processing instruction.

   The Parser will invoke this method once for each processing instruction found.
   Note that processing instructions may occur before or after the main document
   element.

   A SAX parser should never report an XML declaration (XML 1.0, section 2.8) or a
   text declaration (XML 1.0, section 4.3.1) using this method.

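To illustrate the behaviour described for processing instructions, the sketch
below records every ``(target, data)`` pair. The class name ``PICollector``
and the sample document are assumptions made for this example.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PICollector(ContentHandler):
    """Record each processing instruction as a (target, data) tuple."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.instructions = []

    def processingInstruction(self, target, data):
        self.instructions.append((target, data))


pi_collector = PICollector()
xml.sax.parseString(
    b'<?xml version="1.0"?><?style href="a.css"?><r/><?after done?>',
    pi_collector,
)
```

The XML declaration at the start of the sample is not reported, while the
instruction that follows the document element is.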

.. method:: ContentHandler.skippedEntity(name)

   Receive notification of a skipped entity.

   The Parser will invoke this method once for each entity skipped. Non-validating
   processors may skip entities if they have not seen the declarations (because,
   for example, the entity was declared in an external DTD subset). All processors
   may skip external entities, depending on the values of the
   ``feature_external_ges`` and the ``feature_external_pes`` features.


.. _dtd-handler-objects:

DTDHandler Objects
------------------

:class:`DTDHandler` instances provide the following methods:


.. method:: DTDHandler.notationDecl(name, publicId, systemId)

   Handle a notation declaration event.


.. method:: DTDHandler.unparsedEntityDecl(name, publicId, systemId, ndata)

   Handle an unparsed entity declaration event.


.. _entity-resolver-objects:

EntityResolver Objects
----------------------


.. method:: EntityResolver.resolveEntity(publicId, systemId)

   Resolve the system identifier of an entity and return either the system
   identifier to read from as a string, or an InputSource to read from. The default
   implementation returns *systemId*.

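As one possible use of this hook, a resolver can return an
``InputSource`` of its own instead of the system identifier. The sketch
below substitutes empty content for every external entity and logs what was
requested; ``BlockingResolver`` is a hypothetical name, and whether the
parser consults the resolver at all depends on the external-entity features
being enabled.

```python
import io
from xml.sax.handler import EntityResolver
from xml.sax.xmlreader import InputSource


class BlockingResolver(EntityResolver):
    """Replace every external entity with empty content."""

    def __init__(self):
        self.requested = []   # (publicId, systemId) pairs seen so far

    def resolveEntity(self, publicId, systemId):
        self.requested.append((publicId, systemId))
        source = InputSource(systemId)
        # Hand the parser an empty stream so nothing is fetched.
        source.setByteStream(io.BytesIO(b""))
        return source


resolver = BlockingResolver()
source = resolver.resolveEntity(None, "http://example.invalid/entities.dtd")
```

Such an object would be registered with a reader through its
``setEntityResolver()`` method.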

.. _sax-error-handler:

ErrorHandler Objects
--------------------

Objects with this interface are used to receive error and warning information
from the :class:`XMLReader`.  If you create an object that implements this
interface and then register the object with your :class:`XMLReader`, the parser
will call the methods in your object to report all warnings and errors. There
are three levels of errors available: warnings, (possibly) recoverable errors,
and unrecoverable errors.  All methods take a :exc:`SAXParseException` as the
only parameter.  Errors and warnings may be converted to an exception by raising
the passed-in exception object.


.. method:: ErrorHandler.error(exception)

   Called when the parser encounters a recoverable error.  If this method does not
   raise an exception, parsing may continue, but further document information
   should not be expected by the application.  Allowing the parser to continue may
   allow additional errors to be discovered in the input document.


.. method:: ErrorHandler.fatalError(exception)

   Called when the parser encounters an error it cannot recover from; parsing is
   expected to terminate when this method returns.


.. method:: ErrorHandler.warning(exception)

   Called when the parser presents minor warning information to the application.
   Parsing is expected to continue when this method returns, and document
   information will continue to be passed to the application. Raising an exception
   in this method will cause parsing to end.

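The three levels can be handled differently in one object. The sketch below
collects warnings and recoverable errors but raises on fatal errors; the
names ``CollectingErrorHandler`` and ``check_well_formed`` are made up for
this illustration, and the policy shown is only one reasonable choice.

```python
import xml.sax
from xml.sax import SAXParseException
from xml.sax.handler import ContentHandler, ErrorHandler


class CollectingErrorHandler(ErrorHandler):
    """Keep warnings and recoverable errors, abort on fatal errors."""

    def __init__(self):
        self.warnings = []
        self.errors = []

    def warning(self, exception):
        self.warnings.append(exception)

    def error(self, exception):
        self.errors.append(exception)

    def fatalError(self, exception):
        # Raising the passed-in exception converts the report to an exception.
        raise exception


def check_well_formed(xml_bytes):
    """Return (True, None) or (False, message) for *xml_bytes*."""
    handler = CollectingErrorHandler()
    try:
        xml.sax.parseString(xml_bytes, ContentHandler(), handler)
    except SAXParseException as exc:
        return False, exc.getMessage()
    return True, None
```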