/Doc/library/stringprep.rst
http://unladen-swallow.googlecode.com/ · ReStructuredText · 143 lines · 77 code · 66 blank · 0 comment · 0 complexity · 783b7bc67a4c975503ebb07f7d804302 MD5 · raw file
- :mod:`stringprep` --- Internet String Preparation
- =================================================
- .. module:: stringprep
- :synopsis: String preparation, as per RFC 3453
- :deprecated:
- .. moduleauthor:: Martin v. Lรถwis <martin@v.loewis.de>
- .. sectionauthor:: Martin v. Lรถwis <martin@v.loewis.de>
- .. versionadded:: 2.3
- When identifying things (such as host names) in the internet, it is often
- necessary to compare such identifications for "equality". Exactly how this
- comparison is executed may depend on the application domain, e.g. whether it
- should be case-insensitive or not. It may be also necessary to restrict the
- possible identifications, to allow only identifications consisting of
- "printable" characters.
- :rfc:`3454` defines a procedure for "preparing" Unicode strings in internet
- protocols. Before passing strings onto the wire, they are processed with the
- preparation procedure, after which they have a certain normalized form. The RFC
- defines a set of tables, which can be combined into profiles. Each profile must
- define which tables it uses, and what other optional parts of the ``stringprep``
- procedure are part of the profile. One example of a ``stringprep`` profile is
- ``nameprep``, which is used for internationalized domain names.
- The module :mod:`stringprep` only exposes the tables from RFC 3454. As these
- tables would be very large to represent them as dictionaries or lists, the
- module uses the Unicode character database internally. The module source code
- itself was generated using the ``mkstringprep.py`` utility.
- As a result, these tables are exposed as functions, not as data structures.
- There are two kinds of tables in the RFC: sets and mappings. For a set,
- :mod:`stringprep` provides the "characteristic function", i.e. a function that
- returns true if the parameter is part of the set. For mappings, it provides the
- mapping function: given the key, it returns the associated value. Below is a
- list of all functions available in the module.
- .. function:: in_table_a1(code)
- Determine whether *code* is in tableA.1 (Unassigned code points in Unicode 3.2).
- .. function:: in_table_b1(code)
- Determine whether *code* is in tableB.1 (Commonly mapped to nothing).
- .. function:: map_table_b2(code)
- Return the mapped value for *code* according to tableB.2 (Mapping for
- case-folding used with NFKC).
- .. function:: map_table_b3(code)
- Return the mapped value for *code* according to tableB.3 (Mapping for
- case-folding used with no normalization).
- .. function:: in_table_c11(code)
- Determine whether *code* is in tableC.1.1 (ASCII space characters).
- .. function:: in_table_c12(code)
- Determine whether *code* is in tableC.1.2 (Non-ASCII space characters).
- .. function:: in_table_c11_c12(code)
- Determine whether *code* is in tableC.1 (Space characters, union of C.1.1 and
- C.1.2).
- .. function:: in_table_c21(code)
- Determine whether *code* is in tableC.2.1 (ASCII control characters).
- .. function:: in_table_c22(code)
- Determine whether *code* is in tableC.2.2 (Non-ASCII control characters).
- .. function:: in_table_c21_c22(code)
- Determine whether *code* is in tableC.2 (Control characters, union of C.2.1 and
- C.2.2).
- .. function:: in_table_c3(code)
- Determine whether *code* is in tableC.3 (Private use).
- .. function:: in_table_c4(code)
- Determine whether *code* is in tableC.4 (Non-character code points).
- .. function:: in_table_c5(code)
- Determine whether *code* is in tableC.5 (Surrogate codes).
- .. function:: in_table_c6(code)
- Determine whether *code* is in tableC.6 (Inappropriate for plain text).
- .. function:: in_table_c7(code)
- Determine whether *code* is in tableC.7 (Inappropriate for canonical
- representation).
- .. function:: in_table_c8(code)
- Determine whether *code* is in tableC.8 (Change display properties or are
- deprecated).
- .. function:: in_table_c9(code)
- Determine whether *code* is in tableC.9 (Tagging characters).
- .. function:: in_table_d1(code)
- Determine whether *code* is in tableD.1 (Characters with bidirectional property
- "R" or "AL").
- .. function:: in_table_d2(code)
- Determine whether *code* is in tableD.2 (Characters with bidirectional property
- "L").