:mod:`stringprep` --- Internet String Preparation
=================================================

.. module:: stringprep
   :synopsis: String preparation, as per RFC 3454
   :deprecated:
.. moduleauthor:: Martin v. Löwis <martin@v.loewis.de>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>


.. versionadded:: 2.3

When identifying things (such as host names) on the internet, it is often
necessary to compare such identifiers for "equality". Exactly how this
comparison is performed may depend on the application domain, e.g. whether it
should be case-insensitive or not. It may also be necessary to restrict the
possible identifiers, for example to allow only those consisting of
"printable" characters.

:rfc:`3454` defines a procedure for "preparing" Unicode strings in internet
protocols. Before strings are passed onto the wire, they are processed with the
preparation procedure, after which they have a certain normalized form. The RFC
defines a set of tables, which can be combined into profiles. Each profile must
define which tables it uses, and what other optional parts of the ``stringprep``
procedure are part of the profile. One example of a ``stringprep`` profile is
``nameprep``, which is used for internationalized domain names.

The module :mod:`stringprep` only exposes the tables from :rfc:`3454`. As these
tables would be very large if represented as dictionaries or lists, the
module uses the Unicode character database internally. The module source code
itself was generated using the ``mkstringprep.py`` utility.

As a result, these tables are exposed as functions, not as data structures.
There are two kinds of tables in the RFC: sets and mappings. For a set,
:mod:`stringprep` provides the "characteristic function", i.e. a function that
returns true if the parameter is part of the set. For mappings, it provides the
mapping function: given the key, it returns the associated value. Below is a
list of all functions available in the module.
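
As a rough illustration of the two kinds of functions, the sketch below queries
one set-style table and one mapping-style table. The sample characters are
chosen only for illustration, and the ``u""`` literals reflect the assumption
that each function is passed a single Unicode character.

```python
import stringprep

# Set-style table: the characteristic function answers a yes/no question.
nbsp_is_c12 = stringprep.in_table_c12(u"\u00a0")   # NO-BREAK SPACE
space_is_c12 = stringprep.in_table_c12(u" ")       # ASCII space belongs to C.1.1

# Mapping-style table: the function returns the mapped value for the key.
folded_a = stringprep.map_table_b2(u"A")
folded_sharp_s = stringprep.map_table_b2(u"\u00df")  # LATIN SMALL LETTER SHARP S
```

A mapped value is itself a string and need not be a single character: table B.2
maps U+00DF to the two-character sequence ``ss``.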


.. function:: in_table_a1(code)

   Determine whether *code* is in table A.1 (Unassigned code points in Unicode
   3.2).
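
One possible use of this table, sketched below, is to reject strings containing
code points that are unassigned in Unicode 3.2 before they are stored.
``has_unassigned`` is a hypothetical helper, not part of the module, and U+0378
is used as a sample on the assumption that it is unassigned in the Unicode 3.2
database.

```python
import stringprep

def has_unassigned(text):
    """Return True if any character of *text* is in table A.1.

    Hypothetical helper for illustration; not part of stringprep.
    """
    return any(stringprep.in_table_a1(ch) for ch in text)

assigned_only = has_unassigned(u"example")
with_gap = has_unassigned(u"ex\u0378ample")   # U+0378: sample unassigned code point
```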


.. function:: in_table_b1(code)

   Determine whether *code* is in table B.1 (Commonly mapped to nothing).


.. function:: map_table_b2(code)

   Return the mapped value for *code* according to table B.2 (Mapping for
   case-folding used with NFKC).


.. function:: map_table_b3(code)

   Return the mapped value for *code* according to table B.3 (Mapping for
   case-folding used with no normalization).
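
The mapping tables are applied character by character. The sketch below shows
one way to combine B.1 and B.2, the pairing that the ``nameprep`` profile
specifies for its mapping step. ``map_step`` is a hypothetical helper, not part
of the module, and a full profile would follow it with normalization and the
prohibition checks.

```python
import stringprep

def map_step(text):
    """Drop characters in table B.1 and case-fold the rest with table B.2.

    Hypothetical helper for illustration; not part of stringprep.
    """
    mapped = []
    for ch in text:
        if stringprep.in_table_b1(ch):
            continue  # "commonly mapped to nothing": remove the character
        mapped.append(stringprep.map_table_b2(ch))
    return u"".join(mapped)

result = map_step(u"Ex\u00adample")   # U+00AD SOFT HYPHEN is in table B.1
```

Joining the mapped values, rather than replacing characters in place, allows for
mappings that return more than one character.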


.. function:: in_table_c11(code)

   Determine whether *code* is in table C.1.1 (ASCII space characters).


.. function:: in_table_c12(code)

   Determine whether *code* is in table C.1.2 (Non-ASCII space characters).


.. function:: in_table_c11_c12(code)

   Determine whether *code* is in table C.1 (Space characters, union of C.1.1 and
   C.1.2).


.. function:: in_table_c21(code)

   Determine whether *code* is in table C.2.1 (ASCII control characters).


.. function:: in_table_c22(code)

   Determine whether *code* is in table C.2.2 (Non-ASCII control characters).


.. function:: in_table_c21_c22(code)

   Determine whether *code* is in table C.2 (Control characters, union of C.2.1
   and C.2.2).


.. function:: in_table_c3(code)

   Determine whether *code* is in table C.3 (Private use).


.. function:: in_table_c4(code)

   Determine whether *code* is in table C.4 (Non-character code points).


.. function:: in_table_c5(code)

   Determine whether *code* is in table C.5 (Surrogate codes).


.. function:: in_table_c6(code)

   Determine whether *code* is in table C.6 (Inappropriate for plain text).


.. function:: in_table_c7(code)

   Determine whether *code* is in table C.7 (Inappropriate for canonical
   representation).


.. function:: in_table_c8(code)

   Determine whether *code* is in table C.8 (Change display properties or are
   deprecated).


.. function:: in_table_c9(code)

   Determine whether *code* is in table C.9 (Tagging characters).
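
A profile may designate some of the C tables as prohibited output. The sketch
below checks a string against one such selection. The particular tables listed
and the helper name ``find_prohibited`` are assumptions made for illustration,
not something the module defines.

```python
import stringprep

# Tables treated as prohibited in this sketch (an illustrative selection).
PROHIBITED_TABLES = (
    stringprep.in_table_c12,
    stringprep.in_table_c22,
    stringprep.in_table_c3,
    stringprep.in_table_c4,
    stringprep.in_table_c5,
    stringprep.in_table_c6,
    stringprep.in_table_c7,
    stringprep.in_table_c8,
    stringprep.in_table_c9,
)

def find_prohibited(text):
    """Return the characters of *text* that appear in any selected C table."""
    return [ch for ch in text
            if any(in_table(ch) for in_table in PROHIBITED_TABLES)]

clean = find_prohibited(u"example")
flagged = find_prohibited(u"a\u00a0b")   # NO-BREAK SPACE is in table C.1.2
```

Keeping the selection in a tuple of functions makes it easy to swap in a
different set of tables for another profile.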


.. function:: in_table_d1(code)

   Determine whether *code* is in table D.1 (Characters with bidirectional
   property "R" or "AL").


.. function:: in_table_d2(code)

   Determine whether *code* is in table D.2 (Characters with bidirectional
   property "L").
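
Tables D.1 and D.2 support the bidirectional check described in :rfc:`3454`: if
a string contains any character from D.1, it must contain none from D.2, and
its first and last characters must both be from D.1. A minimal sketch of that
rule follows. ``bidi_ok`` is a hypothetical helper, not part of the module, and
the RFC's additional requirement of prohibiting table C.8 characters is not
checked here.

```python
import stringprep

def bidi_ok(text):
    """Check the D.1/D.2 bidirectional rule for *text*.

    Hypothetical helper for illustration; not part of stringprep.
    """
    rand_al = [stringprep.in_table_d1(ch) for ch in text]
    if not any(rand_al):
        return True  # no "R"/"AL" characters, so the rule does not apply
    if any(stringprep.in_table_d2(ch) for ch in text):
        return False  # mixes "R"/"AL" characters with "L" characters
    # The first and last characters must both be "R" or "AL".
    return rand_al[0] and rand_al[-1]

latin_only = bidi_ok(u"abc")
hebrew_only = bidi_ok(u"\u05d0\u05d1")   # HEBREW LETTER ALEF, HEBREW LETTER BET
mixed = bidi_ok(u"\u05d0a")              # Hebrew letter followed by a Latin letter
```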