:mod:`stringprep` --- Internet String Preparation
=================================================

.. module:: stringprep
   :synopsis: String preparation, as per RFC 3454
   :deprecated:

.. moduleauthor:: Martin v. Löwis <martin@v.loewis.de>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>

.. versionadded:: 2.3

When identifying things (such as host names) on the Internet, it is often
necessary to compare such identifications for "equality". Exactly how this
comparison is executed may depend on the application domain, e.g. whether it
should be case-insensitive or not. It may also be necessary to restrict the
possible identifications, to allow only identifications consisting of
"printable" characters.

:rfc:`3454` defines a procedure for "preparing" Unicode strings in internet
protocols. Before strings are passed onto the wire, they are processed with the
preparation procedure, after which they have a certain normalized form. The RFC
defines a set of tables, which can be combined into profiles. Each profile must
define which tables it uses, and what other optional parts of the ``stringprep``
procedure are part of the profile. One example of a ``stringprep`` profile is
``nameprep``, which is used for internationalized domain names.

The module :mod:`stringprep` only exposes the tables from RFC 3454. Because these
tables would be very large if represented as dictionaries or lists, the
module uses the Unicode character database internally. The module source code
itself was generated using the ``mkstringprep.py`` utility.

As a result, these tables are exposed as functions, not as data structures.
There are two kinds of tables in the RFC: sets and mappings. For a set,
:mod:`stringprep` provides the "characteristic function", i.e. a function that
returns true if the parameter is part of the set. For mappings, it provides the
mapping function: given the key, it returns the associated value. Below is a
list of all functions available in the module.

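As a brief illustration of the two kinds of functions, the following sketch
calls one characteristic function and one mapping function. It assumes
Python 3, where each function takes a one-character string; the sample
characters are chosen here for illustration only.

```python
import stringprep

# Characteristic function: true if the character belongs to the table.
# U+0221 was unassigned in Unicode 3.2, so it is in table A.1; "a" is not.
unassigned = stringprep.in_table_a1("\u0221")
assigned = stringprep.in_table_a1("a")
print(unassigned, assigned)          # True False

# Mapping function: returns the string that the character maps to.
folded_a = stringprep.map_table_b2("A")
folded_sharp_s = stringprep.map_table_b2("\u00df")   # LATIN SMALL LETTER SHARP S
print(folded_a, folded_sharp_s)      # a ss
```

Note that a mapping function may return a string longer than one character,
as the second mapping call shows.
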
.. function:: in_table_a1(code)

   Determine whether *code* is in table A.1 (Unassigned code points in Unicode 3.2).


.. function:: in_table_b1(code)

   Determine whether *code* is in table B.1 (Commonly mapped to nothing).


.. function:: map_table_b2(code)

   Return the mapped value for *code* according to table B.2 (Mapping for
   case-folding used with NFKC).


.. function:: map_table_b3(code)

   Return the mapped value for *code* according to table B.3 (Mapping for
   case-folding used with no normalization).

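The B tables are typically used in the mapping step of a profile. The
following is a minimal sketch of such a step, assuming Python 3 and a
profile that uses tables B.1 and B.2 followed by NFKC normalization. The
helper name ``map_step`` is hypothetical, and the sketch normalizes with the
interpreter's current Unicode database, whereas a real profile may require a
specific Unicode version.

```python
import stringprep
import unicodedata


def map_step(text):
    """Apply tables B.1 and B.2 to *text*, then normalize with NFKC.

    Hypothetical helper, not part of the stringprep module.
    """
    mapped = []
    for ch in text:
        if stringprep.in_table_b1(ch):
            continue  # B.1: mapped to nothing, so the character is dropped
        mapped.append(stringprep.map_table_b2(ch))  # B.2: case folding
    return unicodedata.normalize("NFKC", "".join(mapped))


print(map_step("Stra\u00dfe"))     # strasse
print(map_step("Ex\u00adample"))   # example (the soft hyphen is removed)
```
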
.. function:: in_table_c11(code)

   Determine whether *code* is in table C.1.1 (ASCII space characters).


.. function:: in_table_c12(code)

   Determine whether *code* is in table C.1.2 (Non-ASCII space characters).


.. function:: in_table_c11_c12(code)

   Determine whether *code* is in table C.1 (Space characters, union of C.1.1 and
   C.1.2).


.. function:: in_table_c21(code)

   Determine whether *code* is in table C.2.1 (ASCII control characters).


.. function:: in_table_c22(code)

   Determine whether *code* is in table C.2.2 (Non-ASCII control characters).


.. function:: in_table_c21_c22(code)

   Determine whether *code* is in table C.2 (Control characters, union of C.2.1 and
   C.2.2).

.. function:: in_table_c3(code)

   Determine whether *code* is in table C.3 (Private use).


.. function:: in_table_c4(code)

   Determine whether *code* is in table C.4 (Non-character code points).


.. function:: in_table_c5(code)

   Determine whether *code* is in table C.5 (Surrogate codes).


.. function:: in_table_c6(code)

   Determine whether *code* is in table C.6 (Inappropriate for plain text).


.. function:: in_table_c7(code)

   Determine whether *code* is in table C.7 (Inappropriate for canonical
   representation).


.. function:: in_table_c8(code)

   Determine whether *code* is in table C.8 (Change display properties or are
   deprecated).


.. function:: in_table_c9(code)

   Determine whether *code* is in table C.9 (Tagging characters).

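A profile typically selects some of the C tables as prohibited output. The
sketch below, assuming Python 3, combines several characteristic functions
into one check. The particular selection of tables and the names
``PROHIBITED`` and ``find_prohibited`` are assumptions made for illustration;
an actual profile defines its own list.

```python
import stringprep

# Illustrative selection of tables; a real profile specifies its own.
PROHIBITED = (
    stringprep.in_table_c12,
    stringprep.in_table_c22,
    stringprep.in_table_c3,
    stringprep.in_table_c4,
    stringprep.in_table_c5,
    stringprep.in_table_c6,
    stringprep.in_table_c7,
    stringprep.in_table_c8,
    stringprep.in_table_c9,
)


def find_prohibited(text):
    """Return the characters of *text* that fall in any PROHIBITED table."""
    return [ch for ch in text if any(check(ch) for check in PROHIBITED)]


print(find_prohibited("plain"))   # []
# U+00A0 is a non-ASCII space (C.1.2); U+E000 is a private use character (C.3).
offenders = find_prohibited("a\u00a0b\ue000")
print(len(offenders))             # 2
```
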
.. function:: in_table_d1(code)

   Determine whether *code* is in table D.1 (Characters with bidirectional property
   "R" or "AL").


.. function:: in_table_d2(code)

   Determine whether *code* is in table D.2 (Characters with bidirectional property
   "L").

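The D tables support the bidirectional check described in section 6 of
:rfc:`3454`: a string that contains any character from table D.1 must not
contain any character from table D.2, and its first and last characters must
both be in table D.1. A minimal sketch of that check, assuming Python 3, might
look as follows. The helper name ``check_bidi`` is hypothetical, and the
sketch leaves out the section's separate requirement that characters in
table C.8 be prohibited.

```python
import stringprep


def check_bidi(text):
    """Return True if *text* passes the D.1/D.2 bidirectional check.

    Hypothetical helper, not part of the stringprep module.
    """
    r_and_al = [stringprep.in_table_d1(ch) for ch in text]
    if not any(r_and_al):
        return True   # no "R" or "AL" characters, nothing to check
    if any(stringprep.in_table_d2(ch) for ch in text):
        return False  # mixes right-to-left and "L" characters
    # The first and the last character must both be "R" or "AL".
    return r_and_al[0] and r_and_al[-1]


print(check_bidi("abc"))              # True
print(check_bidi("\u05d0\u05d1"))     # True  (two Hebrew letters)
print(check_bidi("\u05d0a"))          # False (Hebrew letter followed by "a")
print(check_bidi("\u05d01"))          # False (ends with a digit)
```
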