/documentation_files/re.py

https://github.com/kensington/kdevelop-python
Python | 516 lines | 323 code | 57 blank | 136 comment | 51 complexity | 6758ae33db65095c77159216625b9315 MD5 | raw file
#!/usr/bin/env python2.7
# -*- coding: utf-8 -*-
""":synopsis: Regular expression operations.
"""
"""IGNORECASE
Perform case-insensitive matching; expressions like ``[A-Z]`` will match
lowercase letters, too. This is not affected by the current locale.
"""
I = None
"""LOCALE
Make ``\w``, ``\W``, ``\b``, ``\B``, ``\s`` and ``\S`` dependent on the
current locale.
"""
L = None
"""MULTILINE
When specified, the pattern character ``'^'`` matches at the beginning of the
string and at the beginning of each line (immediately following each newline);
and the pattern character ``'$'`` matches at the end of the string and at the
end of each line (immediately preceding each newline). By default, ``'^'``
matches only at the beginning of the string, and ``'$'`` only at the end of the
string and immediately before the newline (if any) at the end of the string.
"""
M = None
"""DOTALL
Make the ``'.'`` special character match any character at all, including a
newline; without this flag, ``'.'`` will match anything *except* a newline.
"""
S = None
"""UNICODE
Make ``\w``, ``\W``, ``\b``, ``\B``, ``\d``, ``\D``, ``\s`` and ``\S`` dependent
on the Unicode character properties database.
"""
U = None
'''VERBOSE
This flag allows you to write regular expressions that look nicer. Whitespace
within the pattern is ignored, except when in a character class or preceded by
an unescaped backslash, and, when a line contains a ``'#'`` neither in a
character class nor preceded by an unescaped backslash, all characters from the
leftmost such ``'#'`` through the end of the line are ignored.

That means that the two following regular expression objects that match a
decimal number are functionally equal::

   a = re.compile(r"""\d +  # the integral part
                      \.    # the decimal point
                      \d *  # some fractional digits""", re.X)
   b = re.compile(r"\d+\.\d*")
'''
X = None
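# A minimal sketch of combining the flags above, assuming the standard-library
# ``re`` module is what these stubs document. The sample text and the names
# ``pattern``, ``codes``, ``no_dotall`` and ``with_dotall`` are illustrative only.

```python
import re

# Flags are combined with bitwise OR: case-insensitive and per-line anchors.
pattern = re.compile(r"^error: (\w+)$", re.IGNORECASE | re.MULTILINE)
log = "ok\nERROR: disk\nerror: net\n"
codes = pattern.findall(log)

# Without DOTALL, '.' refuses to match a newline; with it, '.' matches anything.
no_dotall = re.search(r"a.b", "a\nb")
with_dotall = re.search(r"a.b", "a\nb", re.DOTALL)
```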
def compile(pattern, flags=0):
    """
    Compile a regular expression pattern into a regular expression object, which
    can be used for matching using its :func:`match` and :func:`search` methods,
    described below.

    The expression's behaviour can be modified by specifying a *flags* value.
    Values can be any of the flag variables defined above (``I``, ``L``, ``M``,
    ``S``, ``U``, ``X``), combined using bitwise OR (the ``|`` operator).

    The sequence ::

       prog = re.compile(pattern)
       result = prog.match(string)

    is equivalent to ::

       result = re.match(pattern, string)

    but using :func:`re.compile` and saving the resulting regular expression
    object for reuse is more efficient when the expression will be used several
    times in a single program.
    """
    pass
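# A small sketch of the equivalence described in ``compile``, assuming the
# standard-library ``re`` module. The input list and the names ``prog``,
# ``compiled_hits`` and ``direct_hits`` are made up for illustration.

```python
import re

lines = ["42 apples", "no digits", "7 pears"]

# Compile once, then reuse the pattern object for every line.
prog = re.compile(r"\d+")
compiled_hits = [prog.match(s) is not None for s in lines]

# The module-level function gives the same answers without an explicit compile.
direct_hits = [re.match(r"\d+", s) is not None for s in lines]
```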
def search(pattern, string, flags=0):
    """
    Scan through *string* looking for a location where the regular expression
    *pattern* produces a match, and return a corresponding :class:`MatchObject`
    instance. Return ``None`` if no position in the string matches the pattern; note
    that this is different from finding a zero-length match at some point in the
    string.
    """
    pass
def match(pattern, string, flags=0):
    """
    If zero or more characters at the beginning of *string* match the regular
    expression *pattern*, return a corresponding :class:`MatchObject` instance.
    Return ``None`` if the string does not match the pattern; note that this is
    different from a zero-length match.
    """
    pass
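# A short sketch contrasting ``match`` and ``search``, and showing that a
# zero-length match is a match object rather than ``None``. It assumes the
# standard-library ``re`` module; the sample text and names are illustrative.

```python
import re

text = "say hello"

# match() only tries at the beginning of the string.
at_start = re.match(r"hello", text)

# search() scans forward until some position matches.
anywhere = re.search(r"hello", text)

# 'x*' can match zero characters, so this succeeds with an empty match.
empty = re.match(r"x*", text)
```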
def split(pattern, string, maxsplit=0, flags=0):
    """
    Split *string* by the occurrences of *pattern*. If capturing parentheses are
    used in *pattern*, then the text of all groups in the pattern is also returned
    as part of the resulting list. If *maxsplit* is nonzero, at most *maxsplit*
    splits occur, and the remainder of the string is returned as the final element
    of the list. (Incompatibility note: in the original Python 1.5 release,
    *maxsplit* was ignored. This has been fixed in later releases.)

    >>> re.split('\W+', 'Words, words, words.')
    ['Words', 'words', 'words', '']
    >>> re.split('(\W+)', 'Words, words, words.')
    ['Words', ', ', 'words', ', ', 'words', '.', '']
    >>> re.split('\W+', 'Words, words, words.', 1)
    ['Words', 'words, words.']
    >>> re.split('[a-f]+', '0a3B9', flags=re.IGNORECASE)
    ['0', '3', '9']

    If there are capturing groups in the separator and it matches at the start of
    the string, the result will start with an empty string. The same holds for
    the end of the string:

    >>> re.split('(\W+)', 'morewords, wordsmore')
    ['', 'more', 'words', ', ', 'words', 'more', '']

    That way, separator components are always found at the same relative
    indices within the result list (e.g., if there's one capturing group
    in the separator, the 0th, the 2nd and so forth).

    Note that *split* will never split a string on an empty pattern match.
    For example:

    >>> re.split('x*', 'foo')
    ['foo']
    >>> re.split("(?m)^$", "foo\n\nbar\n")
    ['foo\n\nbar\n']
    """
    pass
def findall(pattern, string, flags=0):
    """
    Return all non-overlapping matches of *pattern* in *string*, as a list of
    strings. The *string* is scanned left-to-right, and matches are returned in
    the order found. If one or more groups are present in the pattern, return a
    list of groups; this will be a list of tuples if the pattern has more than
    one group. Empty matches are included in the result unless they touch the
    beginning of another match.
    """
    pass
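# A brief sketch of how the number of groups changes what ``findall`` returns,
# assuming the standard-library ``re`` module. The sample text and the names
# ``words``, ``values`` and ``pairs`` are illustrative.

```python
import re

text = "width=10, height=20"

# No groups: a list of the full matches.
words = re.findall(r"[a-z]+", text)

# One group: a list of that group's text.
values = re.findall(r"=(\d+)", text)

# Two groups: a list of tuples, one tuple per match.
pairs = re.findall(r"([a-z]+)=(\d+)", text)
```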
def finditer(pattern, string, flags=0):
    """
    Return an :term:`iterator` yielding :class:`MatchObject` instances over all
    non-overlapping matches for the RE *pattern* in *string*. The *string* is
    scanned left-to-right, and matches are returned in the order found. Empty
    matches are included in the result unless they touch the beginning of another
    match.
    """
    pass
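# A minimal sketch of iterating over matches with ``finditer`` when positions
# are needed as well as text. It assumes the standard-library ``re`` module;
# the sample text and the name ``spans`` are illustrative.

```python
import re

text = "a1 b22 c333"

# Each yielded match object carries the matched text and its position.
spans = [(m.group(0), m.start(), m.end()) for m in re.finditer(r"\d+", text)]
```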
def sub(pattern, repl, string, count=0, flags=0):
    """
    Return the string obtained by replacing the leftmost non-overlapping occurrences
    of *pattern* in *string* by the replacement *repl*. If the pattern isn't found,
    *string* is returned unchanged. *repl* can be a string or a function; if it is
    a string, any backslash escapes in it are processed. That is, ``\n`` is
    converted to a single newline character, ``\r`` is converted to a carriage
    return, and so forth. Unknown escapes such as ``\j`` are left alone.
    Backreferences, such as ``\6``, are replaced with the substring matched by
    group 6 in the pattern. For example:

    >>> re.sub(r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):',
    ...        r'static PyObject*\npy_\1(void)\n{',
    ...        'def myfunc():')
    'static PyObject*\npy_myfunc(void)\n{'

    If *repl* is a function, it is called for every non-overlapping occurrence of
    *pattern*. The function takes a single match object argument, and returns the
    replacement string. For example:

    >>> def dashrepl(matchobj):
    ...     if matchobj.group(0) == '-': return ' '
    ...     else: return '-'
    >>> re.sub('-{1,2}', dashrepl, 'pro----gram-files')
    'pro--gram files'
    >>> re.sub(r'\sAND\s', ' & ', 'Baked Beans And Spam', flags=re.IGNORECASE)
    'Baked Beans & Spam'

    The pattern may be a string or an RE object.

    The optional argument *count* is the maximum number of pattern occurrences to be
    replaced; *count* must be a non-negative integer. If omitted or zero, all
    occurrences will be replaced. Empty matches for the pattern are replaced only
    when not adjacent to a previous match, so ``sub('x*', '-', 'abc')`` returns
    ``'-a-b-c-'``.

    In addition to character escapes and backreferences as described above,
    ``\g<name>`` will use the substring matched by the group named ``name``, as
    defined by the ``(?P<name>...)`` syntax. ``\g<number>`` uses the corresponding
    group number; ``\g<2>`` is therefore equivalent to ``\2``, but isn't ambiguous
    in a replacement such as ``\g<2>0``. ``\20`` would be interpreted as a
    reference to group 20, not a reference to group 2 followed by the literal
    character ``'0'``. The backreference ``\g<0>`` substitutes in the entire
    substring matched by the RE.
    """
    pass
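# A short sketch of the ``\g<name>`` and ``\g<number>`` replacement forms and of
# *count*, assuming the standard-library ``re`` module. The inputs and the names
# ``swapped``, ``padded`` and ``first_only`` are illustrative.

```python
import re

# Named backreferences reorder the two captured words.
swapped = re.sub(r"(?P<first>\w+) (?P<last>\w+)", r"\g<last>, \g<first>",
                 "Isaac Newton")

# \g<1>0 means "group 1, then a literal 0"; \10 would mean group 10.
padded = re.sub(r"(\d)", r"\g<1>0", "a1b2")

# count=1 replaces only the leftmost occurrence.
first_only = re.sub(r"-", "+", "a-b-c", count=1)
```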
def subn(pattern, repl, string, count=0, flags=0):
    """
    Perform the same operation as :func:`sub`, but return a tuple ``(new_string,
    number_of_subs_made)``.
    """
    pass
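# A minimal sketch of unpacking the tuple returned by ``subn``, assuming the
# standard-library ``re`` module. The input string and the names ``new_text``
# and ``n_subs`` are illustrative.

```python
import re

# Collapse each run of whitespace to one space and count the replacements.
new_text, n_subs = re.subn(r"\s+", " ", "a  b \t c")
```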
def escape(string):
    """
    Return *string* with all non-alphanumerics backslashed; this is useful if you
    want to match an arbitrary literal string that may have regular expression
    metacharacters in it.
    """
    pass
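# A small sketch of why ``escape`` matters when the text to find contains
# metacharacters. It assumes the standard-library ``re`` module; the sample
# strings and names are illustrative. The exact escaped form is not shown
# because it can differ between Python versions.

```python
import re

literal = "1+1=2 (approx.)"
haystack = "so 1+1=2 (approx.) holds"

# Escaped, the pattern matches the literal text character for character.
found = re.search(re.escape(literal), haystack)

# Unescaped, '+', '(' , ')' and '.' are interpreted as regex syntax instead.
unescaped = re.search(literal, haystack)
```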
def purge():
    """
    Clear the regular expression cache.
    """
    pass
class RegexObject:
    """
    The :class:`RegexObject` class supports the following methods and attributes:
    """
    def __init__(self):
        pass
    def search(self, string, pos, endpos):
        """
        Scan through *string* looking for a location where this regular expression
        produces a match, and return a corresponding :class:`MatchObject` instance.
        Return ``None`` if no position in the string matches the pattern; note that this
        is different from finding a zero-length match at some point in the string.

        The optional second parameter *pos* gives an index in the string where the
        search is to start; it defaults to ``0``. This is not completely equivalent to
        slicing the string; the ``'^'`` pattern character matches at the real beginning
        of the string and at positions just after a newline, but not necessarily at the
        index where the search is to start.

        The optional parameter *endpos* limits how far the string will be searched; it
        will be as if the string is *endpos* characters long, so only the characters
        from *pos* to ``endpos - 1`` will be searched for a match. If *endpos* is less
        than *pos*, no match will be found; otherwise, if *rx* is a compiled regular
        expression object, ``rx.search(string, 0, 50)`` is equivalent to
        ``rx.search(string[:50], 0)``.

        >>> pattern = re.compile("d")
        >>> pattern.search("dog")     # Match at index 0
        <_sre.SRE_Match object at ...>
        >>> pattern.search("dog", 1)  # No match; search doesn't include the "d"
        """
        pass
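# A brief sketch of how *pos* differs from slicing, and of *endpos* as a bound,
# assuming the standard-library ``re`` module. The patterns and the names
# ``sliced``, ``with_pos`` and ``bounded`` are illustrative.

```python
import re

rx = re.compile(r"^b")

# Slicing creates a new string, so '^' anchors at the slice's first character.
sliced = rx.search("abc"[1:])

# With pos=1 the search starts at 'b', but '^' still means the real beginning.
with_pos = rx.search("abc", 1)

# endpos=2 treats the string as if it were only "ab", so no 'c' is visible.
bounded = re.compile(r"c").search("abcabc", 0, 2)
```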
    def match(self, string, pos, endpos):
        """
        If zero or more characters at the *beginning* of *string* match this regular
        expression, return a corresponding :class:`MatchObject` instance. Return
        ``None`` if the string does not match the pattern; note that this is different
        from a zero-length match.

        The optional *pos* and *endpos* parameters have the same meaning as for the
        :meth:`~RegexObject.search` method.
        """
        pass
    def split(self, string, maxsplit=0):
        """
        Identical to the :func:`split` function, using the compiled pattern.
        """
        pass
    def findall(self, string, pos, endpos):
        """
        Similar to the :func:`findall` function, using the compiled pattern, but
        also accepts optional *pos* and *endpos* parameters that limit the search
        region as for :meth:`match`.
        """
        pass
    def finditer(self, string, pos, endpos):
        """
        Similar to the :func:`finditer` function, using the compiled pattern, but
        also accepts optional *pos* and *endpos* parameters that limit the search
        region as for :meth:`match`.
        """
        pass
    def sub(self, repl, string, count=0):
        """
        Identical to the :func:`sub` function, using the compiled pattern.
        """
        pass
    def subn(self, repl, string, count=0):
        """
        Identical to the :func:`subn` function, using the compiled pattern.
        """
        pass
class MatchObject:
    """
    Match Objects always have a boolean value of :const:`True`, so that you can test
    whether e.g. :func:`match` resulted in a match with a simple if statement. They
    support the following methods and attributes:
    """
    def __init__(self):
        pass
    def expand(self, template):
        """
        Return the string obtained by doing backslash substitution on the template
        string *template*, as done by the :meth:`~RegexObject.sub` method. Escapes
        such as ``\n`` are converted to the appropriate characters, and numeric
        backreferences (``\1``, ``\2``) and named backreferences (``\g<1>``,
        ``\g<name>``) are replaced by the contents of the corresponding group.
        """
        pass
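# A minimal sketch of ``expand`` filling a template from one match, assuming the
# standard-library ``re`` module. The input string, the group name ``key`` and
# the variable names are illustrative.

```python
import re

m = re.match(r"(?P<key>\w+)=(\d+)", "port=8080")

# \g<key> refers to the named group, \2 to the second group by number.
line = m.expand(r"\g<key> -> \2")
```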
    def group(self, group1=0, *groupN):
        """
        Returns one or more subgroups of the match. If there is a single argument, the
        result is a single string; if there are multiple arguments, the result is a
        tuple with one item per argument. Without arguments, *group1* defaults to zero
        (the whole match is returned). If a *groupN* argument is zero, the corresponding
        return value is the entire matching string; if it is in the inclusive range
        [1..99], it is the string matching the corresponding parenthesized group. If a
        group number is negative or larger than the number of groups defined in the
        pattern, an :exc:`IndexError` exception is raised. If a group is contained in a
        part of the pattern that did not match, the corresponding result is ``None``.
        If a group is contained in a part of the pattern that matched multiple times,
        the last match is returned.

        >>> m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
        >>> m.group(0)       # The entire match
        'Isaac Newton'
        >>> m.group(1)       # The first parenthesized subgroup.
        'Isaac'
        >>> m.group(2)       # The second parenthesized subgroup.
        'Newton'
        >>> m.group(1, 2)    # Multiple arguments give us a tuple.
        ('Isaac', 'Newton')

        If the regular expression uses the ``(?P<name>...)`` syntax, the *groupN*
        arguments may also be strings identifying groups by their group name. If a
        string argument is not used as a group name in the pattern, an :exc:`IndexError`
        exception is raised.

        A moderately complicated example:

        >>> m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcolm Reynolds")
        >>> m.group('first_name')
        'Malcolm'
        >>> m.group('last_name')
        'Reynolds'

        Named groups can also be referred to by their index:

        >>> m.group(1)
        'Malcolm'
        >>> m.group(2)
        'Reynolds'

        If a group matches multiple times, only the last match is accessible:

        >>> m = re.match(r"(..)+", "a1b2c3")  # Matches 3 times.
        >>> m.group(1)                        # Returns only the last match.
        'c3'
        """
        pass
    def groups(self, default=None):
        """
        Return a tuple containing all the subgroups of the match, from 1 up to however
        many groups are in the pattern. The *default* argument is used for groups that
        did not participate in the match; it defaults to ``None``. (Incompatibility
        note: in the original Python 1.5 release, if the tuple was one element long, a
        string would be returned instead. In later versions (from 1.5.1 on), a
        singleton tuple is returned in such cases.)

        For example:

        >>> m = re.match(r"(\d+)\.(\d+)", "24.1632")
        >>> m.groups()
        ('24', '1632')

        If we make the decimal place and everything after it optional, not all groups
        might participate in the match. These groups will default to ``None`` unless
        the *default* argument is given:

        >>> m = re.match(r"(\d+)\.?(\d+)?", "24")
        >>> m.groups()      # Second group defaults to None.
        ('24', None)
        >>> m.groups('0')   # Now, the second group defaults to '0'.
        ('24', '0')
        """
        pass
    def groupdict(self, default=None):
        """
        Return a dictionary containing all the *named* subgroups of the match, keyed by
        the subgroup name. The *default* argument is used for groups that did not
        participate in the match; it defaults to ``None``. For example:

        >>> m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcolm Reynolds")
        >>> m.groupdict()
        {'first_name': 'Malcolm', 'last_name': 'Reynolds'}
        """
        pass
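# A short sketch of the *default* argument of ``groupdict`` when a named group
# does not participate in the match. It assumes the standard-library ``re``
# module; the group names ``major`` and ``minor`` and the input are illustrative.

```python
import re

# The minor part is optional, so it does not participate for the input "3".
m = re.match(r"(?P<major>\d+)(?:\.(?P<minor>\d+))?", "3")

plain = m.groupdict()        # missing groups map to None
filled = m.groupdict("0")    # missing groups map to the given default
```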
    def start(self, group=0):
        """MatchObject.end([group])
        Return the indices of the start and end of the substring matched by *group*;
        *group* defaults to zero (meaning the whole matched substring). Return ``-1`` if
        *group* exists but did not contribute to the match. For a match object *m*, and
        a group *g* that did contribute to the match, the substring matched by group *g*
        (equivalent to ``m.group(g)``) is ::

           m.string[m.start(g):m.end(g)]

        Note that ``m.start(group)`` will equal ``m.end(group)`` if *group* matched a
        null string. For example, after ``m = re.search('b(c?)', 'cba')``,
        ``m.start(0)`` is 1, ``m.end(0)`` is 2, ``m.start(1)`` and ``m.end(1)`` are both
        2, and ``m.start(2)`` raises an :exc:`IndexError` exception.

        An example that will remove *remove_this* from email addresses:

        >>> email = "tony@tiremove_thisger.net"
        >>> m = re.search("remove_this", email)
        >>> email[:m.start()] + email[m.end():]
        'tony@tiger.net'
        """
        pass
    def span(self, group=0):
        """
        For :class:`MatchObject` *m*, return the 2-tuple ``(m.start(group),
        m.end(group))``. Note that if *group* did not contribute to the match, this is
        ``(-1, -1)``. *group* defaults to zero, the entire match.
        """
        pass
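# A minimal sketch of ``span`` and ``start`` for groups that matched text,
# matched the empty string, or did not participate at all. It assumes the
# standard-library ``re`` module; the pattern extends the 'cba' example with a
# hypothetical optional second group.

```python
import re

m = re.search(r"b(c?)(d)?", "cba")

whole = m.span()          # the 'b' at index 1
empty_group = m.span(1)   # group 1 matched the empty string at index 2
unused_group = m.span(2)  # group 2 exists but did not contribute
unused_start = m.start(2)
```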