/Lib/test/crashers/bogus_sre_bytecode.py

http://unladen-swallow.googlecode.com/
"""
The regular expression engine in '_sre' can segfault when interpreting
bogus bytecode.

It is unclear whether this is a real bug or a "won't fix" case like
bogus_code_obj.py, because it requires bytecode that is built by hand,
as opposed to compiled by 're' from a string-source regexp.  The
difference from bogus_code_obj, though, is that the only existing regexp
compiler is written in Python, so the C code has no choice but to
accept arbitrary bytecode from Python level.

The test below builds and runs random bytecodes until 'match' crashes
Python.  I have not investigated exactly why the segfaults occur or how
hard they would be to fix.  Here are a few examples of 'code' that
segfault for me:

    [21, 50814, 8, 29, 16]
    [21, 3967, 26, 10, 23, 54113]
    [29, 23, 0, 2, 5]
    [31, 64351, 0, 28, 3, 22281, 20, 4463, 9, 25, 59154, 15245, 2,
                  16343, 3, 11600, 24380, 10, 37556, 10, 31, 15, 31]

Here is also a 'code' that triggers an infinite, uninterruptible loop:

    [29, 1, 8, 21, 1, 43083, 6]

"""
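
import _sre
import random

def pick():
    # Draw from [-65536, 65535]; negative draws are masked down to 0..31,
    # so roughly half of the results are small values (presumably more
    # likely to be read as opcodes) and the rest are arbitrary 16-bit values.
    n = random.randrange(-65536, 65536)
    if n < 0:
        n &= 31
    return n

# Subject strings to match against: empty, short, and long.
ss = ["", "world", "x" * 500]

while True:
    code = [pick() for i in range(random.randrange(5, 25))]
    # Print before compiling so the last line of output is the code
    # that was running when the interpreter died.
    print(code)
    pat = _sre.compile(None, 0, code)
    for s in ss:
        try:
            pat.match(s)
        except RuntimeError:
            # An ordinary exception is not interesting here; keep going.
            pass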
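
Because a segfault or an uninterruptible loop takes the whole interpreter down with it, it can be convenient to replay a single 'code' list in a child process and look at how the child ended. The following is only a sketch of that idea, not part of the original crasher. The names `replay_source` and `classify`, the `python` parameter, and the timeout value are all hypothetical. The harness itself assumes Python 3.5+ on a POSIX system (where a negative return code means the child was killed by a signal), while the child can be pointed at whichever interpreter is under test. The `_sre.compile(None, 0, code)` call shape is copied from the listing above; other interpreter versions may expect different arguments.

```python
import subprocess
import sys


def replay_source(code, subject=""):
    # Build the source of a tiny child script that replays one 'code' list.
    # The _sre.compile call shape is taken from the listing above (assumption:
    # the interpreter under test accepts these three arguments).
    return (
        "import _sre\n"
        "pat = _sre.compile(None, 0, %r)\n"
        "try:\n"
        "    pat.match(%r)\n"
        "except RuntimeError:\n"
        "    pass\n" % (list(code), subject)
    )


def classify(child_source, python=sys.executable, timeout=5.0):
    # Run child_source in a separate interpreter and report how it ended.
    try:
        proc = subprocess.run(
            [python, "-c", child_source],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # The child was still running at the deadline (e.g. an endless loop);
        # subprocess.run kills it before raising.
        return "timeout"
    if proc.returncode < 0:
        # POSIX only: the child was terminated by signal number -returncode.
        return "killed by signal %d" % -proc.returncode
    return "exit %d" % proc.returncode
```

A call such as `classify(replay_source([29, 23, 0, 2, 5]), python="/path/to/python")` would then distinguish a signal death or a timeout from an ordinary exit, without affecting the parent process. The interpreter path here is a placeholder.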