/python26/Lib/test/crashers/bogus_sre_bytecode.py

"""
The regular expression engine in '_sre' can segfault when interpreting
bogus bytecode.

It is unclear whether this is a real bug or a "won't fix" case like
bogus_code_obj.py, because it requires bytecode that is built by hand,
as opposed to compiled by 're' from a string-source regexp.  The
difference with bogus_code_obj, though, is that the only existing regexp
compiler is written in Python, so the C code has no choice but to
accept arbitrary bytecode from the Python level.

The test below builds and runs random bytecodes until 'match' crashes
Python.  I have not investigated exactly why the segfaults occur or how
hard they would be to fix.  Here are a few examples of 'code' that
segfault for me:

    [21, 50814, 8, 29, 16]
    [21, 3967, 26, 10, 23, 54113]
    [29, 23, 0, 2, 5]
    [31, 64351, 0, 28, 3, 22281, 20, 4463, 9, 25, 59154, 15245, 2,
     16343, 3, 11600, 24380, 10, 37556, 10, 31, 15, 31]

Here is also a 'code' that triggers an infinite uninterruptible loop:

    [29, 1, 8, 21, 1, 43083, 6]

"""
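import _sre
import random


def pick():
    # Non-negative draws are kept as-is (0..65535); negative draws are
    # masked down to 0..31, so about half of all values are small.
    n = random.randrange(-65536, 65536)
    if n < 0:
        n &= 31
    return n


# Subject strings to match each random pattern against.
ss = ["", "world", "x" * 500]

# Loop until 'match' crashes the interpreter.
while 1:
    code = [pick() for i in range(random.randrange(5, 25))]
    # Print first, so the last line of output identifies the crashing 'code'.
    print code
    pat = _sre.compile(None, 0, code)
    for s in ss:
        try:
            pat.match(s)
        except RuntimeError:
            # An ordinary exception is not the crash being looked for.
            pass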
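
The script above stops at the first segfault and cannot get past a 'code' that loops forever. A possible variant, sketched here under stated assumptions, replays each attempt in a child interpreter so that the driver survives and can tell the outcomes apart. It is not part of the original file: `make_snippet` and `run_isolated` are hypothetical helper names, the driver is assumed to be a recent Python 3 (for `subprocess.run` with `timeout`), and the three-argument `_sre.compile` call is simply copied from the script, so it may not fit other interpreter versions.

```python
import subprocess
import sys


def make_snippet(code, subject):
    """Build child-process source that replays one (code, subject) attempt.

    Assumption: the three-argument _sre.compile call is copied from the
    script above; other interpreter versions may expect a different one.
    """
    return "import _sre\n_sre.compile(None, 0, %r).match(%r)\n" % (code, subject)


def run_isolated(source, python=sys.executable, timeout=5.0):
    """Run `source` in a fresh interpreter and classify how it ended."""
    try:
        proc = subprocess.run(
            [python, "-c", source],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # The child was killed after `timeout` seconds, e.g. an endless loop.
        return "hang"
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # POSIX only: the child was terminated by a signal such as SIGSEGV.
        return "crash"
    # Ordinary non-zero exit, e.g. an uncaught exception in the child.
    return "error"
```

A driver could then call `run_isolated(make_snippet(code, s), python=...)` for each generated `code` and subject string, pointing `python` at the interpreter under test, and log only the attempts reported as "crash" or "hang". The negative-return-code check is POSIX-specific; on other platforms a crash shows up as a positive exit status and would be reported as "error" by this sketch.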