
/scripts/ob_export_remove_invalid_dashes.py

https://github.com/wangmxf/lesswrong
Possible License(s): MPL-2.0-no-copyleft-exception, LGPL-2.1
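#!/usr/bin/env python
"""Clean a blog export file: shorten stray dash lines and strip control characters."""
import io
import re
import sys

# Lines of 5 or 8 dashes act as separators in the export. A dash line that is
# not followed by another separator, a known field name, or end of file is
# treated as stray body text and shortened to four dashes.
RE_INVALID_DASHES = re.compile(
    r'^-----(---)?(\r)?\n'
    r'(?!--------|BODY:|EXTENDED BODY:|EXCERPT:|KEYWORDS:|AUTHOR:|COMMENT:|PING:|\Z)',
    re.MULTILINE)

# ASSUMPTION: the literal control characters inside the original character
# class did not survive in this copy. This class (C0 controls except tab, LF
# and CR) is a stand-in, not the author's exact set.
CONTROL_CHARS = r'[\x00-\x08\x0b\x0c\x0e-\x1f]'


def main():
    infilename, outfilename = sys.argv[1:3]
    # newline='' keeps \r\n pairs intact so the (\r)? in the pattern can match.
    with io.open(infilename, encoding='utf-8', newline='') as infile:
        buf = infile.read()

    buf = RE_INVALID_DASHES.sub('----\n', buf)
    # Drop a control character that touches whitespace; replace any other with a space.
    buf = re.sub(r'(\s)' + CONTROL_CHARS, r'\1', buf)
    buf = re.sub(CONTROL_CHARS + r'(\s)', r'\1', buf)
    buf = re.sub(CONTROL_CHARS, ' ', buf)

    with io.open(outfilename, 'w', encoding='utf-8', newline='') as outfile:
        outfile.write(buf)


if __name__ == '__main__':
    main()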
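
The dash-line substitution is the least obvious step in the script, so here is a minimal sketch of it in isolation, written for Python 3. The pattern is copied from the script. The helper name `fix_dashes` and the sample text are made up for illustration and do not come from the repository.

```python
import re

# Same pattern as the script's invalid-dash regex, as a Python 3 raw string.
RE_INVALID_DASHES = re.compile(
    r'^-----(---)?(\r)?\n'
    r'(?!--------|BODY:|EXTENDED BODY:|EXCERPT:|KEYWORDS:|AUTHOR:|COMMENT:|PING:|\Z)',
    re.MULTILINE)


def fix_dashes(text):
    """Hypothetical helper: shorten stray separator lines to four dashes."""
    return RE_INVALID_DASHES.sub('----\n', text)


# Made-up sample, not taken from a real export file.
sample = (
    "AUTHOR: alice\n"
    "-----\n"              # followed by BODY: so it is a real separator
    "BODY:\n"
    "first paragraph\n"
    "-----\n"              # followed by body text so it is treated as stray
    "second paragraph\n"
    "-----\n"              # followed by -------- so it is a real separator
    "--------\n"           # followed by end of input so it is left alone
)

cleaned = fix_dashes(sample)
print(cleaned)
```

Only the dash line that precedes `second paragraph` is rewritten. The negative lookahead is what protects the genuine separators: a dash line survives whenever the next line starts with a field name, is the 8-dash entry separator, or the input ends.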