Possible License(s): Apache-2.0
This is a pretty hard document: a scanned, warped page from a historical
newspaper. It's mostly here as a test case to see how we can improve OCRopus
in the future.

The script illustrates how to adjust the layout analysis parameters
in ocropus-gpageseg for these kinds of documents. Note that the output
still contains some layout analysis errors.

Better character recognition performance will require retraining the models
on historical books and newspaper prints (the current models are trained
on modern scanned documents only).
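
As a rough illustration of what adjusting the layout analysis parameters can look like, here is a minimal Python sketch that only assembles an ocropus-gpageseg command line; it does not run the tool. The flag names (-n, --maxcolseps, --maxseps) are given as recalled from the tool's --help output and should be checked against your installed version. The values, the file name, and the helper build_gpageseg_command are hypothetical and are not taken from the script in this directory.

```python
import shlex

# Hypothetical values for a multi-column newspaper page; tune them per document.
LAYOUT_PARAMS = {
    "--maxcolseps": 6,  # allow more whitespace column separators than a book page needs
    "--maxseps": 4,     # also look for black (ruled) column separators
}


def build_gpageseg_command(image_path, params=LAYOUT_PARAMS, nocheck=True):
    """Return the argument list for one ocropus-gpageseg invocation."""
    cmd = ["ocropus-gpageseg"]
    if nocheck:
        # Assumed flag: skip the input sanity checks, which unusual scans may fail.
        cmd.append("-n")
    for flag, value in params.items():
        cmd += [flag, str(value)]
    cmd.append(image_path)
    return cmd


if __name__ == "__main__":
    # Print the command so it can be inspected before running it by hand.
    print(shlex.join(build_gpageseg_command("book/0001.bin.png")))
```

Keeping the parameters in one dictionary makes it easy to try a different set of values on the same page and compare the resulting segmentations.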