Possible License(s): Apache-2.0
- This is a difficult document: a scanned, warped historical newspaper
- page. It is included mainly as a test case for finding ways to improve
- OCRopus in the future.
- The script illustrates how to adjust the layout analysis parameters
- in ocropus-gpageseg for these kinds of documents. Note that the
- resulting segmentation still contains some layout analysis errors.
- Better character recognition performance will require retraining models
- on historical books and newspaper prints (the current models are trained
- on modern scanned documents only).
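
As a rough illustration of the parameter adjustment described above, the sketch below assembles an ocropus-gpageseg command line with non-default layout settings. It is not the script shipped with this test case. The flag names (-n, --maxcolseps, --maxseps, --scale) are the ones listed by the ocropy version of the tool as far as I know, so check them against `ocropus-gpageseg --help`. The numeric values, the helper function names, and the input file name page.bin.png are assumptions made up for this example.

```python
import shutil
import subprocess


def build_gpageseg_command(image_path, maxcolseps=3, maxseps=0, scale=0.0, nocheck=True):
    """Return an argv list for ocropus-gpageseg.

    Hypothetical helper: the defaults mirror what the tool is believed to
    use, and callers override them for difficult pages.
    """
    cmd = ["ocropus-gpageseg"]
    if nocheck:
        # Skip the page-size and line-count sanity checks, which unusual
        # scans such as newspaper pages may fail.
        cmd.append("-n")
    cmd += [
        "--maxcolseps", str(maxcolseps),  # max number of whitespace column separators
        "--maxseps", str(maxseps),        # max number of black (ruled) column separators
        "--scale", str(scale),            # 0.0 lets the tool estimate the text scale
    ]
    cmd.append(image_path)
    return cmd


def run_if_available(cmd):
    """Run the command only if the tool is installed; otherwise return None."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Illustrative values for a multi-column newspaper page (assumed, not tuned).
    command = build_gpageseg_command("page.bin.png", maxcolseps=5, maxseps=2)
    print(" ".join(command))
    run_if_available(command)
```

Building the argument list separately from running it makes it easy to print and compare parameter sets before committing to a long segmentation run.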