Adaptive Edit-Distance and Regression Approach for Post-OCR Text Correction

Abstract

  • Post-processing is a crucial step in improving the performance of the OCR process. In this paper, we present a novel approach that explores a modified way of generating and scoring candidates at both the character level and the word level. These features are combined with important features suggested by related work to rank candidates in a regression model. Experimental results show that our approach is comparable to the top-performing approaches in the ICDAR 2017 Post-OCR Text Correction competition.

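The abstract describes a pipeline of candidate generation, candidate scoring, and regression-based ranking. The following is a minimal sketch of that general shape, not the paper's method: plain Levenshtein distance stands in for the adaptive edit distance (whose details are not given here), and the names `generate_candidates`, `features`, `rank`, the two features, and the values in `WEIGHTS` are all hypothetical. In a real system the weights would come from a regression model fitted on aligned OCR and ground-truth text.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insert, delete, substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def generate_candidates(token: str, lexicon: Dict[str, int],
                        max_dist: int = 2) -> List[Tuple[str, int]]:
    """Return lexicon words within max_dist edits of the OCR token."""
    candidates = []
    for word in lexicon:
        dist = edit_distance(token, word)
        if dist <= max_dist:
            candidates.append((word, dist))
    return candidates


def features(token: str, cand: str, dist: int,
             lexicon: Dict[str, int]) -> List[float]:
    """Two illustrative features: string similarity and relative frequency."""
    similarity = 1.0 - dist / max(len(token), len(cand), 1)
    frequency = lexicon[cand] / sum(lexicon.values())
    return [similarity, frequency]


# Hypothetical weights; assumed to be learned by a regression model in practice.
WEIGHTS = [0.7, 0.3]


def rank(token: str, lexicon: Dict[str, int],
         max_dist: int = 2) -> List[Tuple[str, float]]:
    """Score each candidate as a weighted sum of features, best first."""
    scored = []
    for cand, dist in generate_candidates(token, lexicon, max_dist):
        feats = features(token, cand, dist, lexicon)
        score = sum(w * f for w, f in zip(WEIGHTS, feats))
        scored.append((cand, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With a toy lexicon such as `{"the": 100, "then": 20, "tho": 1, "cat": 30}`, calling `rank("tbe", lexicon)` places `"the"` first, because it is both the closest candidate and the most frequent one. `"cat"` is three edits away and is never generated as a candidate.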
