- Yuan-Yuan Lv, Yong-Li Deng, Ming-Liang Liu, Qi-Yong Lu: Automatic Error Checking and Correction of Electronic Medical Records. FSDM 2015: 32-40
Download requires payment.
Abstract
In this paper, an effective error checking and correction method for Chinese medical records recognized by OCR is proposed. In our research, an optimized N-gram language model based on vocabulary rather than words is adopted to correct errors, and supervised machine learning based on maximum entropy (MaxEnt) is deployed to build a model for tokenization and named entity recognition. A medical knowledge base (MKB) is established, including dictionaries of medicine, symptoms, diseases, etc., and the frequency of each word as it appeared in the study corpus. Furthermore, a Knowledge Base for Error correction (KBE) is built to automatically correct high-frequency errors. With the developed approach, the accuracy rate of the electronic medical record increases from 85.20% to 95.72%, indicating an error reduction of 71.08%.
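The 71.08% figure appears to be the relative reduction in error rate, not the difference in accuracy. A quick check, assuming error rate is simply 100% minus the reported accuracy (the function name is only illustrative):

```python
def relative_error_reduction(acc_before: float, acc_after: float) -> float:
    """Fraction of the original errors removed, given accuracies in percent."""
    err_before = 100.0 - acc_before  # 14.80% of content wrong before correction
    err_after = 100.0 - acc_after    # 4.28% of content wrong after correction
    return (err_before - err_after) / err_before


reduction = relative_error_reduction(85.20, 95.72)
print(f"{reduction:.2%}")  # prints 71.08%
```

That is, (14.80 - 4.28) / 14.80 ≈ 0.7108, which matches the number quoted in the abstract.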
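To make the general idea of N-gram-based correction concrete, here is a minimal sketch. It is not the authors' method: the abstract does not describe their optimized model, so this assumes a plain bigram model with add-one smoothing and a greedy left-to-right pass. The corpus, the `CONFUSIONS` table (loosely playing the role of a KBE-style lookup), and all function names are hypothetical, and English tokens stand in for Chinese ones for readability.

```python
import math
from collections import Counter

# Toy corpus of already-tokenized sentences (hypothetical data).
CORPUS = [
    ["patient", "reports", "chest", "pain"],
    ["patient", "denies", "chest", "pain"],
    ["patient", "reports", "mild", "fever"],
]

# Hypothetical confusion table: OCR output token -> candidate intended tokens.
CONFUSIONS = {"chesl": ["chest", "chess"], "pajn": ["pain", "pan"]}

unigrams = Counter(w for s in CORPUS for w in s)
bigrams = Counter(p for s in CORPUS for p in zip(s, s[1:]))
VOCAB_SIZE = len(unigrams)


def bigram_logprob(prev: str, word: str) -> float:
    """Log P(word | prev) with add-one smoothing."""
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + VOCAB_SIZE))


def correct(tokens: list[str]) -> list[str]:
    """Greedily replace unknown tokens with the best-scoring candidate."""
    corrected: list[str] = []
    for tok in tokens:
        candidates = CONFUSIONS.get(tok, [])
        if tok in unigrams or not candidates:
            corrected.append(tok)  # known word, or no candidates: keep as-is
            continue
        if corrected:
            prev = corrected[-1]
            best = max(candidates, key=lambda c: bigram_logprob(prev, c))
        else:
            # Sentence-initial token: fall back to unigram frequency.
            best = max(candidates, key=lambda c: unigrams[c])
        corrected.append(best)
    return corrected


result = correct(["patient", "reports", "chesl", "pajn"])
print(result)
```

The greedy pass conditions each choice only on the previously corrected token; a fuller system would likely score whole candidate sequences and use word frequencies from a domain dictionary, as the MKB in the abstract suggests.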