Table Recognition in Spreadsheets via a Graph Representation

Problem Notes

  • Recognizing tables in spreadsheets via a graph representation

Abstract

  • Spreadsheet applications are very popular data management tools.

  • Their ease of use and abundant functionality equip novices and professionals alike with the means to generate, transform, analyze, and visualize data.

  • As a result, spreadsheets are a rich source of factual and structured information.

  • This accentuates the need to automatically understand and extract their contents.

  • In this paper, we present a novel approach for recognizing tables in spreadsheets.

  • Having inferred the layout role of the individual cells, we build layout regions.

  • We encode the spatial interrelations between these regions using a graph representation. Based on this, we propose Remove and Conquer (RAC), an algorithm for table recognition that implements a list of carefully curated rules.

  • An extensive experimental evaluation shows that our approach is viable. We achieve significant accuracy on a dataset of real spreadsheets from various domains.
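
The region and graph steps described in the abstract can be illustrated with a small Python sketch. The abstract does not spell out the actual construction, so everything below is an assumption made for illustration: the `Region` class, the functions `build_regions`, `relation`, and `build_graph`, the role labels "header" and "data", the merging heuristic, and the four adjacency relations are all hypothetical. This is not the paper's RAC algorithm, only one plausible way to turn labeled cells into regions and a graph of their spatial relations.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Region:
    """A rectangular block of cells sharing one layout role (bounds inclusive)."""
    role: str
    top: int
    left: int
    bottom: int
    right: int


def build_regions(cells):
    """Group labeled cells into rectangular regions.

    `cells` maps (row, col) -> role. Assumed heuristic: first form maximal
    horizontal runs of equal role per row, then merge runs that are stacked
    directly on top of each other with the same role and column span.
    """
    strips = []
    for row in sorted({r for r, _ in cells}):
        cols = sorted(c for r, c in cells if r == row)
        start = prev = cols[0]
        for c in cols[1:] + [None]:  # None flushes the last run
            if c is not None and c == prev + 1 and cells[(row, c)] == cells[(row, start)]:
                prev = c
                continue
            strips.append(Region(cells[(row, start)], row, start, row, prev))
            start = prev = c

    regions = []
    for s in strips:  # strips are ordered by row, then column
        for i, r in enumerate(regions):
            same_shape = (r.role, r.left, r.right) == (s.role, s.left, s.right)
            if same_shape and r.bottom + 1 == s.top:
                regions[i] = Region(r.role, r.top, r.left, s.bottom, r.right)
                break
        else:
            regions.append(s)
    return regions


def relation(a, b):
    """Return how region a sits relative to an adjacent region b, or None."""
    cols_overlap = a.left <= b.right and b.left <= a.right
    rows_overlap = a.top <= b.bottom and b.top <= a.bottom
    if cols_overlap and a.bottom + 1 == b.top:
        return "above"
    if cols_overlap and b.bottom + 1 == a.top:
        return "below"
    if rows_overlap and a.right + 1 == b.left:
        return "left_of"
    if rows_overlap and b.right + 1 == a.left:
        return "right_of"
    return None


def build_graph(regions):
    """Edges keyed by (region_a, region_b), labeled with their spatial relation."""
    edges = {}
    for a, b in combinations(regions, 2):
        rel = relation(a, b)
        if rel is not None:
            edges[(a, b)] = rel
    return edges


# A toy sheet: one header row over two data rows, three columns wide.
cells = {(0, c): "header" for c in range(3)}
cells.update({(r, c): "data" for r in (1, 2) for c in range(3)})

regions = build_regions(cells)
graph = build_graph(regions)
for (a, b), rel in graph.items():
    print(a.role, rel, b.role)  # prints "header above data"
```

A rule-based recognizer could then work on this graph, for example by treating a header region directly above a data region as a table candidate; the concrete rules used by RAC are not given in the abstract.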

Model Used

Experiments

Conclusion

Takeaways

References
