- Elvis Koci, Maik Thiele, Wolfgang Lehner, Oscar Romero: Table Recognition in Spreadsheets via a Graph Representation. DAS 2018: 139-144 [CCF B]
Title notes
- Recognizing tables in spreadsheets via a graph representation
Abstract
- Spreadsheet applications are very popular data management tools.
- Their ease of use and abundant functionality equip novices and professionals alike with the means to generate, transform, analyze, and visualize data.
- As a result, spreadsheets are a rich source of factual and structured information.
- This accentuates the need to automatically understand and extract their contents.
- In this paper, we present a novel approach for recognizing tables in spreadsheets.
- Having inferred the layout role of the individual cells, we build layout regions.
- We encode the spatial interrelations between these regions using a graph representation. Based on this, we propose Remove and Conquer (RAC), an algorithm for table recognition that implements a list of carefully curated rules.
- An extensive experimental evaluation shows that our approach is viable. We achieve significant accuracy on a dataset of real spreadsheets from various domains.
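The abstract says layout regions are built first and their spatial interrelations are then encoded as a graph, but it gives no construction details. The following is only a minimal sketch of what such an encoding might look like, not the authors' method. It assumes each region is an axis-aligned rectangle of cells with a role label. The names `Region`, `relation`, and `build_graph`, the four direction labels, and the sample sheet are all hypothetical. The rules that RAC applies on top of the graph are not described in the abstract and are not attempted here.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Region:
    """A rectangular layout region; bounds are inclusive, zero-based cell indices."""
    name: str
    role: str    # hypothetical role label, e.g. "header" or "data"
    top: int
    left: int
    bottom: int
    right: int


def overlaps(a_start, a_end, b_start, b_end):
    """True when two inclusive integer intervals share at least one index."""
    return a_start <= b_end and b_start <= a_end


def relation(a, b):
    """Return where b lies relative to a, or None if the two are not aligned."""
    if overlaps(a.left, a.right, b.left, b.right):    # regions share columns
        if b.top > a.bottom:
            return "below"
        if b.bottom < a.top:
            return "above"
    if overlaps(a.top, a.bottom, b.top, b.bottom):    # regions share rows
        if b.left > a.right:
            return "right"
        if b.right < a.left:
            return "left"
    return None


OPPOSITE = {"above": "below", "below": "above", "left": "right", "right": "left"}


def build_graph(regions):
    """Adjacency dict: graph[a][b] is the position of b as seen from a."""
    graph = {r.name: {} for r in regions}
    for a, b in combinations(regions, 2):
        rel = relation(a, b)
        if rel is not None:
            graph[a.name][b.name] = rel
            graph[b.name][a.name] = OPPOSITE[rel]
    return graph


# Hypothetical sheet: a header above its data block, plus a second header to the right.
regions = [
    Region("H1", "header", top=0, left=0, bottom=0, right=3),
    Region("D1", "data", top=1, left=0, bottom=5, right=3),
    Region("H2", "header", top=0, left=6, bottom=0, right=8),
]
graph = build_graph(regions)
```

This sketch links any two regions that share a row or column span, however far apart they are. A real encoding would probably restrict edges to nearest neighbors or attach distances, and that choice is left open here.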