Article

End-to-end table structure recognition and extraction in heterogeneous documents

Journal

APPLIED SOFT COMPUTING
Volume 123, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.asoc.2022.108942

Keywords

Object detection; Document structure recognition; Table detection; Table structure recognition; Table extraction


This paper presents a model based on deep neural networks that automatically detects tables and converts them into an editable or searchable format. By combining computer vision and machine learning, the model supports document digitization and data extraction for decision-making in fields such as healthcare and finance.
Automatically detecting and parsing tables into an indexable and searchable format is an important problem in document digitization, one that touches computer vision, machine learning, and optical character recognition. This paper presents a simple model, based on a deep neural network architecture that combines recent advances in computer vision and machine learning, that can detect a table and convert it into a format that can be edited or searched. The motivation for this work is to develop a sound method for extracting the vast body of knowledge held in physical documents so that it can be used to build data-driven tools that support decisions in fields such as healthcare and finance. The model uses a Yolo-based object detector, trained to maximize the Intersection over Union (IoU) of the detected table regions within the document image, and a novel OCR-based algorithm to parse each table detected in the document. Past works have all focused on documents and images containing level, even tables; this paper also presents our findings when the model is run on a set of skewed image datasets. Experiments on the Marmot and PubLayNet datasets show that the proposed method is highly accurate and can generalize across different table formats. At an IoU threshold of 50%, we achieve a mean Average Precision (mAP) of 98% and an average IoU of 88.81% on the PubLayNet dataset. With the same IoU threshold, we achieve an mAP of 95.07% and an average IoU of 75.57% on the Marmot dataset. (c) 2022 Elsevier B.V. All rights reserved.
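The results above are stated in terms of Intersection over Union (IoU), which serves both as the detector's training objective and as the matching criterion behind the mAP figures. The following is a minimal sketch of the metric, assuming axis-aligned boxes in (x1, y1, x2, y2) form; the box format and function name are illustrative assumptions, not taken from the paper.

def iou(box_a, box_b):
    """Compute the IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# At the paper's 50% threshold, a detected table region counts as correct
# when iou(predicted_box, ground_truth_box) >= 0.5.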
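The two-stage pipeline the abstract describes, a Yolo-based detector that localizes table regions followed by OCR-based parsing of each region, could be approximated as in the sketch below. The paper's own trained weights and parsing algorithm are not reproduced in this listing, so the ultralytics YOLO API, pytesseract, and both file names are stand-ins chosen for illustration.

from PIL import Image
import pytesseract
from ultralytics import YOLO

model = YOLO("table_detector.pt")       # hypothetical weights fine-tuned for tables
page = Image.open("document_page.png")  # hypothetical scanned document page

for result in model(page):
    for x1, y1, x2, y2 in result.boxes.xyxy.tolist():
        # Crop the detected table region out of the page image.
        crop = page.crop((int(x1), int(y1), int(x2), int(y2)))
        # The paper recovers cell structure from each region; plain OCR text
        # is used here as a simplified stand-in for that parsing step.
        print(pytesseract.image_to_string(crop))

Cropping before recognition keeps the OCR engine focused on table content, which is one practical argument for the two-stage design the abstract outlines.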

