Article

BINYAS: a complex document layout analysis system

Journal

MULTIMEDIA TOOLS AND APPLICATIONS
Volume 80, Issue 6, Pages 8471-8504

Publisher

SPRINGER
DOI: 10.1007/s11042-020-09832-3

Keywords

Document layout analysis; BINYAS; Text non-text separation; Non-text classification; Inverted text; RDCL

Document layout analysis (DLA) is essential for building a comprehensive document image processing system: it segments a document image into regions and identifies the class of each region. The proposed BINYAS system, which combines connected component and pixel analysis, is reported to outperform existing methods on four standard datasets.
Document layout analysis (DLA) is an essential prerequisite for developing a comprehensive document image processing and analysis system. The main purpose of DLA is to segment an input document image into its constituent, coherent regions and to identify their classes. In this paper, we propose a DLA system, named BINYAS, based on connected component (CC) and pixel analysis. We first identify the regions and then classify them as paragraph, separator, graphic, image, table, chart, inverted text, and so on. The proposed system is evaluated on four publicly available standard datasets, namely the ICDAR 2009, 2015, 2017, and 2019 page segmentation competition datasets, and its performance is compared with many contemporary methods, including some well-known software products and deep learning based methods. Experimental results show that our method performs significantly better than state-of-the-art methods in terms of the evaluation metrics considered by the research community in this domain.
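
The abstract does not spell out how BINYAS extracts its connected components, so the following is only a minimal sketch of the generic first step such CC-based systems usually share: labeling foreground pixels in a binarized page and recording each component's bounding box and pixel count. The function name `connected_components`, the 8-connectivity choice, and the toy `page` grid are assumptions made for illustration and are not taken from the paper.

```python
from collections import deque


def connected_components(image):
    """Find 8-connected groups of foreground (1) pixels in a binary grid.

    Returns one dict per component, in raster-scan order, holding its
    bounding box (top, left, bottom, right) and its pixel count.
    Illustrative only; not the BINYAS implementation.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right, area = r, c, r, c, 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append({"bbox": (top, left, bottom, right), "area": area})
    return components


# Toy binarized "page": two small blobs and one long horizontal stroke.
page = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]
components = connected_components(page)
for comp in components:
    print(comp)
```

In a pipeline of this kind, per-component statistics such as the bounding box and area would typically feed later stages (for example, separating text from non-text and grouping components into regions); how BINYAS does this is described in the paper itself, not here.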
