Article

Weakly supervised precise segmentation for historical document images

Journal

NEUROCOMPUTING
Volume 350, Pages 271-281

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2019.04.001

Keywords

Weakly supervised learning; Recognition-guided; Historical document images segmentation

Funding

  1. National Key Research and Development Program of China [2016YFB1001405]
  2. GD-NSF [2017A030312006]
  3. NSFC [61673182, 61771199]
  4. GDSTP [2017A010101027]

Over the course of history, precious cultural heritage has been left behind to tell ancient stories, especially in the form of written documents. In this paper, a weakly supervised segmentation system that uses recognition-guided information on the attention area is proposed for high-precision historical document segmentation under strict intersection-over-union (IoU) requirements. We formulate the character segmentation problem from a Bayesian decision theory perspective and progressively propose boundary box segmentation (BBS), recognition-guided BBS (Rg-BBS), and recognition-guided attention BBS (Rg-ABBS) to search for the segmentation path. Furthermore, a novel judgment gate mechanism is proposed to train a high-performance character recognizer in an incremental, weakly supervised manner. By incorporating both character recognition knowledge and line-level annotation, the proposed Rg-ABBS method is shown to substantially reduce time consumption while maintaining sufficiently high segmentation precision. Experiments show that the proposed Rg-ABBS system significantly outperforms traditional segmentation methods as well as deep-learning-based instance segmentation and detection methods under strict IoU requirements. © 2019 The Author(s). Published by Elsevier B.V.
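
To make the "strict IoU requirements" in the abstract concrete, the following minimal sketch computes the intersection-over-union of two axis-aligned character boxes and checks it against a threshold. It is not the authors' evaluation code. The box format `(x_min, y_min, x_max, y_max)`, the helper names `iou` and `is_match`, and the thresholds 0.5 and 0.9 are all assumptions chosen for illustration; the abstract does not state which thresholds were used.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, truth: Box, threshold: float) -> bool:
    """Hypothetical helper: a predicted box counts as correct if IoU >= threshold."""
    return iou(pred, truth) >= threshold


# A 10x10 character box predicted one pixel too far to the right.
truth = (0.0, 0.0, 10.0, 10.0)
pred = (1.0, 0.0, 11.0, 10.0)
loose = is_match(pred, truth, 0.5)   # illustrative loose threshold
strict = is_match(pred, truth, 0.9)  # illustrative strict threshold
```

In this example the overlap is 90 square pixels and the union is 110, so the IoU is 9/11 (about 0.82). The prediction passes the loose threshold but fails the strict one, which shows why small boundary errors matter when the IoU requirement is high.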
