4.6 Article

Dense and Tight Detection of Chinese Characters in Historical Documents: Datasets and a Recognition Guided Detector

Journal

IEEE ACCESS
Volume 6, Issue -, Pages 30174-30183

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2018.2840218

Keywords

Historical documents; character detection; recognition guided detector; data sets

Funding

  1. National Key Research and Development Program of China [2016YFB1001405]
  2. NSFC [61472144, 61673182, 61771199]
  3. GD-NSF Grant [2017A030312006]
  4. GDSTP Grant [2017A030312006, 2015B010101004]
  5. GZSTP Grant [201607010227]

Characters in historical documents are typically densely distributed and are difficult to localize and segment by directly applying classic proposal and regression based methods. In this paper, we propose a novel method called recognition guided detector (RGD) that achieves tight Chinese character detection in historical documents. The proposed RGD consists of two simultaneously trained convolutional neural networks: a recognition guided proposal network that provides context information of the text and a detection network that applies this information to localize each of the characters accurately. To train and test the proposed method, we established two new datasets with character-level annotations, comprising ground truth character bounding boxes and ground truth characters in each of the boxes. The data in our datasets are scanned images collected from nine different versions of Tripitaka in Han. Experimental results show that, guided by a text recognition network with a test accuracy of 97.25%, the detection network in our proposed method achieves a much higher F-score with fewer parameters under a highly constrained evaluation criterion of intersection over union (IoU) >= 0.7, compared with several state-of-the-art object detection and text detection methods. The datasets are publicly available at https://github.com/HCIILAB/TKH_MTH_Datasets_Release for non-commercial use.
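
The evaluation criterion mentioned in the abstract (F-score at IoU >= 0.7) can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the box format `(x_min, y_min, x_max, y_max)`, the function names `iou` and `f_score`, and the greedy one-to-one matching of predictions to ground truth boxes are all assumptions made for illustration, since the abstract does not specify the matching protocol.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f_score(preds: List[Box], gts: List[Box], thresh: float = 0.7) -> float:
    """F-score where a prediction counts as correct if it overlaps a
    not-yet-matched ground truth box with IoU >= thresh.

    Greedy one-to-one matching is an assumption of this sketch.
    """
    matched = set()  # indices of ground truth boxes already claimed
    true_pos = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, g in enumerate(gts):
            if j in matched:
                continue
            overlap = iou(p, g)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= thresh:
            matched.add(best_j)
            true_pos += 1
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(gts) if gts else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: two ground truth characters, three predictions.
gts = [(0, 0, 10, 10), (20, 0, 30, 10)]
preds = [(0, 0, 10, 9), (21, 0, 31, 10), (50, 50, 60, 60)]
score = f_score(preds, gts, thresh=0.7)
```

With a threshold as strict as 0.7, a box that is shifted by only a small fraction of a character width can fail to count, which is why tight localization matters for densely packed characters.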
