Article

SCUT-HCCDoc: A new benchmark dataset of handwritten Chinese text in unconstrained camera-captured documents

PATTERN RECOGNITION (2020)

Journal

PATTERN RECOGNITION

Volume 108, Issue -, Pages -

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.patcog.2020.107559

Keywords

Document analysis and recognition; Handwritten Chinese text recognition; Handwritten Chinese text detection; Benchmark dataset

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic

Funding

National Natural Science Foundation of China [61936003, 61872151]
Natural Science Foundation of Guangdong Province [2017A030312006, 2019A1515011045]
National Key Research and Development Program of China [2016YFB1001405]
Fundamental Research Funds for the Central Universities [x2dxD2190570, 2019MS023]

Abstract

In this paper, we introduce a large-scale dataset, called SCUT-HCCDoc, to address the challenging problems of detecting and recognizing handwritten Chinese text (HCT) in camera-captured documents. Despite extensive studies of optical character recognition (OCR) and offline handwriting recognition for document images, text detection and recognition in camera-captured documents remains an unsolved problem that merits extensive study. With recent advances in deep learning, researchers have proposed useful architectures for feature learning, detection, and recognition of scene text. However, the performance of deep learning methods depends heavily on the amount and diversity of training data. Previous OCR and offline HCT datasets were built under specific constraints, and most recent scene text datasets cover non-handwritten text. Hence, a comprehensive benchmark for handwritten scene text is lacking. This study focuses on scenes with handwritten Chinese text. We introduce the SCUT-HCCDoc database for HCT detection, recognition, and spotting. SCUT-HCCDoc contains 12,253 camera-captured document images with 116,629 text lines and 1,155,801 characters. The diversity of SCUT-HCCDoc can be described at three levels: (1) image-level diversity: variation in image appearance and geometry caused by camera-capture settings (such as perspective, background, and resolution) and by different applications (such as note-taking, test papers, and homework); (2) text-level diversity: variation in text line length, rotation, etc.; (3) character-level diversity: variation in character categories (up to 6109 classes, including additional English letters and digits), character size, individual writing style, etc.
Three kinds of baseline experiments were conducted: we used several popular text detection methods for text line detection, CTC-based and attention-based methods for text line recognition, and combined text detectors with a CTC-based recognizer to achieve end-to-end text spotting. The results indicate the diversity of SCUT-HCCDoc and the challenges of HCT understanding in document images. The dataset is available at https://github.com/HCIILAB/SCUT-HCCDoc_Dataset_Release. © 2020 Elsevier Ltd. All rights reserved.
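
The three totals quoted in the abstract imply some rough average densities, which can help when sizing models or batches. The following is a minimal sketch that derives them using only the image, text line, and character counts stated above. The function and variable names are illustrative assumptions and are not part of the dataset release.

```python
# Totals quoted in the abstract (images, text lines, characters).
NUM_IMAGES = 12_253
NUM_TEXT_LINES = 116_629
NUM_CHARACTERS = 1_155_801


def dataset_averages(images: int, lines: int, chars: int) -> dict[str, float]:
    """Return mean densities derived from the three dataset totals.

    Hypothetical helper for illustration only; it is not taken from
    the SCUT-HCCDoc release.
    """
    return {
        "lines_per_image": lines / images,
        "chars_per_line": chars / lines,
        "chars_per_image": chars / images,
    }


averages = dataset_averages(NUM_IMAGES, NUM_TEXT_LINES, NUM_CHARACTERS)
for name, value in averages.items():
    print(f"{name}: {value:.2f}")
```

Under these totals, an image holds about 9.5 text lines on average and a text line about 9.9 characters. These are means only; the abstract stresses that line length and character size vary widely across the dataset.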
