4.6 Article

An image database of handwritten Bangla words with automatic benchmarking facilities for character segmentation algorithms

Journal

NEURAL COMPUTING & APPLICATIONS
Volume 33, Issue 1, Pages 449-468

Publisher

SPRINGER LONDON LTD
DOI: 10.1007/s00521-020-04981-w

Keywords

Character segmentation; Handwritten word; Bangla script; Image database; Word recognition

Funding

  1. PURSE-II, Jadavpur University
  2. UPE-II, Jadavpur University
  3. DST, Govt. of India [EMR/2016/007213]


Recognition of unconstrained handwritten word images is a challenging research problem, especially when lexicon-free words are considered. The development of a comprehensive word recognition module requires a competent character segmentation technique. However, because standard word image databases with ground truth information are lacking, most character segmentation algorithms rely on self-made databases with manual evaluation. In this study, a comprehensive database of handwritten Bangla word images has been prepared to evaluate character segmentation algorithms, along with two types of ground truth images related to segmented character shapes. The benchmark result obtained on the developed database, an F-score of 0.9212, outperforms some state-of-the-art methods.
Recognition of unconstrained handwritten word images is an interesting research problem that becomes more challenging when lexicon-free words are considered. A prerequisite for developing a lexicon-free handwritten word recognition technique is the segmentation of a word image into its constituent characters. Therefore, a competent character segmentation technique is required to design a comprehensive word recognition module. However, a study of the literature reveals that there is no standard word image database with ground truth information. As a result, most character segmentation algorithms found in the literature rely on self-made databases with manual evaluation. To fill this gap, the present work prepares a comprehensive database of handwritten Bangla word images, intended primarily for evaluating character segmentation algorithms. The work also provides two types of ground truth images related to the segmented character shapes of the word images. In addition, an evaluation tool is developed for assessing the performance of any character segmentation algorithm on the developed benchmark database. The benchmark result reported here is an F-score of 0.9212, which outperforms some state-of-the-art methods.
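
As a rough illustration of how an F-score such as the one quoted above can be computed for a segmentation result, the following Python sketch compares predicted segmentation columns with ground-truth columns. The abstract does not describe how the evaluation tool matches segments, so the column-based matching, the `tolerance` parameter, and the function names `count_matches` and `f_score` are all assumptions made for this example; it is not the authors' tool.

```python
from typing import Iterable, Tuple


def count_matches(predicted_cuts: Iterable[int],
                  truth_cuts: Iterable[int],
                  tolerance: int = 2) -> Tuple[int, int, int]:
    """Greedy one-to-one matching of predicted cut columns to ground truth.

    Assumption: a predicted cut is correct if it lies within `tolerance`
    pixels of a ground-truth cut that has not been matched yet.
    Returns (true_positives, false_positives, false_negatives).
    """
    predicted = sorted(predicted_cuts)
    unmatched = sorted(truth_cuts)
    tp = 0
    for cut in predicted:
        hit = next((t for t in unmatched if abs(t - cut) <= tolerance), None)
        if hit is not None:
            unmatched.remove(hit)  # each ground-truth cut is used at most once
            tp += 1
    fp = len(predicted) - tp
    fn = len(unmatched)
    return tp, fp, fn


def f_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when either is undefined."""
    if tp + fp == 0 or tp + fn == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical cut columns for a single word image.
    tp, fp, fn = count_matches([10, 25, 41, 60], [11, 24, 50], tolerance=2)
    print(tp, fp, fn, round(f_score(tp, fp, fn), 4))
```

A benchmark-wide score would typically accumulate the three counts over all word images before calling `f_score`; whether the paper aggregates this way is not stated in the abstract.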
