Article

An image database of handwritten Bangla words with automatic benchmarking facilities for character segmentation algorithms

NEURAL COMPUTING & APPLICATIONS (2021)

Journal

NEURAL COMPUTING & APPLICATIONS

Volume 33, Issue 1, Pages 449-468

Publisher

SPRINGER LONDON LTD

DOI: 10.1007/s00521-020-04981-w

Keywords

Character segmentation; Handwritten word; Bangla script; Image database; Word recognition

Funding

PURSE-II, Jadavpur University
UPE-II, Jadavpur University
DST, Govt. of India [EMR/2016/007213]

Automated Summary

Recognition of unconstrained handwritten word images is a challenging research problem, especially when lexicon-free words are considered. A comprehensive word recognition module requires a competent character segmentation technique. However, because no standard word image database with ground truth information exists, most character segmentation algorithms rely on self-made databases with manual evaluation. This study prepares a comprehensive database of handwritten Bangla word images for evaluating character segmentation algorithms, together with two types of ground truth images related to segmented character shapes and an evaluation tool. The benchmark result on the developed database is an F-score of 0.9212, which outperforms some state-of-the-art methods.

Abstract

Recognition of unconstrained handwritten word images is an interesting research problem that becomes more challenging when lexicon-free words are considered. A prerequisite for developing a lexicon-free handwritten word recognition technique is the segmentation of a word image into its constituent characters. Therefore, a competent character segmentation technique is required to design a comprehensive word recognition module. However, the literature study reveals that there is no standard word image database with ground truth information. As a result, most character segmentation algorithms found in the literature rely on self-made databases with manual evaluation. To fill this gap, the present work prepares a comprehensive database of handwritten Bangla word images, primarily for evaluating character segmentation algorithms. Additionally, the present work provides two types of ground truth images related to segmented character shapes of the word images. Besides, an evaluation tool is developed for assessing the performance of any character segmentation algorithm on the developed benchmark database. The benchmark result, as found here, is 0.9212 (F-score), which outperforms some state-of-the-art methods.
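The abstract reports the benchmark as an F-score but does not describe how the evaluation tool computes it. As a rough illustration only, the sketch below shows one common way such a score could be obtained for a single word image. It assumes, purely for simplicity, that a segmentation is a list of cut-column positions and that a predicted cut counts as correct when it lies within a small pixel tolerance of an unused ground-truth cut. The names `match_cuts`, `f_score`, and `tolerance` are hypothetical and not taken from the paper, whose ground truth consists of images of segmented character shapes rather than cut columns.

```python
def f_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when it is undefined."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if true_positives == 0 or predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


def match_cuts(predicted_cuts, truth_cuts, tolerance=2):
    """Greedily match predicted cut columns to ground-truth cut columns.

    Assumption (not from the paper): each ground-truth cut can be matched
    at most once, and a match requires |predicted - truth| <= tolerance.
    Returns (true_positives, false_positives, false_negatives).
    """
    unmatched = sorted(truth_cuts)
    true_positives = 0
    for cut in sorted(predicted_cuts):
        # Closest ground-truth cut that has not been matched yet, if any.
        best = min(unmatched, key=lambda t: abs(t - cut), default=None)
        if best is not None and abs(best - cut) <= tolerance:
            unmatched.remove(best)
            true_positives += 1
    false_positives = len(predicted_cuts) - true_positives
    false_negatives = len(unmatched)
    return true_positives, false_positives, false_negatives


if __name__ == "__main__":
    # Made-up cut positions for illustration; not data from the database.
    tp, fp, fn = match_cuts([10, 25, 40], [11, 24, 60])
    print(tp, fp, fn, round(f_score(tp, fp, fn), 4))
```

A database-level score could then be obtained by summing the three counts over all word images before calling `f_score`; whether the published tool aggregates this way is not stated in the abstract.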
