Journal
PATTERN RECOGNITION
Volume 35, Issue 4, Pages 875-893
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/S0031-3203(01)00081-4
Keywords
Devanagari script; character/text recognition; prototype construction; character fusion; character fragmentation; character segmentation/decomposition
Devanagari script is a two-dimensional composition of symbols. Treating each composite character as a separate atomic symbol is highly cumbersome because the number of such combinations is very large. This paper presents a two-pass algorithm for the segmentation and decomposition of Devanagari composite characters/symbols into their constituent symbols. The proposed algorithm makes extensive use of the structural properties of the script. In the first pass, words are segmented into easily separable characters/composite characters. Statistical information about the height and width of each separated box is used to hypothesize whether a character box is composite. In the second pass, the hypothesized composite characters are further segmented. A recognition rate of 85 percent has been achieved on the segmented conjuncts. The algorithm is designed to segment a pair of touching characters. © 2002 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
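The abstract says that height and width statistics of the first-pass boxes are used to hypothesize which boxes are composite, but it does not give the rule itself. The following is only a minimal sketch of that idea, not the paper's method: it assumes that a box is suspect when its width or height clearly exceeds the median of its neighbours. The function name `flag_composite_boxes` and the parameters `width_factor` and `height_factor` are hypothetical and chosen purely for illustration.

```python
from statistics import median


def flag_composite_boxes(boxes, width_factor=1.5, height_factor=1.3):
    """Flag boxes that may hold a composite (touching/conjunct) character.

    boxes: list of (width, height) tuples from a first segmentation pass.
    The median-based thresholds are an illustrative assumption; the
    abstract does not state which statistics or cut-offs are used.
    """
    if not boxes:
        return []
    med_w = median(w for w, _ in boxes)
    med_h = median(h for _, h in boxes)
    # A box much wider or taller than typical is hypothesized as composite
    # and would be passed on to a second, finer segmentation pass.
    return [
        w > width_factor * med_w or h > height_factor * med_h
        for w, h in boxes
    ]


# Five boxes from one word; the fourth is about twice as wide as the rest.
word_boxes = [(10, 20), (11, 20), (10, 21), (22, 20), (9, 19)]
flags = flag_composite_boxes(word_boxes)
```

Here only the fourth box is flagged, so only that box would be handed to the second pass. Using the median rather than the mean keeps a single very wide box from inflating the threshold.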