Article

Efficient CRNN: Towards end-to-end low resource Urdu text recognition using depthwise separable convolutions and gated recurrent units

INFORMATION PROCESSING & MANAGEMENT (2024)

Journal

INFORMATION PROCESSING & MANAGEMENT

Volume 61, Issue 1, Pages -

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.ipm.2023.103544

Keywords

Artificial neural networks; Deep learning; Corpus generation; Image processing; Optical character recognition; Text recognition; Urdu OCR

Categories

Computer Science, Information Systems; Information Science & Library Science

Summary

In this study, a novel technique called Efficient CRNN is proposed for printed text recognition in the Urdu language. The technique is computationally more efficient than existing techniques and achieves better performance. A multi-font corpus of printed Urdu text lines is also presented to train and evaluate the proposed technique. Efficient CRNN achieves a minimum character error rate of 0.91% and outperforms both existing, more complex techniques and a Vision Transformer-based network.

Abstract

In this study, a novel technique is proposed to recognize printed text in images for Urdu, a low-resource language with a scarcity of benchmark datasets. The proposed technique, called Efficient CRNN, uses depthwise separable convolutional layers and bidirectional gated recurrent unit layers, followed by a connectionist temporal classification loss. It is computationally more efficient than existing text recognition techniques, requiring fewer parameters and computations. A multi-font corpus of printed Urdu text lines is also presented, consisting of 245,000 text line images rendered using 7 different fonts. The corpus is called MMU-Extension-22 and is used to train and evaluate existing state-of-the-art end-to-end text recognition techniques. Efficient CRNN is also evaluated on the proposed corpus: it is trained on 196,000 text line images and tested on the remaining 49,000 images. Efficient CRNN achieved minimum character and word error rates of 0.91% and 1.49%, respectively, for Urdu text line recognition under different settings, outperforming existing, computationally more complex techniques. The simple nature of the proposed technique makes it not only more efficient but also more robust for Urdu text line recognition: it achieves a 2.23% lower character error rate than the best-performing existing recurrent neural network-based technique, corresponding to a 71% decrease in error. The proposed technique also outperforms a Vision Transformer-based network, achieving a 0.79% lower character error rate, which accounts for a 41% decrease in error. In addition, Efficient CRNN has 49.16% fewer parameters than the baseline Vision Transformer technique.
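The character and word error rates quoted in the abstract are conventionally computed from the edit distance between the recognized text and the ground truth. The following is a minimal sketch of that conventional definition, not the authors' evaluation code. The function names `edit_distance` and `error_rate` are hypothetical, and the sketch assumes that spaces count as characters and that words are split on whitespace, neither of which is stated in the abstract.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, unit="char"):
    """Error rate in percent over paired reference and hypothesis text lines.

    unit="char" gives a character error rate; unit="word" gives a word error
    rate (assumption: words are whitespace-separated tokens).
    """
    total_edits = 0
    total_len = 0
    for ref, hyp in zip(refs, hyps):
        if unit == "word":
            ref, hyp = ref.split(), hyp.split()
        total_edits += edit_distance(ref, hyp)
        total_len += len(ref)
    if total_len == 0:
        raise ValueError("reference text is empty")
    return 100.0 * total_edits / total_len


if __name__ == "__main__":
    # Illustrative strings only; these are not samples from MMU-Extension-22.
    references = ["اردو زبان"]
    hypotheses = ["اردو زباں"]  # one wrong character in the second word
    print(error_rate(references, hypotheses, unit="char"))
    print(error_rate(references, hypotheses, unit="word"))
```

Edits are summed over all lines before dividing by the total reference length, so long lines weigh more than short ones. Averaging per-line rates instead is another common choice and would give different numbers.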
