Article

An extended attention mechanism for scene text recognition

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 203, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2022.117377

Keywords

Scene text recognition; Attention on attention; Deep neural network; Encoder-decoder framework

Funding

  1. National Natural Science Foundation of China [61872129]
  2. Natural Science Foundation of Hunan Province, China [2019JJ40024]


This paper proposes an extended attention-based framework for scene text recognition tasks. By introducing the Attention on Attention (AoA) mechanism, the relevance between attention results and queries can be determined, improving the accuracy of recognition. Experimental results show that the proposed method outperforms other benchmarks on multiple datasets.
Scene text recognition (STR) refers to obtaining text information from natural text images. The task is more challenging than optical character recognition (OCR) because of the variability of scenes. The attention mechanism, which assigns different weights to each feature vector at each time step, guides the decoding process of text recognition. However, when the given query and the key/value are unrelated, the generated attention result contains irrelevant information, which can lead the model to produce wrong results. In this paper, we propose an extended attention-based framework for STR tasks. In particular, we integrate an extended attention mechanism named Attention on Attention (AoA), which is able to determine the relevance between attention results and queries, into both the encoder and the decoder of a common text recognition framework. Using two separate linear functions, the AoA module generates an information vector and an attention gate from the attention result and the current context. AoA then applies another attention step via element-wise multiplication of the two to obtain the final attended information. Our method is compared with seven benchmarks over eight datasets. Experimental results show that our method outperforms all seven benchmarks, exceeding the worst and best of them by 6.7% and 1.4% on average, respectively.
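The gating step the abstract describes can be sketched as follows. This is a minimal NumPy illustration of the AoA formulation, not the authors' implementation: the weight matrices `W_i`, `W_g` stand in for learned parameters of the two linear functions, and the dimension `d` is arbitrary.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
d = 8  # illustrative feature dimension

# Hypothetical stand-ins for the two learned linear functions,
# each taking the concatenated [attention result; query] as input.
W_i = rng.standard_normal((2 * d, d)) * 0.1  # information-vector projection
b_i = np.zeros(d)
W_g = rng.standard_normal((2 * d, d)) * 0.1  # attention-gate projection
b_g = np.zeros(d)

def attention_on_attention(attn_result, query):
    """Refine a conventional attention result with the AoA gate.

    attn_result: attended vector from standard attention, shape (d,)
    query:       current query/context vector, shape (d,)
    """
    x = np.concatenate([attn_result, query])
    info = x @ W_i + b_i           # information vector (linear function 1)
    gate = sigmoid(x @ W_g + b_g)  # attention gate in (0, 1) (linear function 2)
    return gate * info             # element-wise product = final attended info

v_hat = rng.standard_normal(d)  # stand-in attention output
q = rng.standard_normal(d)      # stand-in query
out = attention_on_attention(v_hat, q)
print(out.shape)  # (8,)
```

When query and key/value are unrelated, the sigmoid gate can drive the corresponding components of the information vector toward zero, which is how AoA suppresses irrelevant attention results before they reach the decoder.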

