Journal
PATTERN RECOGNITION
Volume: 117, Issue: -, Pages: -
Publisher
ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2021.107980
Keywords
Scene text recognition; Transformer; Non-local network; Memory-cached mechanism
MASTER, a self-attention-based scene text recognizer, addresses the attention-drift issue by incorporating self-attention, improving both training efficiency and inference speed.
Attention-based scene text recognizers have achieved great success by leveraging a more compact intermediate representation to learn 1D or 2D attention with an RNN-based encoder-decoder architecture. However, such methods suffer from the attention-drift problem: high similarity among encoded features leads to attention confusion under the RNN-based local attention mechanism. Moreover, RNN-based methods are inefficient due to poor parallelization. To overcome these problems, we propose MASTER, a self-attention-based scene text recognizer that (1) not only encodes the input-output attention but also learns self-attention, which encodes feature-feature and target-target relationships inside the encoder and decoder, (2) learns an intermediate representation that is more powerful and more robust to spatial distortion, and (3) offers high training efficiency through parallelization and high-speed inference through an efficient memory-cache mechanism. Extensive experiments on various benchmarks demonstrate the superior performance of MASTER on both regular and irregular scene text. PyTorch code can be found at https://github.com/wenwenyu/MASTER-pytorch, and TensorFlow code can be found at https://github.com/jiangxiluning/MASTER-TF. (c) 2021 Elsevier Ltd. All rights reserved.
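The memory-cache mechanism mentioned in the abstract is, in essence, key/value caching for autoregressive transformer decoding: keys and values already computed for previously decoded positions are stored and reused rather than recomputed at every step. Below is a minimal single-head PyTorch sketch of this general idea; the module name `CachedSelfAttention`, the single-head simplification, and the dictionary-based cache are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F


class CachedSelfAttention(torch.nn.Module):
    """Single-head self-attention with a key/value memory cache.

    Illustrative sketch only: MASTER's decoder uses multi-head
    attention; this simplified module shows the caching idea that
    avoids recomputing K/V for tokens decoded at earlier steps.
    """

    def __init__(self, d_model: int):
        super().__init__()
        self.q_proj = torch.nn.Linear(d_model, d_model)
        self.k_proj = torch.nn.Linear(d_model, d_model)
        self.v_proj = torch.nn.Linear(d_model, d_model)
        self.scale = d_model ** -0.5

    def forward(self, x_t, cache=None):
        # x_t: (batch, 1, d_model) -- embedding of the newest token only.
        k_t, v_t = self.k_proj(x_t), self.v_proj(x_t)
        if cache is not None:
            # Prepend cached keys/values so the new query attends
            # over every position decoded so far.
            k_t = torch.cat([cache["k"], k_t], dim=1)
            v_t = torch.cat([cache["v"], v_t], dim=1)
        new_cache = {"k": k_t, "v": v_t}
        q_t = self.q_proj(x_t)
        # (batch, 1, T) attention weights over all cached positions.
        attn = F.softmax(q_t @ k_t.transpose(1, 2) * self.scale, dim=-1)
        return attn @ v_t, new_cache
```

At inference time one would call this module once per decoded token, threading `new_cache` back in as `cache`, so each step projects only the newest token while still attending over the full decoded prefix; this is what makes cached decoding faster than recomputing attention from scratch at every step.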