Proceedings Paper

Text Recognition - Real World Data and Where to Find Them

2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) (2021)

Volume -, Issue -, Pages 4489-4496

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/ICPR48806.2021.9412868

Funding

Czech Technical University [SGS20/171/OHK3/3T/13]
MEYS VVV project [CZ.02.1.01/0.0/0.0/16_019/0000765]
Spanish Research project [TIN2017-89779-P]
CERCA Programme/Generalitat de Catalunya

Automated Summary

The proposed method uses weakly annotated images to improve text extraction pipelines. It matches the imprecise transcriptions produced by an end-to-end recognition system against the weak annotations to generate nearly error-free, localised instances of scene text for training, which consistently improves the accuracy of a state-of-the-art recognition model.

Abstract

We present a method for exploiting weakly annotated images to improve text extraction pipelines. The approach uses an arbitrary end-to-end text recognition system to obtain text region proposals and their possibly erroneous transcriptions. The method matches these imprecise transcriptions to the weak annotations and then performs an edit-distance-guided neighbourhood search. It produces nearly error-free, localised instances of scene text, which we treat as pseudo ground truth (PGT). The method is applied to two weakly annotated datasets. Training with the extracted PGT consistently improves the accuracy of a state-of-the-art recognition model, by 3.7% on average across different benchmark datasets (image domains) and by 24.5% on one of the weakly annotated datasets.
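The two steps named in the abstract, matching an imprecise transcription to a weak annotation and an edit-distance-guided neighbourhood search, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the box format, the `max_dist` threshold, the `step` size, the greedy search strategy, and the `recognise` callable (a stand-in for any recognition model) are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height); hypothetical format


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def match_to_annotation(
    transcription: str, annotations: Iterable[str], max_dist: int = 1
) -> Optional[Tuple[str, int]]:
    """Return the closest weak annotation within max_dist edits, or None.

    Case-insensitive comparison and the threshold are assumptions.
    """
    best: Optional[Tuple[str, int]] = None
    for word in annotations:
        d = edit_distance(transcription.lower(), word.lower())
        if d <= max_dist and (best is None or d < best[1]):
            best = (word, d)
    return best


def neighbours(box: Box, step: int) -> Iterator[Box]:
    """Yield boxes shifted or resized by one step in each direction."""
    x, y, w, h = box
    offsets = [
        (step, 0, 0, 0), (-step, 0, 0, 0), (0, step, 0, 0), (0, -step, 0, 0),
        (0, 0, step, 0), (0, 0, -step, 0), (0, 0, 0, step), (0, 0, 0, -step),
    ]
    for dx, dy, dw, dh in offsets:
        if w + dw > 0 and h + dh > 0:  # skip degenerate boxes
            yield (x + dx, y + dy, w + dw, h + dh)


def refine_box(
    box: Box,
    target: str,
    recognise: Callable[[Box], str],
    step: int = 2,
    max_iters: int = 10,
) -> Tuple[Box, int]:
    """Greedy local search: move to a neighbouring box whenever its
    transcription is closer (in edit distance) to the matched annotation."""
    best_box, best_d = box, edit_distance(recognise(box), target)
    for _ in range(max_iters):
        if best_d == 0:
            break
        improved = False
        for cand in neighbours(best_box, step):
            d = edit_distance(recognise(cand), target)
            if d < best_d:
                best_box, best_d, improved = cand, d, True
        if not improved:
            break
    return best_box, best_d
```

In this sketch, a region whose refined transcription reaches edit distance 0 to its matched annotation would be kept as a pseudo ground truth sample; whether the original method accepts only exact matches is not stated in the abstract.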
