Article

Transformer-based approach for joint handwriting and named entity recognition in historical document

Journal

PATTERN RECOGNITION LETTERS
Volume 155, Pages 128-134

Publisher

ELSEVIER
DOI: 10.1016/j.patrec.2021.11.010

Keywords

Named-entity-recognition; Text block recognition; Transformer; IEHHR competition

This paper proposes an end-to-end transformer-based approach that jointly performs text transcription and named entity recognition in handwritten documents. The approach operates at the paragraph level, which avoids unrecoverable early errors from line segmentation and improves prediction accuracy. Different training scenarios are explored, and a two-stage learning strategy is shown to be effective.
The extraction of relevant information carried by named entities in handwritten documents is still a challenging task. Unlike traditional information extraction approaches, which usually treat text transcription and named entity recognition as separate, subsequent tasks, we propose in this paper an end-to-end transformer-based approach to jointly perform these two tasks. The proposed approach operates at the paragraph level, which brings two main benefits. First, it allows the model to avoid unrecoverable early errors due to line segmentation. Second, it allows the model to exploit larger bi-dimensional context information to identify the semantic categories, reaching a higher final prediction accuracy. We also explore different training scenarios to show their effect on performance, and we demonstrate that a two-stage learning strategy can make the model reach a higher final prediction accuracy. As far as we know, this work presents the first approach that adopts transformer networks for named entity recognition in handwritten documents. We achieve new state-of-the-art performance in the ICDAR 2017 Information Extraction competition using the Esposalles database, for the complete task, even though the proposed technique does not use any dictionaries, language modeling, or post-processing. © 2021 Elsevier B.V. All rights reserved.
