Proceedings Paper

One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/WACV51458.2022.00262

Funding

  1. Swedish Research Council [2018-06074]
  2. Spanish project [RTI2018-095645B-C21]
  3. CERCA Program/Generalitat de Catalunya
  4. AGAUR [2019PROD00090]
  5. UAB [B18P0073]
  6. [PID2020-116298GB-I00]

This paper addresses the challenge of low-resource Handwritten Text Recognition (HTR) by proposing a data generation technique based on Bayesian Program Learning (BPL). Unlike traditional methods, which require a large number of annotated images, our method can generate human-like handwriting using only one sample of each symbol in the alphabet. Synthetic lines are then created to train state-of-the-art HTR architectures in a segmentation-free fashion. Quantitative and qualitative analyses confirm the effectiveness of the proposed method.
Low-resource Handwritten Text Recognition (HTR) is a hard problem because annotated data are scarce and linguistic information (dictionaries and language models) is very limited. This is the case, for example, with historical ciphered manuscripts, which are usually written with invented alphabets to hide the message contents. In this paper we address the problem through a data generation technique based on Bayesian Program Learning (BPL). In contrast to traditional generation approaches, which require a huge number of annotated images, our method is able to generate human-like handwriting using only one sample of each symbol in the alphabet. After generating symbols, we create synthetic lines to train state-of-the-art HTR architectures in a segmentation-free fashion. Quantitative and qualitative analyses confirm the effectiveness of the proposed method.
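
The line-composition step described above can be illustrated with a minimal sketch. It is not the paper's BPL model: the `perturb` function is a simple random-shift stand-in for the generative step, and the names `EXEMPLARS`, `perturb`, and `synth_line`, as well as the tiny binary glyphs, are hypothetical and chosen only for illustration. The sketch shows how one exemplar per symbol can yield varied line images paired with a line-level transcription, with no per-symbol segmentation labels.

```python
import random

# Hypothetical one-shot exemplars: a single tiny binary glyph per symbol (1 = ink).
# Assumption: all glyphs share the same height.
EXEMPLARS = {
    "a": [[0, 1, 0],
          [1, 0, 1],
          [1, 1, 1]],
    "b": [[1, 0, 0],
          [1, 1, 0],
          [1, 1, 1]],
}


def perturb(glyph, rng, max_shift=1):
    """Stand-in for the generative model (NOT BPL): shift each row
    horizontally by a small random offset, padding with background."""
    out = []
    for row in glyph:
        shift = rng.randint(-max_shift, max_shift)
        # Every output row has width len(row) + 2 * max_shift.
        out.append([0] * (max_shift + shift) + row + [0] * (max_shift - shift))
    return out


def synth_line(text, exemplars, rng, max_gap=2):
    """Compose a synthetic line image for `text` and return (image, label).
    Only the line-level transcription is kept as the training target."""
    height = len(next(iter(exemplars.values())))
    line = [[] for _ in range(height)]
    for ch in text:
        glyph = perturb(exemplars[ch], rng)
        gap = rng.randint(0, max_gap)  # random spacing between symbols
        for y in range(height):
            line[y].extend(glyph[y] + [0] * gap)
    return line, text


if __name__ == "__main__":
    image, label = synth_line("abba", EXEMPLARS, random.Random(0))
    print(label)
    for row in image:
        print("".join("#" if px else "." for px in row))
```

Each call with a different random seed produces a different rendering of the same transcription, which is the property a synthetic training set relies on; a real system would replace `perturb` with samples drawn from the learned generative model.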
