3.8 Proceedings Paper

Document Collection Visual Question Answering

期刊

出版社

SPRINGER INTERNATIONAL PUBLISHING AG
DOI: 10.1007/978-3-030-86331-9_50

关键词

Document collection; Visual Question Answering

资金

  1. UAB PIF scholarship [B18P0070, 2017-SGR-1783]
  2. University Department of the Catalan Government

向作者/读者索取更多资源

Current methods in Document Understanding focus on processing individual documents, while documents are typically organized in collections which provide valuable context for interpretation. To address this issue, DocCVQA introduces a new dataset and task where questions are posed over a whole collection of document images, aiming to provide answers to questions and retrieve the documents containing relevant information. Along with the dataset, a new evaluation metric and baselines are proposed to gain further insights into this new dataset and task.
Current tasks and methods in Document Understanding aims to process documents as single elements. However, documents are usually organized in collections (historical records, purchase invoices), that provide context useful for their interpretation. To address this problem, we introduce Document Collection Visual Question Answering (DocCVQA) a new dataset and related task, where questions are posed over a whole collection of document images and the goal is not only to provide the answer to the given question, but also to retrieve the set of documents that contain the information needed to infer the answer. Along with the dataset we propose a new evaluation metric and baselines which provide further insights to the new dataset and task.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

3.8
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据