Journal
DOCUMENT ANALYSIS AND RECOGNITION - ICDAR 2021, PT II
Volume 12822, Issue -, Pages 778-792
Publisher
SPRINGER INTERNATIONAL PUBLISHING AG
DOI: 10.1007/978-3-030-86331-9_50
Keywords
Document collection; Visual Question Answering
Funding
- UAB PIF scholarship [B18P0070, 2017-SGR-1783]
- University Department of the Catalan Government
Current methods in Document Understanding focus on processing individual documents, while documents are typically organized in collections which provide valuable context for interpretation. To address this issue, DocCVQA introduces a new dataset and task where questions are posed over a whole collection of document images, aiming to provide answers to questions and retrieve the documents containing relevant information. Along with the dataset, a new evaluation metric and baselines are proposed to gain further insights into this new dataset and task.
Current tasks and methods in Document Understanding aim to process documents as single elements. However, documents are usually organized in collections (historical records, purchase invoices) that provide context useful for their interpretation. To address this problem, we introduce Document Collection Visual Question Answering (DocCVQA), a new dataset and related task in which questions are posed over a whole collection of document images. The goal is not only to answer the given question, but also to retrieve the set of documents that contain the information needed to infer the answer. Along with the dataset, we propose a new evaluation metric and baselines that provide further insight into the new dataset and task.