☆ 4.5 Article

QBSUM: A large-scale query-based document summarization dataset from real-world applications

COMPUTER SPEECH AND LANGUAGE (2021)

期刊

COMPUTER SPEECH AND LANGUAGE

卷 66, 期 -, 页码 -

出版社

ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD

DOI: 10.1016/j.csl.2020.101166

关键词

Query-based summarization; Natural language generation; Information retrieval

类别

Computer Science, Artificial Intelligence

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

Query-based document summarization is important for extracting relevant information from documents based on search queries. The lack of publicly available Chinese datasets in this field led to the creation of QBSUM, a high-quality large-scale dataset with proposed solutions to improve performance. This dataset aims to advance research in this area for future developments and applications.

Query-based document summarization aims to extract or generate a summary of a document which directly answers or is relevant to the search query. It is an important technique that can be beneficial to a variety of applications such as search engines, document-level machine reading comprehension, and chatbots. Currently, datasets designed for query-based summarization are short in numbers and existing datasets are also limited in both scale and quality. Moreover, to the best of our knowledge, there is no publicly available dataset for Chinese query-based document summarization. In this paper, we present QBSUM, a high-quality large-scale dataset consisting of 49,000+ data samples for the task of Chinese query-based document summarization. We also propose multiple unsupervised and supervised solutions to the task and demonstrate their high-speed inference and superior performance via both offline experiments and online A/B tests. The QBSUM dataset is released in order to facilitate future advancement of this research field. (C) 2020 Elsevier Ltd. All rights reserved.

QBSUM: A large-scale query-based document summarization dataset from real-world applications

期刊

COMPUTER SPEECH AND LANGUAGE

出版社

ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

QBSUM: A large-scale query-based document summarization dataset from real-world applications

期刊

COMPUTER SPEECH AND LANGUAGE

出版社

ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文