期刊
INFORMATION PROCESSING & MANAGEMENT
卷 57, 期 6, 页码 -出版社
ELSEVIER SCI LTD
DOI: 10.1016/j.ipm.2020.102264
关键词
Automatic text summarization; Extractive text summarization; Graph representation model; Single-document summarization; EdgeSumm
Searching the Internet for a certain topic can become a daunting task because users cannot read and comprehend all the resulting texts. Automatic Text summarization (ATS) in this case is clearly beneficial because manual summarization is expensive and time-consuming. To enhance ATS for single documents, this paper proposes a novel extractive graph-based framework EdgeSumm that relies on four proposed algorithms. The first algorithm constructs a new text graph model representation from the input document. The second and third algorithms search the constructed text graph for sentences to be included in the candidate summary. When the resulting candidate summary still exceeds a user-required limit, the fourth algorithm is used to select the most important sentences. EdgeSumm combines a set of extractive ATS methods (namely graph-based, statistical-based, semantic-based, and centrality-based methods) to benefit from their advantages and overcome their individual drawbacks. EdgeSumm is general for any document genre (not limited to a specific domain) and unsupervised so it does not require any training data. The standard datasets DUC2001 and DUC2002 are used to evaluate EdgeSumm using the widely used automatic evaluation tool: Recall-Oriented Understudy for Gisting Evaluation (ROUGE). EdgeSumm gets the highest ROUGE scores on DUC2001. For DUC2002, the evaluation results show that the proposed framework outperforms the state-of-the-art ATS systems by achieving improvements of 1.2% and 4.7% over the highest scores in the literature for the metrics of ROUGE-1 and ROUGE-L respectively. In addition, EdgeSumm achieves very competitive results for the metrics of ROUGE-2 and ROUGE-SU4.
作者
我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。
推荐
暂无数据