Article

COSUM: Text summarization based on clustering and optimization

Journal

EXPERT SYSTEMS
Volume 36, Issue 1, Pages -

Publisher

WILEY
DOI: 10.1111/exsy.12340

Keywords

adaptive differential evolution algorithm; content coverage; harmonic mean; information diversity; k-means; optimization model; sentence clustering; text summarization

Text summarization is the process of extracting salient information from a source text and presenting it to the user in a condensed form while preserving its main content. The most difficult problems in text summarization are providing wide topic coverage and diversity in a summary. Research based on clustering, optimization, and evolutionary algorithms for text summarization has recently shown good results, making this a promising area. This paper proposes COSUM, a two-stage sentence selection model for text summarization based on clustering and optimization techniques. In the first stage, to discover all topics in a text, the sentence set is clustered using the k-means method. In the second stage, an optimization model is proposed for selecting salient sentences from the clusters. This model optimizes an objective function that is expressed as a harmonic mean of the objective functions enforcing the coverage and diversity of the selected sentences in the summary. To keep the summary readable, the model also controls the length of the sentences selected for the candidate summary. To solve the optimization problem, an adaptive differential evolution algorithm with a novel mutation strategy is developed. COSUM was compared with 14 state-of-the-art methods: DPSO-EDASum; LexRank; CollabSum; UnifiedRank; 0-1 non-linear; query, cluster, summarize; support vector machine; fuzzy evolutionary optimization model; conditional random fields; MA-SingleDocSum; NetSum; manifold ranking; ESDS-GHS-GLO; and differential evolution, using the ROUGE toolkit on the DUC2001 and DUC2002 data sets. Experimental results demonstrated that COSUM outperforms these state-of-the-art methods in terms of the ROUGE-1 and ROUGE-2 measures.
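
The abstract does not give the exact coverage and diversity formulas, so the following is only a minimal sketch of how a harmonic-mean objective of this kind might be scored for a candidate set of sentences. The functions `coverage` and `diversity`, the bag-of-words cosine similarity, and the unweighted form 2xy / (x + y) are all illustrative assumptions, not the definitions used in COSUM. The k-means clustering stage and the differential evolution search are omitted.

```python
import math
from collections import Counter
from itertools import combinations


def bow(sentence):
    """Bag-of-words term-frequency vector (assumed representation)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def coverage(selected, document):
    """Hypothetical coverage: mean similarity of each selected sentence
    to the term-frequency vector of the whole document."""
    doc_vec = Counter()
    for sentence in document:
        doc_vec.update(bow(sentence))
    return sum(cosine(bow(s), doc_vec) for s in selected) / len(selected)


def diversity(selected):
    """Hypothetical diversity: 1 minus the mean pairwise similarity,
    clamped at 0 to absorb floating-point rounding."""
    pairs = list(combinations(selected, 2))
    if not pairs:
        return 1.0
    mean_sim = sum(cosine(bow(a), bow(b)) for a, b in pairs) / len(pairs)
    return max(0.0, 1.0 - mean_sim)


def harmonic_mean(x, y):
    """Unweighted harmonic mean of two non-negative scores."""
    return 2 * x * y / (x + y) if x + y > 0 else 0.0


def objective(selected, document):
    """Score a candidate summary: high only when both terms are high."""
    return harmonic_mean(coverage(selected, document), diversity(selected))


document = ["cats chase mice", "cats chase mice", "dogs guard houses"]
redundant = [document[0], document[1]]  # two identical sentences
varied = [document[0], document[2]]     # two unrelated sentences
print(objective(redundant, document), objective(varied, document))
```

Because a harmonic mean collapses toward the smaller of its two arguments, a candidate that scores well on coverage but repeats itself is penalized heavily, which is one plausible reason to combine the two criteria this way rather than with an arithmetic mean.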
