Journal
EXPERT SYSTEMS WITH APPLICATIONS
Volume 168, Issue -, Pages -
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2020.114299
Keywords
Multi-view clustering; Differential evolution; Multi-objective optimization; Textual entailment; Word mover distance; Universal sentence encoder
Funding
- Early Career Research Award of the Science and Engineering Research Board (SERB), Department of Science and Technology, India
In this study, the Search Results Clustering problem is treated as a multi-view clustering problem and solved through optimization. Various views based on syntactic and semantic similarity measures are considered, with three new views incorporated in the framework. Experimental results show that the proposed approach outperforms existing techniques.
Search Results Clustering (SRC) is a well-known problem in information retrieval: clustering the web snippets returned for a given query according to some similarity or dissimilarity measure. In this study, we pose SRC as a multi-view clustering problem and solve it from an optimization point of view. The clustering considers several views based on syntactic and semantic similarity measures. In contrast to existing algorithms, our framework incorporates three new semantic views, based on word mover's distance, textual entailment, and the universal sentence encoder. Quality measures computed on the clusters generated by the different views are optimized simultaneously using a multi-objective binary differential evolution (MBDE) framework. MBDE maintains a set of solutions, and each solution is composed of two parts corresponding to different views. An agreement index that checks the accordance between the partitionings of different views is also optimized to obtain a consensus partitioning. The proposed approach determines the number of clusters for any query automatically. Experiments are performed on three benchmark multi-view datasets of web search results and evaluated using the well-known F-measure metric. The results show that our approach outperforms state-of-the-art techniques.
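The abstract does not define the agreement index, so the following is only a minimal sketch of one plausible form, assuming a Rand-style pairwise measure: the fraction of snippet pairs that both views either place together or keep apart. The function name `agreement_index` and the example label lists are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def agreement_index(labels_a, labels_b):
    """Fraction of snippet pairs on which two partitionings agree.

    ASSUMPTION: a Rand-style pairwise agreement, used for illustration only;
    the paper's own agreement index may be defined differently.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitionings must label the same snippets")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # zero or one snippet: nothing to disagree on
    agree = 0
    for i, j in pairs:
        together_a = labels_a[i] == labels_a[j]
        together_b = labels_b[i] == labels_b[j]
        # A pair counts as agreement when both views group it the same way.
        if together_a == together_b:
            agree += 1
    return agree / len(pairs)


# Hypothetical cluster labels for five snippets under two views.
view_syntactic = [0, 0, 1, 1, 2]
view_semantic = [1, 1, 0, 0, 0]
score = agreement_index(view_syntactic, view_semantic)
print(score)
```

Cluster label values need not match across views, since only co-membership is compared. In a multi-objective setting, a score like this would be maximized alongside the per-view cluster quality measures.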