4.8 Article

Intention-guided deep semi-supervised document clustering via metric learning

Publisher

ELSEVIER
DOI: 10.1016/j.jksuci.2022.12.010

Keywords

Intention; Semi-supervised; Clustering; Metric learning

This paper proposes an intention-guided deep semi-supervised document clustering model called IGSC. IGSC uses a deep metric learning network to address the limitations of traditional deep semi-supervised clustering models and utilizes an intention matrix to guide the clustering process, resulting in improved clustering performance that aligns with the user's intention.
The intention expresses the user's preference for how a document collection should be structured. Intention-guided document structure division is an important task in text mining, and deep semi-supervised document clustering offers a promising route to personalized document clustering. However, traditional deep semi-supervised clustering models are limited by the small number of available constraints, which is insufficient for intention-guided document clustering. Moreover, document representations normally carry different emphases that reflect different structural opinions. In this paper, we propose an intention-guided deep semi-supervised document clustering model, IGSC, which divides document structure based on a small amount of user-provided supervised information. IGSC designs a deep metric learning network to address these problems. The deep metric learner explores the user's global intention and outputs an intention matrix. The intention is inferred from a small number of user-provided pairwise constraints and is used to guide representation learning. IGSC also uses the intention matrix to guide the clustering process, so that the clustering results best match the user's intention. We compare IGSC with a number of document clustering models on four real-world text datasets: Reu-10k, BBC, ACM, and Abstract. The results show that IGSC clearly improves clustering performance, outperforming the best result of the benchmark models by 7% on average. The comparison with other models and the visualization results demonstrate that IGSC is effective.
© 2022 The Author(s). Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
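
To make the idea of learning a metric from pairwise constraints more concrete, the following is a minimal sketch, not the IGSC model itself. It assumes a plain linear (Mahalanobis-style) metric in place of the paper's deep metric learning network, and a simple contrastive loss over must-link and cannot-link pairs. The names learn_metric, metric_distance, must_link, cannot_link, and margin are hypothetical and do not come from the paper; the learned matrix M only plays a role loosely analogous to the intention matrix described in the abstract.

```python
import numpy as np


def learn_metric(X, must_link, cannot_link, margin=1.0, lr=0.05, epochs=200):
    """Learn M = L.T @ L from pairwise constraints (illustrative assumption,
    not the authors' method). Distances are d(x, y) = ||L (x - y)||."""
    X = np.asarray(X, dtype=float)
    L = np.eye(X.shape[1])  # start from the Euclidean metric
    n_pairs = max(1, len(must_link) + len(cannot_link))
    for _ in range(epochs):
        grad = np.zeros_like(L)
        for i, j in must_link:
            diff = X[i] - X[j]
            # Pull term: gradient of ||L diff||^2 with respect to L.
            grad += 2.0 * np.outer(L @ diff, diff)
        for i, j in cannot_link:
            diff = X[i] - X[j]
            proj = L @ diff
            dist = np.linalg.norm(proj)
            if 0.0 < dist < margin:
                # Push term: gradient of (margin - dist)^2 with respect to L.
                grad += -2.0 * (margin - dist) * np.outer(proj, diff) / dist
        L -= lr * grad / n_pairs
    return L.T @ L


def metric_distance(M, x, y):
    """Distance between x and y under the learned matrix M."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(max(diff @ M @ diff, 0.0)))


# Toy data: feature 0 carries the grouping the "user" wants,
# feature 1 is a distraction that plain Euclidean distance would follow.
X = np.array([[0.0, 0.0], [0.0, 2.0], [0.5, 0.0], [0.5, 2.0]])
must_link = [(0, 1), (2, 3)]    # same group despite differing in feature 1
cannot_link = [(0, 2), (1, 3)]  # different groups despite being close

M = learn_metric(X, must_link, cannot_link)
print("learned M:\n", np.round(M, 3))
print("must-link distance:", metric_distance(M, X[0], X[1]))
print("cannot-link distance:", metric_distance(M, X[0], X[2]))
```

In this toy setting the constraints contradict the raw Euclidean geometry, so the learned matrix has to down-weight feature 1 and up-weight feature 0. A clustering step run on distances under M would then follow the constraints rather than the raw features, which is the general intuition behind constraint-guided clustering.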