Article

Contrastive Graph Convolutional Networks with adaptive augmentation for text classification

Journal

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.ipm.2022.102946

Keywords

Text classification; Graph Neural Networks; Graph contrastive learning; Data augmentation

Funding

  1. National Natural Science Foundation of China [61872161, 61976103]
  2. Foundation of the National Key Research and Development of China [2021ZD0112500]
  3. Nature Science Foundation of Jilin Province [20200201297JC]
  4. Foundation of Development and Reform of Jilin Province [2019C053-8]
  5. Foundation of Jilin Educational Committee [JJKH20191257KJ]
  6. Interdisciplinary and integrated innovation of JLU [JLUXKJC2020207]
  7. Fundamental Research Funds for the Central Universities, JLU

CGA2TC is a graph-based model for text classification that combines contrastive learning with an adaptive augmentation strategy to obtain more robust node representations. It constructs a text graph from word co-occurrence and document-word relationships, and applies an augmentation strategy designed to address noise in that graph while preserving its essential structure. The model treats labeled and unlabeled nodes differently and uses random sampling to reduce the resource cost of contrastive learning. Experimental results demonstrate the effectiveness of CGA2TC on text classification tasks.
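
The graph construction mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_text_graph`, the sliding-window size, and the use of raw counts as edge weights are all assumptions made for illustration, and the source does not state which weighting scheme CGA2TC uses.

```python
from collections import Counter
from itertools import combinations


def build_text_graph(docs, window=3):
    """Return weighted edges of a corpus-level text graph.

    Nodes are ("doc", index) for documents and ("word", token) for words.
    Edge weights are raw counts (an illustrative assumption, not taken
    from the paper).
    """
    edges = Counter()
    for i, doc in enumerate(docs):
        tokens = doc.lower().split()
        # Document-word edges: how often each word occurs in the document.
        for word, count in Counter(tokens).items():
            edges[(("doc", i), ("word", word))] += count
        # Word-word edges: co-occurrence inside a sliding window.
        for start in range(max(1, len(tokens) - window + 1)):
            span = sorted(set(tokens[start:start + window]))
            for a, b in combinations(span, 2):
                edges[(("word", a), ("word", b))] += 1
    return edges


docs = [
    "graph networks classify text",
    "contrastive learning on graph data",
]
graph = build_text_graph(docs)
for (source, target), weight in sorted(graph.items()):
    print(source, target, weight)
```

Because the graph is built by rule rather than observed, many of these edges carry little signal, which is the noise that the augmentation strategy is meant to address.
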
Text classification is an important research topic in natural language processing (NLP), and Graph Neural Networks (GNNs) have recently been applied to this task. However, in existing graph-based models, text graphs are constructed by rules rather than drawn from real graph data, and they introduce substantial noise. More importantly, given a fixed corpus-level graph structure, these models cannot sufficiently exploit the labeled and unlabeled information of nodes. Meanwhile, contrastive learning has emerged as an effective method in the graph domain for fully utilizing node information. Therefore, we propose a new graph-based model for text classification named CGA2TC, which uses contrastive learning with an adaptive augmentation strategy to obtain more robust node representations. First, we use word co-occurrence and document-word relationships to construct a text graph. Then, we design an adaptive augmentation strategy for the noisy text graph that generates two contrastive views, addressing the noise problem while preserving essential structure. Specifically, we design noise-based and centrality-based augmentation strategies on the topological structure of the text graph that perturb unimportant connections and thus highlight the relatively important edges. For labeled nodes, we treat the nodes that share the anchor node's label as multiple positive samples for that anchor, while for unlabeled nodes we employ consistency training to constrain model predictions. Finally, to reduce the resource consumption of contrastive learning, we adopt a random sampling method that selects a subset of nodes on which to calculate the contrastive loss. Experimental results on several benchmark datasets demonstrate the effectiveness of CGA2TC on the text classification task.
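
A rough sketch of two ideas from the abstract follows: centrality-based edge perturbation and a contrastive loss with same-label positives computed on randomly sampled anchor nodes. It is an illustration under stated assumptions, not the paper's formulation. The edge score (log of the mean endpoint degree), the drop-probability rule, the temperature `tau`, and all function names are hypothetical choices; the noise-based augmentation and the consistency training on unlabeled nodes are omitted.

```python
import math
import random


def drop_probabilities(edges, base_p=0.3, max_p=0.7):
    """Assign lower drop probability to edges between high-degree nodes.

    Assumption: edge importance is the log of the mean endpoint degree.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    scores = [math.log((degree[u] + degree[v]) / 2) for u, v in edges]
    s_max, s_mean = max(scores), sum(scores) / len(scores)
    if s_max - s_mean < 1e-12:  # all edges equally important
        return [base_p] * len(edges)
    return [min((s_max - s) / (s_max - s_mean) * base_p, max_p) for s in scores]


def augment(edges, rng):
    """Generate one view by randomly dropping the less important edges."""
    probs = drop_probabilities(edges)
    return [e for e, p in zip(edges, probs) if rng.random() >= p]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def supervised_contrastive_loss(view1, view2, labels, anchors, tau=0.5):
    """Average loss over anchors; positives are same-label nodes in view2.

    The anchor's own copy in view2 shares its label, so every anchor has
    at least one positive.
    """
    total = 0.0
    for i in anchors:
        sims = [math.exp(cosine(view1[i], z) / tau) for z in view2]
        denom = sum(sims)
        positives = [j for j, y in enumerate(labels) if y == labels[i]]
        total -= sum(math.log(sims[j] / denom) for j in positives) / len(positives)
    return total / len(anchors)


rng = random.Random(0)
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e")]
view_a, view_b = augment(edges, rng), augment(edges, rng)

# Toy node embeddings standing in for encoder outputs on the two views.
emb = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
labels = [0, 0, 1, 1]
anchors = rng.sample(range(len(emb)), 2)  # random subset of nodes
loss = supervised_contrastive_loss(emb, emb, labels, anchors)
print(view_a, view_b, round(loss, 3))
```

Sampling only a few anchors keeps the number of outer-loop iterations, and hence the cost of the loss, proportional to the sample size rather than to the full node count.
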
