Article

Contrastive Graph Convolutional Networks with adaptive augmentation for text classification

Journal

INFORMATION PROCESSING & MANAGEMENT
Volume 59, Issue 4

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.ipm.2022.102946

Keywords

Text classification; Graph Neural Networks; Graph contrastive learning; Data augmentation

Funding

  1. National Natural Science Foundation of China [61872161, 61976103]
  2. Foundation of the National Key Research and Development of China [2021ZD0112500]
  3. Nature Science Foundation of Jilin Province [20200201297JC]
  4. Foundation of Development and Reform of Jilin Province [2019C053-8]
  5. Foundation of Jilin Educational Committee [JJKH20191257KJ]
  6. Interdisciplinary and integrated innovation of JLU [JLUXKJC2020207]
  7. Fundamental Research Funds for the Central Universities

This study introduces CGA2TC, a graph-based text classification model that combines contrastive learning with an adaptive augmentation strategy to obtain more robust node representations. A text graph is constructed from word co-occurrence and document-word relationships, and noise-based and centrality-based augmentations are applied to its topology to reduce the effect of noisy edges. Labeled nodes are contrasted against multiple same-label positives, and consistency training is applied to unlabeled nodes; experiments on several benchmark datasets support the effectiveness of the model.

Text classification is an important research topic in natural language processing (NLP), and Graph Neural Networks (GNNs) have recently been applied to this task. However, in existing graph-based models, text graphs are constructed by rules rather than taken from real graph data, so they introduce substantial noise. More importantly, with a fixed corpus-level graph structure, these models cannot sufficiently exploit the labeled and unlabeled information of nodes. Meanwhile, contrastive learning has emerged as an effective method in the graph domain for fully utilizing node information. Therefore, we propose a new graph-based model for text classification named CGA2TC, which introduces contrastive learning with an adaptive augmentation strategy to obtain more robust node representations. First, we use word co-occurrence and document-word relationships to construct a text graph. Then, we design an adaptive augmentation strategy for the noisy text graph that generates two contrastive views, mitigating the noise problem while preserving the essential structure. Specifically, we design noise-based and centrality-based augmentation strategies on the topological structure of the text graph that perturb unimportant connections and thus highlight the relatively important edges. For labeled nodes, we treat all nodes with the same label as positive samples of the anchor node, while for unlabeled nodes we employ consistency training to constrain model predictions. Finally, to reduce the resource consumption of contrastive learning, we randomly sample a subset of nodes on which to calculate the contrastive loss. Experimental results on several benchmark datasets demonstrate the effectiveness of CGA2TC on the text classification task.
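
The abstract gives no formulas, so the following is only a minimal sketch of two of the ideas it names: centrality-based edge perturbation and a label-aware contrastive loss computed on randomly sampled anchors. Everything specific here is an assumption rather than the authors' method: degree is used as the centrality measure, edge importance is the mean log-degree of the two endpoints, and the function names (`edge_drop_probs`, `augment`, `sup_contrastive_loss`) and the parameters `p_mean`, `p_max`, and `tau` are hypothetical. The noise-based augmentation and the consistency training on unlabeled nodes are not covered.

```python
import math
import random


def edge_drop_probs(edges, p_mean=0.3, p_max=0.7):
    """Give each edge a drop probability that shrinks as its importance grows."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Assumed importance score: mean log-degree of the edge's endpoints.
    score = [(math.log(degree[u]) + math.log(degree[v])) / 2 for u, v in edges]
    s_max, s_avg = max(score), sum(score) / len(score)
    if s_max == s_avg:  # every edge is equally important
        return [min(p_mean, p_max)] * len(edges)
    return [min((s_max - s) / (s_max - s_avg) * p_mean, p_max) for s in score]


def augment(edges, rng, p_mean=0.3, p_max=0.7):
    """Build one augmented view by dropping edges independently."""
    probs = edge_drop_probs(edges, p_mean, p_max)
    return [e for e, p in zip(edges, probs) if rng.random() >= p]


def cosine(a, b):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def sup_contrastive_loss(z1, z2, labels, anchors, tau=0.5):
    """Average loss over the sampled anchors.

    For anchor i in view 1, every node in view 2 that shares i's label
    (including i itself) is a positive; all view-2 nodes form the denominator.
    """
    total = 0.0
    for i in anchors:
        sims = [math.exp(cosine(z1[i], z2[j]) / tau) for j in range(len(z2))]
        denom = sum(sims)
        pos = [j for j in range(len(z2)) if labels[j] == labels[i]]
        total += -sum(math.log(sims[j] / denom) for j in pos) / len(pos)
    return total / len(anchors)


if __name__ == "__main__":
    rng = random.Random(0)
    edges = [(0, 1), (0, 2), (0, 3), (3, 4)]  # toy graph, not a real text graph
    view_a, view_b = augment(edges, rng), augment(edges, rng)
    print("view A:", view_a)
    print("view B:", view_b)

    # Stand-in embeddings; in practice each view would be encoded by a GCN.
    z = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    labels = [0, 0, 1, 1]
    anchors = rng.sample(range(len(z)), 2)  # random subset to limit cost
    print("loss:", sup_contrastive_loss(z, z, labels, anchors))
```

In this sketch the most important edge (highest score) gets drop probability 0 and is therefore always kept, which is one simple way to "highlight the relatively important edges"; restricting the loss to `anchors` reduces the cost from all node pairs to the sampled anchors times all nodes.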
