Article

Self-Consistent Graph Neural Networks for Semi-Supervised Node Classification

Journal

IEEE TRANSACTIONS ON BIG DATA
Volume 9, Issue 4, Pages 1186-1197

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TBDATA.2023.3266590

Keywords

Graph neural networks; Representation learning; Deep learning; Self-consistent constraint; Semi-supervised learning; Node classification

Graph Neural Networks (GNNs), a deep-learning approach to graph representation, have attracted considerable research interest. Many GNNs achieve state-of-the-art accuracy on standard benchmarks but draw little supervision from unlabeled data. To address this, we propose SCGNN, which extracts self-supervision information from unlabeled nodes and makes fuller use of label information from labeled nodes. Experiments on six public benchmark datasets show that SCGNN improves accuracy over the best baseline by 2.08% on average, and by 5.8% on the Disease dataset.
Graph Neural Networks (GNNs), a powerful graph representation technique based on deep learning, have attracted great research interest in recent years. Although many GNNs have achieved state-of-the-art accuracy on a set of standard benchmark datasets, they remain limited to the traditional semi-supervised framework and lack sufficient supervision information, especially for the large amount of unlabeled data. To overcome this issue, we propose a novel self-consistent graph neural network (SCGNN) framework that enriches the supervision information from two aspects: the self-consistency of unlabeled data and the label information of labeled data. First, to extract self-supervision information from the numerous unlabeled nodes, we perform graph data augmentation and apply a self-consistent constraint that maximizes the mutual information of the unlabeled nodes across different augmented graph views. This self-consistency exploits the intrinsic structural attributes of the graph to extract self-supervision information from unlabeled data and improve the subsequent classification result. Second, to extract further supervision information from the scarce labeled nodes, we introduce a fusion mechanism that obtains comprehensive node embeddings by fusing the node representations of two positive graph views, and we optimize the classification loss over labeled nodes to maximize the utilization of label information. We conduct comprehensive empirical studies on six public benchmark datasets for the node classification task. In terms of accuracy, SCGNN improves on the best baseline by an average of 2.08%, and by 5.8% on the Disease dataset.
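The two sources of supervision described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract does not specify the augmentation, the mutual-information estimator, the fusion operator, or the encoder, so everything below is an assumption made for illustration. Edge dropping stands in for graph augmentation, one mean-aggregation step stands in for a GNN layer (with input features doubling as class logits and no trainable weights), a squared distance between the two views' class distributions stands in for the mutual-information objective, and averaging stands in for fusion. All function and parameter names are hypothetical.

```python
import math
import random

def drop_edges(edges, p, rng):
    """Augmentation (assumed): drop each undirected edge with probability p."""
    return [e for e in edges if rng.random() >= p]

def propagate(features, edges, n):
    """One mean-aggregation step over neighbours plus a self-loop
    (a stand-in for a GNN layer; no trainable weights)."""
    neigh = [[i] for i in range(n)]
    for u, v in edges:
        neigh[u].append(v)
        neigh[v].append(u)
    dim = len(features[0])
    return [[sum(features[j][d] for j in neigh[i]) / len(neigh[i])
             for d in range(dim)] for i in range(n)]

def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]

def consistency_loss(z1, z2, nodes):
    """Mean squared distance between the two views' class distributions.
    A simple proxy, NOT the mutual-information objective of the paper."""
    if not nodes:
        return 0.0
    total = 0.0
    for i in nodes:
        p, q = softmax(z1[i]), softmax(z2[i])
        total += sum((a - b) ** 2 for a, b in zip(p, q))
    return total / len(nodes)

def fuse(z1, z2):
    """Fusion (assumed): element-wise average of the two views."""
    return [[(a + b) / 2 for a, b in zip(r1, r2)] for r1, r2 in zip(z1, z2)]

def classification_loss(z, labels):
    """Cross-entropy over labelled nodes only; labels maps node -> class index."""
    return -sum(math.log(softmax(z[i])[c]) for i, c in labels.items()) / len(labels)

def scgnn_style_loss(features, edges, labels, p=0.2, weight=1.0, seed=0):
    """Combine both terms: classification on the fused view (labelled nodes)
    plus a weighted consistency term across views (unlabelled nodes)."""
    rng = random.Random(seed)
    n = len(features)
    z1 = propagate(features, drop_edges(edges, p, rng), n)  # first augmented view
    z2 = propagate(features, drop_edges(edges, p, rng), n)  # second augmented view
    unlabeled = [i for i in range(n) if i not in labels]
    l_cons = consistency_loss(z1, z2, unlabeled)
    l_cls = classification_loss(fuse(z1, z2), labels)
    return l_cls + weight * l_cons, l_cls, l_cons

# Toy path graph with 4 nodes, 2 classes, and labels on the two end nodes.
features = [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.0, 1.0]]
edges = [(0, 1), (1, 2), (2, 3)]
labels = {0: 0, 3: 1}
total, l_cls, l_cons = scgnn_style_loss(features, edges, labels)
```

In a trainable version, the total loss would be minimized with respect to encoder parameters; the `weight` argument (also an assumption) controls how strongly agreement between views on unlabeled nodes is enforced relative to the labeled-node classification term.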
