4.7 Article

Alleviating the over-smoothing of graph neural computing by a data augmentation strategy with entropy preservation

Journal

PATTERN RECOGNITION
Volume 132

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2022.108951

Keywords

Graph representation; Graph convolutional networks; Information theory; Graph entropy

Funding

  1. Research and Development Program of China [2018AAA0101100]
  2. National Natural Science Foundation of China [62141605, 62050132]
  3. Beijing Natural Science Foundation [1192012, Z180005]

This paper introduces a novel graph entropy definition to evaluate the smoothness of a data manifold and proposes a strategy to generate randomly perturbed training data while preserving both graph topology and graph entropy. Experimental results demonstrate the effectiveness of the method in improving semi-supervised node classification accuracy and enhancing the robustness of the training process for GCN.
The Graph Convolutional Network (GCN) proposed by Kipf and Welling is an effective model for improving semi-supervised learning in pattern recognition, but it faces the obstacle of over-smoothing, which weakens its representation ability. Recent works tackle this limitation by randomly perturbing the graph topology or the feature matrix to generate augmented data as training input. However, these operations inevitably damage the integrity of information structures and sacrifice the smoothness of the feature manifold. In this paper, we first introduce a novel graph entropy definition as a measure to quantitatively evaluate the smoothness of a data manifold, and then point out that this graph entropy is controlled by triangle motif-based information structures. To preserve graph entropy, we propose an effective strategy that generates randomly perturbed training data while maintaining both graph topology and graph entropy. Extensive experiments on real-world datasets verify the effectiveness of the proposed method in improving semi-supervised node classification accuracy compared with a range of baselines. Beyond that, the proposed approach can significantly enhance the robustness of the training process for GCN. (c) 2022 Elsevier Ltd. All rights reserved.
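
The over-smoothing obstacle mentioned in the abstract can be illustrated with a minimal sketch. It repeatedly applies the symmetric normalized propagation step of Kipf and Welling, D^{-1/2}(A + I)D^{-1/2}, to a single feature column, with no learned weights and no nonlinearity. The toy graph, the starting features, and the function names (normalized_adjacency, propagate, spread) are assumptions made for illustration only. This is not the paper's graph entropy definition or its augmentation strategy, neither of which is specified in the abstract.

```python
import math

def normalized_adjacency(edges, n):
    """Build D^{-1/2} (A + I) D^{-1/2} as a dense list of lists.

    Also returns the degrees of A + I (self-loops included).
    """
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # A + I
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    deg = [sum(row) for row in a]
    a_hat = [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]
    return a_hat, deg

def propagate(a_hat, x):
    """One propagation step: x <- A_hat x (weights and activation omitted)."""
    n = len(x)
    return [sum(a_hat[i][j] * x[j] for j in range(n)) for i in range(n)]

def spread(x, deg):
    """Range of the degree-rescaled features x_i / sqrt(deg_i).

    On a connected graph this tends to 0 as propagation is repeated,
    i.e. nodes become indistinguishable up to a degree factor.
    """
    scaled = [xi / math.sqrt(di) for xi, di in zip(x, deg)]
    return max(scaled) - min(scaled)

# Hypothetical toy graph: a triangle (0, 1, 2) with a two-node tail (3, 4).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
a_hat, deg = normalized_adjacency(edges, 5)

x = [1.0, 0.0, 0.0, 0.0, -1.0]  # hypothetical one-dimensional node features
spreads = []
for _ in range(31):
    spreads.append(spread(x, deg))
    x = propagate(a_hat, x)

for k in (0, 1, 5, 10, 30):
    print(f"after {k:2d} propagation steps: spread = {spreads[k]:.6f}")
```

The printed spread shrinks as steps are added, which is the loss of node distinguishability that the perturbation-based augmentations discussed in the abstract try to counteract.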
