Article

Student Network Learning via Evolutionary Knowledge Distillation

Journal

IEEE Transactions on Circuits and Systems for Video Technology
Publisher

IEEE - Institute of Electrical and Electronics Engineers Inc.
DOI: 10.1109/TCSVT.2021.3090902

Keywords

Training; Knowledge representation; Knowledge transfer; Predictive models; Germanium; Data models; Data mining; Knowledge distillation; teacher-student learning; deep learning

Funding

  1. National Key Research and Development Plan [2020AAA0140001]
  2. National Natural Science Foundation of China [61772513]
  3. Beijing Natural Science Foundation [L192040]
  4. Beijing Municipal Science and Technology Commission [Z191100007119002]
  5. Open Research Project of the State Key Laboratory of Media Convergence and Communication, Communication University of China [SKLMCC2020KF004]

Abstract

Knowledge distillation provides an effective way to transfer knowledge via teacher-student learning, where most existing distillation approaches apply a fixed pre-trained model as the teacher to supervise the learning of the student network. This practice usually introduces a large capability gap between the teacher and student networks during learning. Recent research has observed that a small teacher-student capability gap can facilitate knowledge transfer. Inspired by this, we propose an evolutionary knowledge distillation approach to improve the transfer effectiveness of teacher knowledge. Instead of a fixed pre-trained teacher, an evolutionary teacher is learned online and consistently transfers intermediate knowledge to supervise student network learning on the fly. To enhance intermediate knowledge representation and mimicking, several simple guided modules are introduced between corresponding teacher-student blocks. In this way, the student can simultaneously obtain rich internal knowledge and capture its growth process, leading to effective student network learning. Extensive experiments clearly demonstrate the effectiveness of our approach as well as its good adaptability in low-resolution and few-sample scenarios.
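The two ingredients the abstract describes (softened-logit distillation from an online teacher, plus mimicking of intermediate features through guided modules) can be sketched roughly as follows. This is a minimal plain-Python illustration of the standard loss forms; the function names and exact formulations are assumptions for illustration, not taken from the paper:

```python
import math

def softmax(logits, T=1.0):
    # Temperature-softened softmax; higher T yields a softer distribution,
    # exposing more of the "dark knowledge" in the non-target classes.
    m = max(l / T for l in logits)
    exps = [math.exp(l / T - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 as in standard knowledge distillation. In the
    # evolutionary setting, teacher_logits would come from a teacher
    # trained concurrently, not from a fixed pre-trained model.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return (T ** 2) * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def feature_mimic_loss(student_feat, teacher_feat):
    # L2 distance between intermediate features from corresponding
    # teacher-student blocks, standing in for the guided modules that
    # align intermediate knowledge representations.
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat))
```

In a training loop, the total student objective would combine the usual task loss with weighted sums of these two terms, with the teacher's parameters updated alongside the student's so the capability gap stays small throughout learning.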
