Article

Improving knowledge distillation via an expressive teacher

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 218

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2021.106837

Keywords

Neural network compression; Knowledge distillation; Knowledge transfer

Funding

  1. National Key Research and Development Program of China [2018YFB0204301]
  2. National Key Research and Development Program, China [2017YFB0202104]
  3. National Natural Science Foundation of China [61806213]

Abstract

Knowledge distillation is a network compression technique in which a teacher network guides a student network to mimic its behavior. This study explores how to train a good teacher and proposes an inter-class correlation regularization. Experimental results show that the method achieves good performance on image classification tasks.
Knowledge distillation (KD) is a widely used network compression technique that seeks a light student network with behavior similar to that of its heavy teacher network. Previous studies mainly focus on training the student to mimic the representation space of the teacher; however, how to be a good teacher is rarely explored. We find that if a teacher has a weak ability to capture the knowledge underlying the true data in the real world, the student cannot learn that knowledge from its teacher. Inspired by this, we propose an inter-class correlation regularization that trains the teacher to capture a more explicit correlation among classes. In addition, we enforce the student to mimic the inter-class correlation of its teacher. Extensive experiments on image classification have been conducted on four public benchmarks. For example, when the teacher and student networks are ShuffleNetV2-1.0 and ShuffleNetV2-0.5, the proposed method achieves a 42.63% top-1 error rate on Tiny ImageNet. (C) 2021 Elsevier B.V. All rights reserved.
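The abstract does not give the exact loss formulation, so the following is a minimal sketch of the idea in PyTorch: a standard Hinton-style distillation loss plus a hypothetical inter-class correlation term built from class-mean softened predictions. The function names, the cosine-similarity form of the correlation matrix, and the weights T, alpha, and beta are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.9):
    """Standard knowledge distillation loss: softened KL term between
    student and teacher plus cross-entropy on the true labels."""
    soft_t = F.softmax(teacher_logits / T, dim=1)
    log_soft_s = F.log_softmax(student_logits / T, dim=1)
    kl = F.kl_div(log_soft_s, soft_t, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, targets)
    return alpha * kl + (1.0 - alpha) * ce

def inter_class_correlation(logits, targets, num_classes, T=4.0):
    """Hypothetical inter-class correlation matrix: cosine similarity
    between class-mean softened predictions, shape (C, C)."""
    probs = F.softmax(logits / T, dim=1)
    # Accumulate predictions per ground-truth class, then average.
    class_means = torch.zeros(num_classes, num_classes,
                              device=logits.device, dtype=probs.dtype)
    class_means.index_add_(0, targets, probs)
    counts = torch.bincount(targets, minlength=num_classes).clamp(min=1)
    class_means = class_means / counts.unsqueeze(1).to(probs.dtype)
    normed = F.normalize(class_means, dim=1)
    return normed @ normed.t()

def icc_regularized_student_loss(student_logits, teacher_logits, targets,
                                 num_classes, T=4.0, alpha=0.9, beta=1.0):
    """KD loss plus a term pushing the student's inter-class correlation
    matrix toward the teacher's (a sketch of the abstract's idea)."""
    base = kd_loss(student_logits, teacher_logits, targets, T, alpha)
    icc_s = inter_class_correlation(student_logits, targets, num_classes, T)
    icc_t = inter_class_correlation(teacher_logits, targets, num_classes, T)
    return base + beta * F.mse_loss(icc_s, icc_t)
```

Under the paper's framing, a regularization term of this kind would also be applied when training the teacher itself, so that the teacher exposes a more explicit correlation structure among classes for the student to mimic; the exact form of that teacher-side term is not specified in the abstract.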
