Journal
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
Volume 33, Issue 1, Pages 366-377
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNNLS.2020.3027837
Keywords
Knowledge engineering; Shape; Learning systems; Neural networks; Feature extraction; Knowledge transfer; Task analysis; Deep neural network (DNN); knowledge transfer; multitask learning; smaller network
Funding
- Institute of Information & Communications Technology Planning & Evaluation (IITP) - Korea government (MSIT) [2020-0-01389]
- Industrial Technology Innovation Program - Ministry of Trade, Industry and Energy (MI, Korea) (Development of human-friendly human-robot interaction technologies using human internal emotional states) [10073154]
Abstract
Knowledge distillation (KD) transfers knowledge from a teacher neural network to a small student network in order to improve the student's performance, and it is one of the most popular techniques for lightening convolutional neural networks (CNNs). Many KD algorithms have been proposed recently, but they still cannot properly distill the essential knowledge of the teacher network, and the transfer tends to depend on the spatial shape of the teacher's feature map. To solve these problems, we propose a method that transfers knowledge independently of the spatial shape of the teacher's feature map, using the major information obtained by decomposing the feature map through singular value decomposition (SVD). In addition, we present a multitask learning method that enables the student to learn the teacher's knowledge effectively by adaptively adjusting the teacher's constraints to the student's learning speed. Experimental results show that the proposed method performs 2.37% better on the CIFAR100 data set and 2.89% better on the TinyImageNet data set than the state-of-the-art method. The source code is publicly available at https://github.com/sseung0703/KD_methods_with_TF.
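The core idea in the abstract (making transferred knowledge independent of the feature map's spatial shape via SVD) can be sketched as follows. This is a minimal illustrative NumPy sketch, not the paper's exact formulation: the function name `svd_knowledge`, the feature-map shapes, and the choice of keeping the top-4 singular vectors are all assumptions for demonstration.

```python
import numpy as np

def svd_knowledge(feature_map, k=4):
    """Summarize an (H, W, C) feature map by its top-k right singular vectors.

    Flattening the spatial dimensions to an (H*W, C) matrix and keeping the
    leading singular vectors of the channel space yields a (k, C) summary
    whose size does not depend on H or W -- so teacher and student feature
    maps of different spatial shapes become directly comparable.
    """
    h, w, c = feature_map.shape
    flat = feature_map.reshape(h * w, c)              # collapse spatial dims
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return vt[:k]                                     # shape (k, C)

# Teacher and student feature maps with different spatial sizes but the
# same channel count (values here are random placeholders).
rng = np.random.default_rng(0)
teacher = rng.random((8, 8, 16))
student = rng.random((4, 4, 16))

kt = svd_knowledge(teacher)
ks = svd_knowledge(student)
assert kt.shape == ks.shape == (4, 16)

# A distillation loss could then align the two summaries, e.g. squared
# error (the sign ambiguity of singular vectors is ignored in this sketch).
loss = np.mean((kt - ks) ** 2)
```

Because the (k, C) summaries have the same shape regardless of spatial resolution, a loss defined on them imposes no constraint on the student's feature-map dimensions, which is the shape-independence property the abstract describes.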