The Role of Teacher Calibration in Knowledge Distillation
Knowledge Distillation (KD) has emerged as an effective model compression technique in deep learning, enabling the transfer of knowledge from a large teacher model to a compact student model. While KD has demonstrated significant success, it is not yet fully understood which factors contribute to improving the student's performance. In this paper, we reveal a strong correlation between the teacher's calibration error and the student's accuracy. Based on this finding, we argue that the calibration of the teacher model is an important factor for effective KD. Furthermore, we demonstrate that KD performance can be improved simply by applying a calibration method that reduces the teacher's calibration error. Our approach is versatile, proving effective across tasks ranging from classification to detection. Moreover, it can be easily integrated with existing state-of-the-art KD methods, consistently achieving superior performance.
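The abstract does not spell out an implementation, but a minimal sketch of the idea might look as follows, assuming temperature scaling (Guo et al., 2017) as the calibration method, expected calibration error (ECE) as the calibration metric, and the standard Hinton-style KD objective. All function names and hyperparameters below are illustrative assumptions, not the paper's specified algorithm.

```python
import torch
import torch.nn.functional as F

def expected_calibration_error(logits, labels, n_bins=15):
    """Standard ECE: bin predictions by confidence and average the
    per-bin gap between accuracy and mean confidence."""
    probs = F.softmax(logits, dim=1)
    conf, pred = probs.max(dim=1)
    correct = pred.eq(labels).float()
    ece = torch.zeros(1)
    edges = torch.linspace(0, 1, n_bins + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.float().mean() * (correct[mask].mean() - conf[mask].mean()).abs()
    return ece.item()

def calibrate_temperature(teacher_logits, labels, lr=0.01, steps=200):
    """Fit a single scalar temperature on held-out data by minimizing NLL
    (temperature scaling, Guo et al., 2017). Optimizing log-temperature
    keeps the temperature positive."""
    log_t = torch.zeros(1, requires_grad=True)
    optimizer = torch.optim.Adam([log_t], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = F.cross_entropy(teacher_logits / log_t.exp(), labels)
        loss.backward()
        optimizer.step()
    return log_t.exp().item()

def kd_loss(student_logits, teacher_logits, labels, calib_temp, alpha=0.5, tau=4.0):
    """Hinton-style KD loss, with teacher logits first rescaled by the
    fitted calibration temperature before distillation."""
    calibrated = teacher_logits / calib_temp  # calibrated teacher logits
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / tau, dim=1),
        F.softmax(calibrated / tau, dim=1),
        reduction="batchmean",
    ) * tau ** 2
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss
```

Temperature scaling is only one possible calibrator; under the paper's claim, any method that lowers the teacher's ECE should slot into the same recipe.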
Suyoung Kim, Seonguk Park, Junhoo Lee, Nojun Kwak
Computing Technology, Computer Technology
Suyoung Kim, Seonguk Park, Junhoo Lee, Nojun Kwak. The Role of Teacher Calibration in Knowledge Distillation [EB/OL]. (2025-08-27) [2025-09-06]. https://arxiv.org/abs/2508.20224.