
LIB-KD: Teaching Inductive Bias for Efficient Vision Transformer Distillation and Compression

Source: arXiv
Abstract

With the rapid development of computer vision, Vision Transformers (ViTs) offer the tantalising prospect of unified information processing across visual and textual domains, precisely because they lack the inherent inductive biases of convolutional architectures. That same absence, however, means ViTs require enormous datasets for training. To make their application practical, we introduce an ensemble-based distillation approach that distils inductive bias from complementary lightweight teacher models into the student transformer. Whereas prior systems relied solely on convolution-based teachers, our method employs an ensemble of lightweight teachers with different architectural tendencies, such as convolution and involution, to jointly instruct the student. Because of their distinct inductive biases, the teachers contribute a broad range of knowledge, even from readily available datasets, which leads to enhanced student performance. Our proposed framework, LIB-KD, also precomputes and stores the teachers' logits (their unnormalised predictions) in advance. This optimisation accelerates distillation by eliminating repeated teacher forward passes, significantly reducing the computational burden and improving efficiency.
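The abstract outlines two mechanisms: distilling from an ensemble of lightweight teachers with different inductive biases (e.g. convolution and involution), and precomputing the teachers' logits so no teacher forward pass is repeated during student training. A minimal PyTorch sketch of how these pieces could fit together is given below; the function names, the simple averaging of the ensemble's logits, and the standard temperature-scaled distillation loss are illustrative assumptions rather than the paper's exact formulation.

import torch
import torch.nn.functional as F

def precompute_teacher_logits(teachers, dataloader, device="cpu"):
    # Assumes the teachers are frozen (eval mode). Run each teacher once over
    # the dataset and cache the averaged ensemble logits, so no teacher
    # forward pass is needed during distillation.
    cached = []
    with torch.no_grad():
        for images, _ in dataloader:
            images = images.to(device)
            logits = torch.stack([t(images) for t in teachers]).mean(dim=0)
            cached.append(logits.cpu())
    return torch.cat(cached, dim=0)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Temperature-scaled soft-target KD term plus cross-entropy on hard labels.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

In a training loop, the cached ensemble logits for each batch would be loaded alongside the images and labels and passed to distillation_loss in place of a live teacher forward pass, which is where the claimed efficiency gain comes from.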

Gousia Habib, Tausifa Jan Saleem, Ishfaq Ahmad Malik, Brejesh Lall

Subject: Computing Technology, Computer Technology

Gousia Habib, Tausifa Jan Saleem, Ishfaq Ahmad Malik, Brejesh Lall. LIB-KD: Teaching Inductive Bias for Efficient Vision Transformer Distillation and Compression [EB/OL]. (2025-08-21) [2025-09-05]. https://arxiv.org/abs/2310.00369.
