Efficient Visual Representation Learning with Heat Conduction Equation

Source: arXiv
Abstract

Foundation models, such as CNNs and ViTs, have powered the development of image representation learning. However, general guidance for model architecture design is still missing. Inspired by the connection between image representation learning and heat conduction, we model images with the heat conduction equation: image features are treated as temperatures, and their information interaction is modeled as the diffusion of thermal energy. Based on this idea, we find that many modern architectural components, such as residual structures, SE blocks, and feed-forward networks, can be interpreted from the perspective of the heat conduction equation. We therefore leverage the heat equation to design new and more interpretable models. As an example, we propose the Heat Conduction Layer and the Refinement Approximation Layer, inspired by solving the heat conduction equation with the finite difference method and Fourier series, respectively. The main goal of this paper is to integrate the overall architectural design of neural networks into the theoretical framework of heat conduction. Nevertheless, our Heat Conduction Network (HcNet) still shows competitive performance, e.g., HcNet-T achieves 83.0% top-1 accuracy on ImageNet-1K while requiring only 28M parameters and 4.1G MACs. The code is publicly available at: https://github.com/ZheminZhang1/HcNet.
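
To make the finite-difference view of heat conduction concrete, the following is a minimal sketch of one explicit finite-difference step of the 2D heat equation on a grid of "temperatures". It illustrates the general numerical technique only and is not the paper's Heat Conduction Layer; the function name `heat_step`, the coefficient `alpha`, and the zero-flux boundary handling are assumptions made for this example. Note that the update has the form "input plus a correction term", which is structurally similar to a residual connection.

```python
def heat_step(u, alpha=0.1):
    """One explicit finite-difference step of the 2D heat equation.

    u     : 2D grid (list of lists of floats) of "temperatures" (hypothetical feature map)
    alpha : assumed coefficient k * dt / dx**2; the explicit scheme is
            stable in 2D when alpha <= 0.25
    """
    rows, cols = len(u), len(u[0])

    def at(i, j):
        # Clamp indices so edge cells reuse their own value as the missing
        # neighbour (zero-flux boundary; an assumption of this sketch).
        return u[min(max(i, 0), rows - 1)][min(max(j, 0), cols - 1)]

    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # 5-point discrete Laplacian: sum of 4 neighbours minus 4x centre.
            lap = (at(i - 1, j) + at(i + 1, j)
                   + at(i, j - 1) + at(i, j + 1)
                   - 4.0 * u[i][j])
            # Residual-style update: old value plus a diffusion correction.
            row.append(u[i][j] + alpha * lap)
        out.append(row)
    return out


# Usage: a single hot cell in the centre of a cold 5x5 grid spreads
# part of its heat to its four neighbours after one step.
grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 1.0
stepped = heat_step(grid, alpha=0.1)
```

With the zero-flux boundary assumed here, each step redistributes heat without creating or destroying it, so the grid total is preserved and a uniform grid is left unchanged.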

Zhemin Zhang, Xun Gong

Physics; Computing Technology, Computer Technology

Zhemin Zhang, Xun Gong. Efficient Visual Representation Learning with Heat Conduction Equation [EB/OL]. (2024-08-11) [2025-08-03]. https://arxiv.org/abs/2408.05901.