
Towards Universal & Efficient Model Compression via Exponential Torque Pruning


Source: arXiv
Abstract

The rapid growth in complexity and size of modern deep neural networks (DNNs) has increased challenges related to computational costs and memory usage, spurring a growing interest in efficient model compression techniques. A previous state-of-the-art approach proposes a Torque-inspired regularization that forces the weights of neural modules around a selected pivot point. However, we observe that the pruning effect of this approach is far from perfect, as the post-trained network is still dense and also suffers from a high accuracy drop. In this work, we attribute this ineffectiveness to the default linear force application scheme, which imposes inappropriate force on neural modules at different distances. To efficiently prune redundant, distant modules while retaining those that are close and necessary for effective inference, we propose Exponential Torque Pruning (ETP), which adopts an exponential force application scheme for regularization. Experimental results across a broad range of domains demonstrate that, despite being extremely simple, ETP achieves a significantly higher compression rate than previous state-of-the-art pruning strategies with negligible accuracy drop.
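
The abstract does not spell out the force application schemes, so the following is only a minimal sketch of the contrast it describes, assuming that a "neural module" is an output channel, that its distance is measured from a chosen pivot channel index, and that the regularizer weights each channel's norm by a distance-dependent force. The names torque_regularizer and beta, and the exact exponential form, are illustrative assumptions, not the authors' formulation.

import torch

def torque_regularizer(weight: torch.Tensor, pivot: int = 0,
                       scheme: str = "exponential", beta: float = 1.0) -> torch.Tensor:
    # Toy torque-style regularizer (an assumption, not the paper's exact loss).
    # Each output channel i is treated as a "neural module" whose distance from
    # the pivot is |i - pivot|; its weight norm is penalized with a force that
    # grows with that distance, pushing distant channels towards zero.
    out_channels = weight.shape[0]
    dist = torch.arange(out_channels, dtype=weight.dtype, device=weight.device)
    dist = (dist - pivot).abs()

    if scheme == "linear":
        # linear force application, as in the earlier Torque-style regularization
        force = dist
    elif scheme == "exponential":
        # exponential force application in the spirit of ETP: distant channels
        # receive a sharply larger force, nearby ones are almost untouched
        force = torch.expm1(beta * dist)  # exp(beta * d) - 1, zero at the pivot
    else:
        raise ValueError(f"unknown scheme: {scheme}")

    channel_norm = weight.flatten(1).norm(dim=1)  # per-channel L2 norm
    return (force * channel_norm).sum()

# usage: add the penalty to the task loss during training/fine-tuning, then
# prune channels whose norm has been driven (near) to zero
w = torch.randn(64, 3, 3, 3, requires_grad=True)  # e.g. a conv layer's weight
penalty = torque_regularizer(w, pivot=0, scheme="exponential", beta=0.1)
penalty.backward()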

Sarthak Ketanbhai Modi, Zi Pong Lim, Shourya Kuchhal, Yushi Cao, Yupeng Cheng, Yon Shin Teo, Shang-Wei Lin, Zhiming Li

Subject: computing technology, computer technology

Sarthak Ketanbhai Modi, Zi Pong Lim, Shourya Kuchhal, Yushi Cao, Yupeng Cheng, Yon Shin Teo, Shang-Wei Lin, Zhiming Li. Towards Universal & Efficient Model Compression via Exponential Torque Pruning [EB/OL]. (2025-07-03) [2025-07-21]. https://arxiv.org/abs/2506.22015.
