Linear Attention Modeling for Learned Image Compression

Abstract

In recent years, learned image compression has made tremendous progress and achieved impressive coding efficiency. Its coding gain comes mainly from non-linear neural network-based transforms and learnable entropy modeling. However, most studies focus on building a strong backbone, and few consider low-complexity design. In this paper, we propose LALIC, a linear attention modeling approach for learned image compression. Specifically, we propose to use Bi-RWKV blocks, which utilize the Spatial Mix and Channel Mix modules to achieve more compact feature extraction, and apply the Conv-based Omni-Shift module to adapt to the two-dimensional latent representation. Furthermore, we propose an RWKV-based Spatial-Channel ConTeXt model (RWKV-SCCTX) that leverages Bi-RWKV to effectively model the correlation between neighboring features. To our knowledge, this is the first work to utilize efficient Bi-RWKV models with linear attention for learned image compression. Experimental results demonstrate that our method achieves competitive RD performance, outperforming VTM-9.1 by -15.26%, -15.41%, and -17.63% in BD-rate on the Kodak, CLIC, and Tecnick datasets, respectively. The code is available at https://github.com/sjtu-medialab/RwkvCompress.
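As a rough illustration of why linear attention lowers complexity, the sketch below implements a generic kernelized linear attention and checks it against the equivalent quadratic form. It is not the Bi-RWKV block or the WKV recurrence used in LALIC; the feature map `phi` (elu(x) + 1) and the function names `linear_attention` and `quadratic_attention` are assumptions chosen only for this example.

```python
import math


def phi(x):
    # Positive feature map elu(x) + 1 (an assumption for this sketch).
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]


def linear_attention(Q, K, V):
    """Non-causal kernelized attention in O(N) token cost.

    Keys and values are summarized once into S (d_k x d_v) and z (d_k),
    so each query is answered without visiting every token again.
    """
    d_k, d_v = len(K[0]), len(V[0])
    S = [[0.0] * d_v for _ in range(d_k)]
    z = [0.0] * d_k
    for k, v in zip(K, V):
        fk = phi(k)
        for a in range(d_k):
            z[a] += fk[a]
            for b in range(d_v):
                S[a][b] += fk[a] * v[b]
    out = []
    for q in Q:
        fq = phi(q)
        norm = sum(fq[a] * z[a] for a in range(d_k))
        out.append(
            [sum(fq[a] * S[a][b] for a in range(d_k)) / norm for b in range(d_v)]
        )
    return out


def quadratic_attention(Q, K, V):
    """Same result computed pairwise, with O(N^2) token cost."""
    d_v = len(V[0])
    out = []
    for q in Q:
        fq = phi(q)
        w = [sum(x * y for x, y in zip(fq, phi(k))) for k in K]
        total = sum(w)
        out.append([sum(wi * v[b] for wi, v in zip(w, V)) / total for b in range(d_v)])
    return out


if __name__ == "__main__":
    Q = [[0.1, -0.2], [0.5, 0.3], [-1.0, 0.7]]
    K = [[0.4, 0.0], [-0.3, 0.9], [0.2, -0.6]]
    V = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]
    print(linear_attention(Q, K, V))
```

Both functions produce the same output up to floating-point error; the difference is that the linear variant accumulates the key-value summary once, which is the general property that makes attention of this family attractive for high-resolution latent representations.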

Authors: Shen Wang, Ronghua Wu, Zhengxue Cheng, Donghui Feng, Guo Lu, Hongwei Hu, Li Song

Subject classification: computing technology, computer technology

Recommended citation: Shen Wang, Ronghua Wu, Zhengxue Cheng, Donghui Feng, Guo Lu, Hongwei Hu, Li Song. Linear Attention Modeling for Learned Image Compression[EB/OL]. (2025-02-08)[2025-05-18]. https://arxiv.org/abs/2502.05741.
