GSVR: 2D Gaussian-based Video Representation for 800+ FPS with Hybrid Deformation Field

Source: arXiv
Abstract

Implicit neural representations for video have been recognized as a novel and promising form of video representation. Existing works focus on improving video reconstruction quality but pay little attention to decoding speed, and the high computational cost of the convolutional networks they use leads to slow decoding. Moreover, these convolution-based video representation methods suffer from long training times, about 14 seconds per frame to achieve 35+ PSNR on Bunny. To solve these problems, we propose GSVR, a novel 2D Gaussian-based video representation, which achieves 800+ FPS and 35+ PSNR on Bunny while needing a training time of only 2 seconds per frame. Specifically, we propose a hybrid deformation field to model the dynamics of the video, which combines two motion patterns, namely tri-plane motion and polynomial motion, to deal with the coupling of camera motion and object motion in the video. Furthermore, we propose a Dynamic-aware Time Slicing strategy that adaptively divides the video into multiple groups of pictures (GOPs) based on the dynamic level of the video, in order to handle large camera motion and non-rigid movements. Finally, we propose quantization-aware fine-tuning to avoid performance loss after quantization, and we use image codecs to compress the Gaussians into a compact representation. Experiments on the Bunny and UVG datasets confirm that our method converges much faster than existing methods and decodes 10x faster than other methods. Our method performs comparably to the state of the art (SOTA) on the video interpolation task and attains better video compression performance than NeRV.
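The abstract does not say how the "dynamic level" is measured or what rule splits the video, so the following is only an illustrative sketch of adaptive time slicing, not the authors' method. It assumes a hypothetical per-frame motion score (for instance, the mean absolute difference from the previous frame) and a simple greedy rule: a new GOP starts whenever the accumulated score would exceed a budget or the group reaches a maximum length. The names `slice_into_gops`, `dynamics`, `budget`, and `max_len` are all made up for this example.

```python
from typing import List, Sequence, Tuple


def slice_into_gops(
    dynamics: Sequence[float],
    budget: float,
    max_len: int,
) -> List[Tuple[int, int]]:
    """Greedily split frames into contiguous groups of pictures (GOPs).

    dynamics[i] is an assumed motion score between frame i - 1 and frame i;
    dynamics[0] is ignored because frame 0 has no predecessor.
    Returns half-open (start, end) frame ranges that cover every frame.
    """
    if budget <= 0 or max_len < 1:
        raise ValueError("budget must be positive and max_len at least 1")

    gops: List[Tuple[int, int]] = []
    start = 0   # first frame of the GOP being built
    acc = 0.0   # motion accumulated inside the current GOP
    for i in range(1, len(dynamics)):
        too_dynamic = acc + dynamics[i] > budget
        too_long = i - start >= max_len
        if too_dynamic or too_long:
            # Close the current GOP; frame i opens the next one.
            gops.append((start, i))
            start = i
            acc = 0.0
        else:
            acc += dynamics[i]
    if len(dynamics) > 0:
        gops.append((start, len(dynamics)))
    return gops


if __name__ == "__main__":
    # A burst of motion at frame 3 forces an early cut.
    scores = [0.0, 0.125, 0.125, 1.0, 0.75, 0.125, 0.125, 0.125]
    print(slice_into_gops(scores, budget=1.0, max_len=4))
```

Under this rule, highly dynamic stretches produce short GOPs and near-static stretches produce GOPs capped at `max_len` frames, which mirrors the stated goal of giving large camera motion and non-rigid movement their own groups.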

Zhizhuo Pang, Zhihui Ke, Xiaobo Zhou, Tie Qiu

Computing technology, computer technology

Zhizhuo Pang, Zhihui Ke, Xiaobo Zhou, Tie Qiu. GSVR: 2D Gaussian-based Video Representation for 800+ FPS with Hybrid Deformation Field [EB/OL]. (2025-07-08) [2025-08-02]. https://arxiv.org/abs/2507.05594
