PanoWan: Lifting Diffusion Video Generation Models to 360° with Latitude/Longitude-aware Mechanisms

Source: arXiv
Abstract

Panoramic video generation enables immersive 360° content creation, which is valuable in applications that demand scene-consistent world exploration. However, existing panoramic video generation models struggle to leverage the pre-trained generative priors of conventional text-to-video models to produce high-quality, diverse panoramic videos, because of limited dataset scale and the gap in spatial feature representations. In this paper, we introduce PanoWan, which effectively lifts pre-trained text-to-video models to the panoramic domain with minimal additional modules. PanoWan employs latitude-aware sampling to avoid latitudinal distortion, while its rotated semantic denoising and padded pixel-wise decoding ensure seamless transitions at longitude boundaries. To provide sufficient panoramic videos for learning these lifted representations, we contribute PanoVid, a high-quality panoramic video dataset with captions and diverse scenarios. Consequently, PanoWan achieves state-of-the-art performance in panoramic video generation and demonstrates robustness in zero-shot downstream tasks.
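The abstract does not describe how the longitude-boundary mechanisms are implemented. As a rough illustration of the underlying idea only, the sketch below assumes a panoramic frame stored so that each row runs along longitude, which makes the first and last samples of a row neighbours on the sphere. The function names (`rotate_longitude`, `pad_longitude`, `crop_longitude`) are hypothetical and are not taken from PanoWan.

```python
# Illustrative sketch only; NOT the PanoWan implementation.
# Assumption: one row of a panoramic frame is a list of samples ordered by
# longitude, so index 0 and index -1 are adjacent on the sphere.

def rotate_longitude(row, shift):
    """Circularly shift a row along longitude (content wraps around)."""
    row = list(row)
    shift %= len(row)
    if shift == 0:
        return row
    return row[-shift:] + row[:-shift]

def pad_longitude(row, pad):
    """Wrap-around padding: each side is filled with samples from the opposite edge."""
    row = list(row)
    if not 0 <= pad <= len(row):
        raise ValueError("pad must be between 0 and len(row)")
    if pad == 0:
        return row
    return row[-pad:] + row + row[:pad]

def crop_longitude(row, pad):
    """Remove the padding added by pad_longitude."""
    row = list(row)
    return row[pad:len(row) - pad] if pad else row

if __name__ == "__main__":
    row = [0, 1, 2, 3, 4, 5]
    print(rotate_longitude(row, 2))
    print(pad_longitude(row, 2))
    print(crop_longitude(pad_longitude(row, 2), 2))
```

Rotating by some amount and later rotating back leaves the content unchanged while moving the seam to a different column, and padding before a local operation followed by cropping lets that operation see valid neighbours across the seam. Both are generic wrap-around techniques; whether PanoWan applies them in this form is not stated in the abstract.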

Yifei Xia, Shuchen Weng, Siqi Yang, Jingqi Liu, Chengxuan Zhu, Minggui Teng, Zijian Jia, Han Jiang, Boxin Shi

Computing technology, computer technology

Yifei Xia, Shuchen Weng, Siqi Yang, Jingqi Liu, Chengxuan Zhu, Minggui Teng, Zijian Jia, Han Jiang, Boxin Shi. PanoWan: Lifting Diffusion Video Generation Models to 360° with Latitude/Longitude-aware Mechanisms [EB/OL]. (2025-05-28) [2025-06-14]. https://arxiv.org/abs/2505.22016.
