AlignCVC: Aligning Cross-View Consistency for Single-Image-to-3D Generation

Abstract

Single-image-to-3D models typically follow a sequential generation and reconstruction workflow. However, intermediate multi-view images synthesized by pre-trained generation models often lack cross-view consistency (CVC), significantly degrading 3D reconstruction performance. While recent methods attempt to refine CVC by feeding reconstruction results back into the multi-view generator, these approaches struggle with noisy and unstable reconstruction outputs that limit effective CVC improvement. We introduce AlignCVC, a novel framework that fundamentally re-frames single-image-to-3D generation through distribution alignment rather than relying on strict regression losses. Our key insight is to align both generated and reconstructed multi-view distributions toward the ground-truth multi-view distribution, establishing a principled foundation for improved CVC. Observing that generated images exhibit weak CVC while reconstructed images display strong CVC due to explicit rendering, we propose a soft-hard alignment strategy with distinct objectives for generation and reconstruction models. This approach not only enhances generation quality but also dramatically accelerates inference to as few as 4 steps. As a plug-and-play paradigm, our method, namely AlignCVC, seamlessly integrates various multi-view generation models with 3D reconstruction models. Extensive experiments demonstrate the effectiveness and efficiency of AlignCVC for single-image-to-3D generation.

Authors: Xinyue Liang, Zhiyuan Ma, Lingchen Sun, Yanjun Guo, Lei Zhang

Subject classification: Computing technology, computer technology

Recommended citation: Xinyue Liang, Zhiyuan Ma, Lingchen Sun, Yanjun Guo, Lei Zhang. AlignCVC: Aligning Cross-View Consistency for Single-Image-to-3D Generation[EB/OL]. (2025-06-29)[2025-07-21]. https://arxiv.org/abs/2506.23150.
