MAGIC: Motion-Aware Generative Inference via Confidence-Guided LLM
Recent advances in static 3D generation have intensified the demand for physically consistent dynamic 3D content. However, existing video generation models, including diffusion-based methods, often prioritize visual realism while neglecting physical plausibility, resulting in implausible object dynamics. Prior approaches for physics-aware dynamic generation typically rely on large-scale annotated datasets or extensive model fine-tuning, which imposes significant computational and data collection burdens and limits scalability across scenarios. To address these challenges, we present MAGIC, a training-free framework for single-image physical property inference and dynamic generation, integrating pretrained image-to-video diffusion models with iterative LLM-based reasoning. Our framework generates motion-rich videos from a static image and closes the visual-to-physical gap through a confidence-driven LLM feedback loop that adaptively steers the diffusion model toward physics-relevant motion. To translate visual dynamics into controllable physical behavior, we further introduce a differentiable MPM simulator operating directly on 3D Gaussians reconstructed from the single image, enabling physically grounded, simulation-ready outputs without any supervision or model tuning. Experiments show that MAGIC outperforms existing physics-aware generative methods in inference accuracy and achieves greater temporal coherence than state-of-the-art video diffusion models.
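To make the confidence-driven feedback loop concrete, below is a minimal Python sketch of the iteration the abstract describes: generate a video from the input image, let an LLM infer physical properties with a self-reported confidence, and re-steer the diffusion model when confidence is low. This is an illustrative reading of the abstract, not the authors' implementation; all names (generate_video, llm_infer_properties, refine_prompt, magic_loop) and the confidence threshold are hypothetical placeholders.

```python
# Hypothetical sketch of MAGIC's confidence-driven LLM feedback loop.
# All function names and values below are placeholders, not the paper's API.

from dataclasses import dataclass


@dataclass
class Inference:
    properties: dict   # e.g. {"material": "elastic", "youngs_modulus": 2e5}
    confidence: float  # LLM self-reported confidence in [0, 1]


def generate_video(image, prompt):
    """Stub for a pretrained image-to-video diffusion model."""
    return [image] * 16  # placeholder: a clip of 16 identical frames


def llm_infer_properties(frames) -> Inference:
    """Stub for LLM-based physical-property reasoning over the frames."""
    return Inference({"material": "elastic"}, confidence=0.4)


def refine_prompt(prompt, inference) -> str:
    """Stub: adaptively steer the next generation toward physics-relevant motion."""
    return prompt + ", with pronounced, physically plausible motion"


def magic_loop(image, prompt, threshold=0.8, max_iters=5) -> Inference:
    """Iterate generation and LLM reasoning until confidence is high enough."""
    inference = Inference({}, confidence=0.0)
    for _ in range(max_iters):
        frames = generate_video(image, prompt)
        inference = llm_infer_properties(frames)
        if inference.confidence >= threshold:
            break  # confident estimate: hand properties to the MPM simulator
        prompt = refine_prompt(prompt, inference)  # low confidence: re-steer
    return inference


if __name__ == "__main__":
    result = magic_loop(image="input.png", prompt="a ball bouncing")
    print(result.properties, result.confidence)
```

In the full pipeline, the converged property estimate would parameterize the differentiable MPM simulator over the 3D Gaussians reconstructed from the single input image, yielding simulation-ready dynamics without supervision or fine-tuning.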
Siwei Meng, Yawei Luo, Ping Liu
Subjects: Information Science and Information Technology; Control Theory and Control Technology; Computing Technology and Computer Technology
Siwei Meng, Yawei Luo, Ping Liu. MAGIC: Motion-Aware Generative Inference via Confidence-Guided LLM [EB/OL]. (2025-05-22) [2025-06-21]. https://arxiv.org/abs/2505.16456