Multimodal Generation of Animatable 3D Human Models with AvatarForge
We introduce AvatarForge, a framework for generating animatable 3D human avatars from text or image inputs using AI-driven procedural generation. While diffusion-based methods have made strides in general 3D object generation, they struggle to produce high-quality, customizable human avatars because of the complexity and diversity of human body shapes and poses, a difficulty exacerbated by the scarcity of high-quality data. Animating these avatars also remains a significant challenge for existing methods. AvatarForge overcomes these limitations by combining LLM-based commonsense reasoning with off-the-shelf 3D human generators, enabling fine-grained control over body and facial details. Unlike diffusion models, which often rely on pre-trained datasets that lack precise control over individual human features, AvatarForge takes a more flexible approach that brings humans into the iterative design and modeling loop; its auto-verification system continuously refines the generated avatars, yielding high accuracy and customization. Our evaluations show that AvatarForge outperforms state-of-the-art methods in both text- and image-to-avatar generation, making it a versatile tool for artistic creation and animation.
Chi-Keung Tang, Xinhang Liu, Yu-Wing Tai
Computing Technology; Computer Technology
Chi-Keung Tang, Xinhang Liu, Yu-Wing Tai. Multimodal Generation of Animatable 3D Human Models with AvatarForge [EB/OL]. (2025-03-11) [2025-05-26]. https://arxiv.org/abs/2503.08165