Motion is the Choreographer: Learning Latent Pose Dynamics for Seamless Sign Language Generation
Sign language video generation requires producing natural signing motions with realistic appearances under precise semantic control, yet faces two critical challenges: excessive signer-specific data requirements and poor generalization. We propose a new paradigm for sign language video generation that decouples motion semantics from signer identity through a two-phase synthesis framework. First, we construct a signer-independent multimodal motion lexicon, where each gloss is stored as identity-agnostic pose, gesture, and 3D mesh sequences, requiring only one recording per sign. This compact representation enables our second key innovation: a discrete-to-continuous motion synthesis stage that transforms retrieved gloss sequences into temporally coherent motion trajectories, followed by identity-aware neural rendering to produce photorealistic videos of arbitrary signers. Unlike prior work constrained by signer-specific datasets, our method treats motion as a first-class citizen: the learned latent pose dynamics serve as a portable "choreography layer" that can be visually realized through different human appearances. Extensive experiments demonstrate that disentangling motion from identity is not just viable but advantageous, enabling both high-quality synthesis and unprecedented flexibility in signer personalization.
Jiayi He, Xu Wang, Shengeng Tang, Yaxiong Wang, Lechao Cheng, Dan Guo
Computing technology, computer technology
Jiayi He, Xu Wang, Shengeng Tang, Yaxiong Wang, Lechao Cheng, Dan Guo. Motion is the Choreographer: Learning Latent Pose Dynamics for Seamless Sign Language Generation [EB/OL]. (2025-08-06) [2025-08-23]. https://arxiv.org/abs/2508.04049.