Generative Sign-description Prompts with Multi-positive Contrastive Learning for Sign Language Recognition
Sign language recognition (SLR) faces fundamental challenges in creating accurate annotations due to the inherent complexity of simultaneous manual and non-manual signals. To the best of our knowledge, this is the first work to integrate generative large language models (LLMs) into SLR tasks. We propose a novel Generative Sign-description Prompts Multi-positive Contrastive learning (GSP-MC) method that leverages retrieval-augmented generation (RAG) with domain-specific LLMs, incorporating multi-step prompt engineering and expert-validated sign language corpora to produce precise multipart descriptions. The GSP-MC method also employs a dual-encoder architecture to bidirectionally align hierarchical skeleton features with multiple text descriptions (global, synonym, and part level) through probabilistic matching. Our approach combines global and part-level losses, optimizing KL divergence to ensure robust alignment across all relevant text-skeleton pairs while capturing both sign-level semantics and detailed part dynamics. Experiments demonstrate state-of-the-art performance relative to existing methods on the Chinese SLR500 dataset (97.1% accuracy) and the Turkish AUTSL dataset (97.07% accuracy). The method's cross-lingual effectiveness highlights its potential for developing inclusive communication technologies.
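The multi-positive alignment described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the form of the target distribution, the temperature, or how the two directions are weighted, so the uniform target over positives, the `temperature` value, and the equal averaging of the two directions are all assumptions made for illustration. The function names are hypothetical.

```python
import math


def log_softmax(row, temperature):
    """Numerically stable log-softmax of one row of similarity scores."""
    scaled = [s / temperature for s in row]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_z for s in scaled]


def multi_positive_kl(sim, pos_mask, temperature=0.1):
    """Mean KL(target || softmax(sim / temperature)) over rows.

    sim[i][j]      : similarity between query i and candidate j
    pos_mask[i][j] : 1 if candidate j is a positive for query i, else 0
    The target is ASSUMED uniform over each row's positives.
    Rows without any positive are skipped.
    """
    total, rows = 0.0, 0
    for row, mask in zip(sim, pos_mask):
        n_pos = sum(mask)
        if n_pos == 0:
            continue
        q = 1.0 / n_pos  # uniform target probability on each positive
        log_p = log_softmax(row, temperature)
        total += sum(q * (math.log(q) - lp) for lp, m in zip(log_p, mask) if m)
        rows += 1
    return total / rows if rows else 0.0


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def bidirectional_loss(sim, pos_mask, temperature=0.1):
    """Average of skeleton-to-text and text-to-skeleton KL terms (assumed equal weights)."""
    s2t = multi_positive_kl(sim, pos_mask, temperature)
    t2s = multi_positive_kl(transpose(sim), transpose(pos_mask), temperature)
    return 0.5 * (s2t + t2s)


# Toy batch: 2 skeleton samples, 4 text descriptions (2 positives per sample).
sim = [[0.9, 0.8, 0.1, 0.0],
       [0.0, 0.1, 0.8, 0.9]]
aligned = [[1, 1, 0, 0],
           [0, 0, 1, 1]]
misaligned = [[0, 0, 1, 1],
              [1, 1, 0, 0]]

loss_aligned = bidirectional_loss(sim, aligned)
loss_misaligned = bidirectional_loss(sim, misaligned)
```

In this toy batch the loss is small when the high-similarity texts are the labeled positives and large when they are not, which is the behavior a multi-positive objective is meant to reward. A combined global and part-level objective could, under the same assumptions, sum one such term per description level.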
Siyu Liang, Yunan Li, Wentian Xin, Huizhou Chen, Xujie Liu, Kang Liu, Qiguang Miao
Computing technology, computer technology
Siyu Liang, Yunan Li, Wentian Xin, Huizhou Chen, Xujie Liu, Kang Liu, Qiguang Miao. Generative Sign-description Prompts with Multi-positive Contrastive Learning for Sign Language Recognition [EB/OL]. (2025-05-04) [2025-06-21]. https://arxiv.org/abs/2505.02304.