SignBart -- New approach with the skeleton sequence for Isolated Sign language Recognition
Sign language recognition (SLR) is crucial for helping individuals with hearing impairments break communication barriers. However, previous approaches have had to choose between efficiency and accuracy: models such as RNNs, LSTMs, and GCNs suffered from vanishing gradients and high computational costs, and transformer-based methods, despite improving performance, were not commonly used. This study presents a novel SLR approach that overcomes the challenge of independently extracting meaningful information from the x and y coordinates of skeleton sequences, which traditional models often treat as inseparable. Using the encoder-decoder of the BART architecture, the model encodes the x and y coordinates independently, while cross-attention ensures that their interrelation is maintained. With only 749,888 parameters, the model achieves 96.04% accuracy on the LSA-64 dataset, significantly outperforming previous models with over one million parameters. The model also demonstrates excellent performance and generalization on the WLASL and ASL-Citizen datasets. Ablation studies underscore the importance of coordinate projection, normalization, and the use of multiple skeleton components for boosting model efficacy. This study offers a reliable and effective approach to sign language recognition, with strong potential for enhancing accessibility tools for the deaf and hard of hearing.
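The idea of handling the x and y coordinates as separate sequences and relating them through cross-attention can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain scaled dot-product attention with no learned projections, and the helper names (`split_xy`, `cross_attention`), the toy keypoints, and the choice of the y stream as queries over the x stream are all assumptions made for illustration.

```python
import math

def split_xy(frames):
    """Split frames of (x, y) keypoints into separate x and y sequences."""
    xs = [[x for x, _ in frame] for frame in frames]
    ys = [[y for _, y in frame] for frame in frames]
    return xs, ys

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query row attends over all key rows."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value rows, one output row per query.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# Hypothetical toy clip: 3 frames, 2 keypoints each, coordinates in [0, 1].
frames = [
    [(0.1, 0.9), (0.2, 0.8)],
    [(0.3, 0.7), (0.4, 0.6)],
    [(0.5, 0.5), (0.6, 0.4)],
]
xs, ys = split_xy(frames)

# Assumption: the y stream queries the x stream (the direction is illustrative).
fused = cross_attention(queries=ys, keys=xs, values=xs)
print(len(fused), len(fused[0]))
```

Each row of `fused` is a weighted average of the x-coordinate frames, with weights determined by the corresponding y-coordinate frame, so the two streams stay separate as inputs while the output still depends on both.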
Tinh Nguyen, Minh Khue Phan Tran
Linguistics; Computing Technology, Computer Technology
Tinh Nguyen, Minh Khue Phan Tran. SignBart -- New approach with the skeleton sequence for Isolated Sign language Recognition [EB/OL]. (2025-06-18) [2025-07-21]. https://arxiv.org/abs/2506.21592.