Text-Guided Multi-Instance Learning for Scoliosis Screening via Gait Video Analysis

Source: arXiv
Abstract

Early-stage scoliosis is often difficult to detect, particularly in adolescents, where delayed diagnosis can lead to serious health issues. Traditional X-ray-based methods carry radiation risks and rely heavily on clinical expertise, limiting their use in large-scale screening. To overcome these challenges, we propose a Text-Guided Multi-Instance Learning Network (TG-MILNet) for non-invasive scoliosis detection using gait videos. To handle temporal misalignment in gait sequences, we employ Dynamic Time Warping (DTW) clustering to segment videos into key gait phases. To focus on the most relevant diagnostic features, we introduce an Inter-Bag Temporal Attention (IBTA) mechanism that highlights critical gait phases. Recognizing the difficulty in identifying borderline cases, we design a Boundary-Aware Model (BAM) to improve sensitivity to subtle spinal deviations. Additionally, we incorporate textual guidance from domain experts and large language models (LLMs) to enhance feature representation and improve model interpretability. Experiments on the large-scale Scoliosis1K gait dataset show that TG-MILNet achieves state-of-the-art performance, particularly excelling in handling class imbalance and accurately detecting challenging borderline cases. The code is available at https://github.com/lhqqq/TG-MILNet

Haiqing Li, Yuzhi Guo, Feng Jiang, Thao M. Dang, Hehuan Ma, Qifeng Zhou, Jean Gao, Junzhou Huang

Subjects: Medical Research Methods; Current State and Development of Medicine

Haiqing Li, Yuzhi Guo, Feng Jiang, Thao M. Dang, Hehuan Ma, Qifeng Zhou, Jean Gao, Junzhou Huang. Text-Guided Multi-Instance Learning for Scoliosis Screening via Gait Video Analysis [EB/OL]. (2025-07-01) [2025-07-18]. https://arxiv.org/abs/2507.02996.
