Hybrid Vision Transformer-Mamba Framework for Autism Diagnosis via Eye-Tracking Analysis

Source: arXiv
Abstract

Accurate Autism Spectrum Disorder (ASD) diagnosis is vital for early intervention. This study presents a hybrid deep learning framework combining Vision Transformers (ViT) and Vision Mamba to detect ASD using eye-tracking data. The model uses attention-based fusion to integrate visual, speech, and facial cues, capturing both spatial and temporal dynamics. Unlike traditional handcrafted methods, it applies state-of-the-art deep learning and explainable AI techniques to enhance diagnostic accuracy and transparency. Tested on the Saliency4ASD dataset, the proposed ViT-Mamba model outperformed existing methods, achieving 0.96 accuracy, 0.95 F1-score, 0.97 sensitivity, and 0.94 specificity. These findings show the model's promise for scalable, interpretable ASD screening, especially in resource-constrained or remote clinical settings where access to expert diagnosis is limited.

Wafaa Kasri, Yassine Himeur, Abigail Copiaco, Wathiq Mansoor, Ammar Albanna, Valsamma Eapen

Medical Research Methods; Clinical Medicine

Wafaa Kasri, Yassine Himeur, Abigail Copiaco, Wathiq Mansoor, Ammar Albanna, Valsamma Eapen. Hybrid Vision Transformer-Mamba Framework for Autism Diagnosis via Eye-Tracking Analysis [EB/OL]. (2025-06-07) [2025-06-25]. https://arxiv.org/abs/2506.06886.