VRSight: An AI-Driven Scene Description System to Improve Virtual Reality Accessibility for Blind People
Virtual Reality (VR) is inaccessible to blind people. While researchers have investigated many techniques to enhance VR accessibility, these techniques require additional developer effort to integrate. As a result, most mainstream VR apps remain inaccessible as the industry de-prioritizes accessibility. We present VRSight, an end-to-end system that recognizes VR scenes post hoc through a set of AI models (e.g., object detection, depth estimation, LLM-based atmosphere interpretation) and generates tone-based, spatial audio feedback, empowering blind users to interact in VR without developer intervention. To enable virtual element detection, we further contribute DISCOVR, a VR dataset consisting of 30 virtual object classes from 17 social VR apps, substituting for real-world datasets that do not transfer to VR contexts. Nine participants used VRSight to explore an off-the-shelf VR app (Rec Room), demonstrating its effectiveness in facilitating social tasks like avatar awareness and available seat identification.
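The abstract describes a pipeline that fuses 2D object detections with estimated depth to drive tone-based spatial audio. As a rough, hypothetical sketch of how such cues might be derived (the function name, cue encoding, and pan/gain mapping below are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def spatialize_detections(detections, depth_map, fov_deg=90.0):
    """Map 2D detections plus a depth map to simple spatial audio cues.

    Assumptions (not from the paper):
      detections: list of (label, (x1, y1, x2, y2)) boxes in pixel coords.
      depth_map:  HxW array of per-pixel depth estimates (same frame).
    Returns one cue per object: stereo pan from horizontal angle,
    gain attenuated with distance.
    """
    h, w = depth_map.shape
    cues = []
    for label, (x1, y1, x2, y2) in detections:
        # Horizontal angle of the box center relative to view center.
        cx = (x1 + x2) / 2.0
        azimuth = (cx / w - 0.5) * fov_deg
        # Robust distance estimate: median depth inside the bounding box.
        patch = depth_map[int(y1):int(y2), int(x1):int(x2)]
        distance = float(np.median(patch)) if patch.size else float("inf")
        cues.append({
            "label": label,
            "pan": float(np.clip(azimuth / (fov_deg / 2), -1.0, 1.0)),  # -1 left, +1 right
            "gain": 1.0 / max(distance, 0.5),  # louder when closer
        })
    return cues
```

A caller would feed this the outputs of any detector and monocular depth model per rendered frame, then synthesize each cue as a panned tone; the actual system's models, thresholds, and audio rendering are described in the paper itself.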
Daniel Killough, Justin Feng, Zheng Xue "ZX" Ching, Daniel Wang, Rithvik Dyava, Yapeng Tian, Yuhang Zhao
Computing Technology; Computer Technology
Daniel Killough, Justin Feng, Zheng Xue "ZX" Ching, Daniel Wang, Rithvik Dyava, Yapeng Tian, Yuhang Zhao. VRSight: An AI-Driven Scene Description System to Improve Virtual Reality Accessibility for Blind People [EB/OL]. (2025-08-04) [2025-08-16]. https://arxiv.org/abs/2508.02958