LiDAR-based Object Detection with Real-time Voice Specifications
This paper presents a LiDAR-based object detection system with real-time voice specifications. It integrates KITTI's 3D point clouds and RGB images through a multi-modal PointNet framework. The system achieves 87.0% validation accuracy on a 3000-sample subset, compared with 67.5% for a 200-sample baseline. It does so by combining spatial and visual data, addressing class imbalance with a weighted loss, and refining training with adaptive techniques. A Tkinter prototype speaks detection results in a natural Indian male voice using Edge TTS (en-IN-PrabhatNeural) and provides 3D visualizations and real-time feedback, with the aim of improving accessibility and safety in autonomous navigation and assistive technology. The study offers a detailed methodology, an experimental analysis, and a review of applications and challenges, and positions the work as a scalable contribution to human-computer interaction and environmental perception.
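The abstract mentions a weighted loss for class imbalance without giving its form. The following is a minimal sketch of one common choice, inverse-frequency class weights applied to cross-entropy. It is an assumption for illustration, not the paper's confirmed formulation; the function names and the example class labels are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n / (k * count), so rarer classes weigh more.

    Assumption: this weighting scheme is illustrative; the paper does not
    state which scheme it uses.
    """
    counts = Counter(labels)
    n = len(labels)
    k = len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Class-weighted cross-entropy, normalised by the sum of sample weights.

    probs  : list of dicts mapping class name -> predicted probability
    labels : list of true class names, aligned with probs
    weights: dict mapping class name -> class weight
    """
    total = 0.0
    weight_sum = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        total += -w * math.log(p[y])
        weight_sum += w
    return total / weight_sum


# Hypothetical imbalanced label set: 8 samples of one class, 2 of another.
example_labels = ["Car"] * 8 + ["Pedestrian"] * 2
example_weights = inverse_frequency_weights(example_labels)
```

With these weights, a misclassified sample from the rare class contributes four times as much to the loss as one from the frequent class, which is the general effect a weighted loss is meant to have.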
Anurag Kulkarni
Automation technology and automation equipment; computing technology and computer technology
Anurag Kulkarni. LiDAR-based Object Detection with Real-time Voice Specifications[EB/OL]. (2025-04-03)[2025-06-15]. https://arxiv.org/abs/2504.02920.