Deep Learning-Based Multi-Modal Fusion for Robust Robot Perception and Navigation
This paper introduces a novel deep learning-based multimodal fusion architecture aimed at enhancing the perception capabilities of autonomous navigation robots in complex environments. By utilizing innovative feature extraction modules, adaptive fusion strategies, and time-series modeling mechanisms, the system effectively integrates RGB images and LiDAR data. The key contributions of this work are as follows: (1) the design of a lightweight feature extraction network to enhance feature representation; (2) the development of an adaptive weighted cross-modal fusion strategy to improve system robustness; and (3) the incorporation of time-series information modeling to boost dynamic scene perception accuracy. Experimental results on the KITTI dataset demonstrate that the proposed approach increases navigation and positioning accuracy by 3.5% and 2.2%, respectively, while maintaining real-time performance. This work provides a novel solution for autonomous robot navigation in complex environments.
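To give a concrete sense of what an adaptive weighted cross-modal fusion step can look like, the PyTorch sketch below gates RGB and LiDAR feature maps with learned, normalized modality weights. It is a minimal illustration under assumed shapes and names (the `AdaptiveCrossModalFusion` module, channel counts, and the gating design are hypothetical), not the architecture described in the paper.

```python
# Illustrative sketch only: adaptive weighted fusion of RGB and LiDAR features.
import torch
import torch.nn as nn


class AdaptiveCrossModalFusion(nn.Module):
    """Fuses RGB and LiDAR feature maps using learned per-sample modality weights."""

    def __init__(self, channels: int):
        super().__init__()
        # Small gating network: concatenated features -> two modality weights.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                              # global spatial context
            nn.Conv2d(2 * channels, channels // 4, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, 2, kernel_size=1),
            nn.Softmax(dim=1),                                    # weights sum to 1 across modalities
        )

    def forward(self, rgb_feat: torch.Tensor, lidar_feat: torch.Tensor) -> torch.Tensor:
        # rgb_feat, lidar_feat: (B, C, H, W) feature maps from the two branches.
        weights = self.gate(torch.cat([rgb_feat, lidar_feat], dim=1))  # (B, 2, 1, 1)
        w_rgb, w_lidar = weights[:, 0:1], weights[:, 1:2]
        return w_rgb * rgb_feat + w_lidar * lidar_feat


if __name__ == "__main__":
    fusion = AdaptiveCrossModalFusion(channels=64)
    rgb = torch.randn(2, 64, 32, 32)
    lidar = torch.randn(2, 64, 32, 32)
    fused = fusion(rgb, lidar)
    print(fused.shape)  # torch.Size([2, 64, 32, 32])
```

The design choice illustrated here (softmax-normalized weights shared across spatial locations) keeps the fusion lightweight; a per-pixel or per-channel weighting scheme would be a natural variant.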
Delun Lai, Yeyubei Zhang, Yunchong Liu, Chaojie Li, Huadong Mo
Automation technology; automation equipment
Delun Lai, Yeyubei Zhang, Yunchong Liu, Chaojie Li, Huadong Mo. Deep Learning-Based Multi-Modal Fusion for Robust Robot Perception and Navigation [EB/OL]. (2025-04-26) [2025-05-05]. https://arxiv.org/abs/2504.19002