HCRMP: A LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving

Source: arXiv
Abstract

Integrating Large Language Models (LLMs) with Reinforcement Learning (RL) can enhance autonomous driving (AD) performance in complex scenarios. However, current LLM-dominated RL methods over-rely on LLM outputs, which are prone to hallucinations. Evaluations show that a state-of-the-art LLM achieves a non-hallucination rate of only approximately 57.95% on essential driving-related tasks. In these methods, LLM hallucinations can therefore directly jeopardize the performance of driving policies. This paper argues that maintaining relative independence between the LLM and the RL agent is vital for solving the hallucination problem, and accordingly proposes a novel LLM-Hinted RL paradigm. The LLM generates semantic hints for state augmentation and policy optimization to assist the RL agent in motion planning, while the RL agent counteracts potentially erroneous semantic indications through policy learning to achieve excellent driving performance. Based on this paradigm, we propose the HCRMP (LLM-Hinted Contextual Reinforcement Learning Motion Planner) architecture, which comprises three modules. The Augmented Semantic Representation Module extends the state space. The Contextual Stability Anchor Module enhances the reliability of multi-critic weight hints by utilizing information from a knowledge base. The Semantic Cache Module seamlessly integrates low-frequency LLM guidance with high-frequency RL control. Extensive experiments in CARLA validate HCRMP's strong overall driving performance. HCRMP achieves a task success rate of up to 80.3% under diverse driving conditions with different traffic densities. Under safety-critical driving conditions, HCRMP significantly reduces the collision rate by 11.4%, effectively improving driving performance in complex scenarios.
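
The abstract describes two ideas that a short sketch may make more concrete: caching low-frequency LLM hints so that a high-frequency RL control loop can reuse them, and using those hints as weights over several critics. The Python below is only an illustrative sketch under stated assumptions and is not the authors' implementation. The names `SemanticCache`, `query_llm`, `refresh_every`, and `weighted_value`, the fixed-step refresh rule, and the normalized weighted sum are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class SemanticCache:
    """Reuses the latest LLM hint between infrequent LLM queries (hypothetical design)."""

    query_llm: Callable[[dict], Dict[str, float]]  # assumed slow call returning critic weights
    refresh_every: int = 10                        # assumed number of control steps per LLM query
    _hint: Optional[Dict[str, float]] = None
    _age: int = 0

    def get(self, observation: dict) -> Dict[str, float]:
        # Query the LLM only when no hint is cached or the cached hint is stale.
        if self._hint is None or self._age >= self.refresh_every:
            self._hint = self.query_llm(observation)
            self._age = 0
        self._age += 1
        return self._hint


def weighted_value(critic_values: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-critic values with hinted weights, normalized to sum to 1."""
    total = sum(weights.get(name, 0.0) for name in critic_values)
    if total <= 0.0:
        # Fall back to a plain average if the hint is missing or unusable.
        return sum(critic_values.values()) / len(critic_values)
    return sum(weights.get(name, 0.0) / total * value
               for name, value in critic_values.items())


if __name__ == "__main__":
    calls = []

    def fake_llm(observation: dict) -> Dict[str, float]:
        # Stand-in for a real LLM; records each call and returns fixed weights.
        calls.append(observation)
        return {"safety": 3.0, "efficiency": 1.0}

    cache = SemanticCache(query_llm=fake_llm, refresh_every=10)
    for step in range(25):
        hint = cache.get({"step": step})
        value = weighted_value({"safety": 1.0, "efficiency": 3.0}, hint)
    print(len(calls), value)
```

In this sketch the control loop runs every step while the stand-in LLM is consulted only once per `refresh_every` steps. The fallback to a plain average is one simple way to keep the RL side usable when a hint is unusable, in the spirit of keeping the RL agent relatively independent of the LLM.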

Zhiwen Chen, Bo Leng, Zhuoren Li, Hanming Deng, Guizhe Jin, Ran Yu, Huanxi Wen

Subjects: Automation technology and equipment; Computing and computer technology

Zhiwen Chen, Bo Leng, Zhuoren Li, Hanming Deng, Guizhe Jin, Ran Yu, Huanxi Wen. HCRMP: A LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving[EB/OL]. (2025-05-21)[2025-06-06]. https://arxiv.org/abs/2505.15793.
