Multi-Timescale Hierarchical Reinforcement Learning for Unified Behavior and Control of Autonomous Driving
Reinforcement learning (RL) is increasingly used in autonomous driving (AD) and shows clear advantages. However, most RL-based AD methods overlook policy structure design. An RL policy that outputs only short-timescale vehicle control commands produces fluctuating driving behavior, because the network outputs themselves fluctuate, while one that outputs only long-timescale driving goals cannot achieve jointly optimal driving behavior and control. We therefore propose a multi-timescale hierarchical reinforcement learning approach. Our approach adopts a hierarchical policy structure in which high- and low-level RL policies are trained jointly: the high level produces long-timescale motion guidance and the low level produces short-timescale control commands. The motion guidance is explicitly represented by hybrid actions, which capture multimodal driving behaviors on structured roads and support incremental updates of the low-level extended state. Additionally, a hierarchical safety mechanism is designed to ensure safety at both timescales. Evaluation in simulator-based and HighD dataset-based highway multi-lane scenarios demonstrates that our approach significantly improves AD performance, effectively increasing driving efficiency, action consistency, and safety.
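To make the described policy structure concrete, the following is a minimal Python sketch of the multi-timescale loop: a high-level policy refreshes hybrid motion guidance (a discrete behavior plus continuous goal parameters) every few control steps, a low-level policy emits a control command at every step, and simple checks at each level stand in for the hierarchical safety mechanism. This is a sketch under assumptions, not the paper's implementation; the class names, the guidance period, the toy control laws, the safety rules, and the Gym-style env (whose observations are assumed to expose "gap" and "target_lane_occupied" fields) are all hypothetical.

import numpy as np

HIGH_LEVEL_PERIOD = 10  # hypothetical: one guidance update per 10 control steps


class HighLevelPolicy:
    """Long-timescale policy: hybrid action = (discrete behavior, continuous goal)."""

    def act(self, obs):
        behavior = np.random.choice(["keep_lane", "change_left", "change_right"])
        # Continuous goal parameters, e.g. (longitudinal distance, lateral offset).
        goal = np.random.uniform([10.0, -2.0], [30.0, 2.0])
        return behavior, goal


class LowLevelPolicy:
    """Short-timescale policy: maps (observation, guidance) to a control command."""

    def act(self, obs, behavior, goal):
        steer = np.clip(0.05 * goal[1], -0.5, 0.5)            # toy proportional steering
        accel = np.clip(0.1 * (goal[0] - obs["gap"]), -3.0, 2.0)  # toy gap regulation
        return np.array([steer, accel])


def guidance_is_safe(behavior, goal, obs):
    """High-level safety check: veto a lane change when the target lane is occupied."""
    return not (behavior != "keep_lane" and obs["target_lane_occupied"])


def control_is_safe(control, obs):
    """Low-level safety check: forbid acceleration when the front gap is small."""
    return obs["gap"] > 5.0 or control[1] <= 0.0


def run_episode(env, high, low, steps=200):
    obs = env.reset()
    behavior, goal = high.act(obs)
    for t in range(steps):
        if t % HIGH_LEVEL_PERIOD == 0:          # long timescale: refresh guidance
            candidate = high.act(obs)
            if guidance_is_safe(*candidate, obs):
                behavior, goal = candidate       # otherwise keep the previous guidance
        control = low.act(obs, behavior, goal)   # short timescale: every control step
        if not control_is_safe(control, obs):
            control = np.array([control[0], -1.0])  # fall back to mild braking
        obs, reward, done, _ = env.step(control)
        if done:
            break

In the paper's setting the two policies are trained jointly with RL rather than hand-coded as above; the sketch only illustrates how the two timescales and the two safety layers interleave at execution time.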
Guizhe Jin, Zhuoren Li, Bo Leng, Ran Yu, Lu Xiong
Subject areas: automation technology in highway transportation engineering; automation technology and equipment
Guizhe Jin, Zhuoren Li, Bo Leng, Ran Yu, Lu Xiong. Multi-Timescale Hierarchical Reinforcement Learning for Unified Behavior and Control of Autonomous Driving [EB/OL]. (2025-06-30) [2025-07-16]. https://arxiv.org/abs/2506.23771.