State Prediction Based Deep Reinforcement Learning for Traffic Signal Control
Deep reinforcement learning (DRL) is widely applicable to urban traffic signal control. However, in existing research most DRL agents use only the current traffic state to make decisions, which limits control performance when traffic flow changes sharply. To address this problem, this paper proposes a DRL signal control algorithm that incorporates state prediction. The algorithm first uses one-hot encoding to design a concise and efficient traffic state representation, then uses a Long Short-Term Memory (LSTM) network to predict the future traffic state. The agent then makes its decision based on both the current state and the predicted state. Experiments on the SUMO (Simulation of Urban Mobility) simulation platform show that, at both a single intersection and multiple intersections under several traffic flow conditions, the proposed algorithm outperforms three typical signal control algorithms in average waiting time, travel time, fuel consumption, CO2 emissions, and cumulative reward.
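The pipeline described in the abstract (one-hot state, predicted future state, decision on both) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the lane count, the binning of queue lengths into levels, the number of phases, and the function names are all hypothetical. A linear extrapolation stands in for the LSTM predictor, and a linear Q-function stands in for the deep Q-network, so that the example runs with NumPy alone.

```python
import numpy as np

N_LANES = 4      # assumed: incoming lanes at one intersection
N_LEVELS = 5     # assumed: discrete congestion levels per lane
N_PHASES = 4     # assumed: number of signal phases (actions)
MAX_QUEUE = 20   # assumed: queue length that maps to the top level


def one_hot_state(queues):
    """Bin each lane's queue length into a level and one-hot encode it."""
    queues = np.asarray(queues, dtype=float)
    levels = np.minimum((queues / MAX_QUEUE * N_LEVELS).astype(int), N_LEVELS - 1)
    state = np.zeros((N_LANES, N_LEVELS))
    state[np.arange(N_LANES), levels] = 1.0
    return state.ravel()


def predict_next_queues(history):
    """Stand-in for the LSTM: extrapolate linearly from the last two steps."""
    history = np.asarray(history, dtype=float)
    trend = history[-1] - history[-2]
    return np.clip(history[-1] + trend, 0, None)


def select_phase(weights, current_queues, history):
    """Greedy action from a linear Q-function over [current, predicted] state."""
    features = np.concatenate([
        one_hot_state(current_queues),
        one_hot_state(predict_next_queues(history)),
    ])
    q_values = weights @ features  # one Q-value per phase
    return int(np.argmax(q_values)), features


# Usage: random weights play the role of a trained Q-network.
rng = np.random.default_rng(0)
weights = rng.normal(size=(N_PHASES, 2 * N_LANES * N_LEVELS))
history = [[2, 2, 2, 2], [4, 1, 2, 0]]
phase, features = select_phase(weights, history[-1], history)
```

The point of the design is that the Q-function input is twice the size of a single state vector: the agent sees where traffic is now and where the predictor expects it to be, which is what lets it react before a change in flow has fully arrived.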
Li Tao (李涛), Tang Muyao (唐慕尧), Zhou Dake (周大可)
Highway transportation engineering; automation technology and automation equipment; computing technology and computer technology
traffic signal control; state prediction; deep reinforcement learning; deep Q-network; long short-term memory network
Li Tao, Tang Muyao, Zhou Dake. State Prediction Based Deep Reinforcement Learning for Traffic Signal Control [EB/OL]. (2022-04-07) [2025-08-16]. https://chinaxiv.org/abs/202204.00039.