Chinese Core Journal
CSCD Source Journal
China Science and Technology Core Journal
RCCSE Chinese Core Academic Journal

Journal of Chongqing Jiaotong University (Natural Science), 2022, Vol. 41, Issue 08: 24-29. DOI: 10.3969/j.issn.1674-0696.2022.08.04

• Transportation + Big Data & Artificial Intelligence •

Adaptive Traffic Signal Control Based on Deep Reinforcement Learning

XU Jianmin1, ZHOU Xiangpeng1, SHOU Yanfang2   

  1. School of Civil and Transportation Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China; 2. Guangzhou Institute of Modern Industrial Technology, South China University of Technology, Guangzhou 510640, Guangdong, China
  • Received: 2021-01-14 Revised: 2021-09-03 Published: 2022-08-19

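  • About the author: XU Jianmin (born 1960), male, from Zhaoyuan, Shandong; professor, Ph.D.; main research area: intelligent traffic control. E-mail: aujmxu@scut.edu.cn
  • Funding:
    General Program of the National Natural Science Foundation of China (61873098); Natural Science Foundation of Guangdong Province (2018A030313250); Science and Technology Planning Project of Guangdong Province (2016A030305001)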

Abstract: To improve the robustness and adaptability of traffic control algorithms and ease urban traffic congestion, an adaptive traffic signal control method based on an improved dueling double deep Q-network (D3QN) is proposed. First, several reinforcement-learning-based adaptive traffic control modes are analyzed. Next, a variable step-size action mode is derived from the fixed step-size action mode, and a reward function based on space occupancy is constructed. Finally, an intersection in the Dongqu (East District) Subdistrict of Zhongshan is simulated in SUMO under steady-flow and stochastic-flow scenarios. The simulation results show that the proposed method converges well and effectively reduces delay time and queue length.

Key words: traffic engineering; traffic simulation; adaptive control; traffic flow; deep reinforcement learning
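
The abstract names D3QN but gives no implementation details. As a rough illustration only, the two standard ingredients of a dueling double deep Q-network can be sketched as follows: the dueling aggregation of a state value and per-action advantages, and the double-DQN target in which the online network selects the next action and the target network evaluates it. The function names `dueling_q` and `double_dqn_target`, the array shapes, and the default `gamma` are assumptions made for this sketch; it does not reproduce the authors' improved variant, their variable step-size action mode, or their space-occupancy reward.

```python
import numpy as np


def dueling_q(value, advantage):
    """Combine V(s) and A(s, a) into Q(s, a) using the mean-advantage baseline.

    value:     shape (batch, 1)
    advantage: shape (batch, n_actions)
    """
    value = np.asarray(value, dtype=float)
    advantage = np.asarray(advantage, dtype=float)
    # Subtracting the mean advantage keeps V and A identifiable.
    return value + advantage - advantage.mean(axis=1, keepdims=True)


def double_dqn_target(reward, done, q_next_online, q_next_target, gamma=0.99):
    """Double-DQN target: the online net picks the action, the target net scores it.

    reward, done:                 shape (batch,)
    q_next_online, q_next_target: shape (batch, n_actions), Q-values for s'
    """
    reward = np.asarray(reward, dtype=float)
    done = np.asarray(done, dtype=float)
    q_next_online = np.asarray(q_next_online, dtype=float)
    q_next_target = np.asarray(q_next_target, dtype=float)

    # Action selection uses the online network's estimates.
    best_action = q_next_online.argmax(axis=1)
    # Action evaluation uses the target network's estimates.
    next_value = q_next_target[np.arange(len(best_action)), best_action]
    # Terminal transitions (done == 1) do not bootstrap from s'.
    return reward + gamma * (1.0 - done) * next_value
```

In a signal-control setting, each action would correspond to a phase decision (for example, keeping or changing the green time), and the reward passed to `double_dqn_target` would come from whatever reward function the controller defines.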

