Chinese Core Journal
CSCD Source Journal
Chinese Science and Technology Core Journal
RCCSE Chinese Core Academic Journal

Journal of Chongqing Jiaotong University (Natural Science), 2024, Vol. 43, Issue (5): 61-69. DOI: 10.3969/j.issn.1674-0696.2024.05.09

• Transportation+Big Data & Artificial Intelligence •

Traffic Flow Anomaly Data Detection Model Based on Improved Isolation Forest Algorithm

GONG Xiaoxing, DONG Peixin   

  1. (College of Traffic and Transportation Engineering, Dalian Maritime University, Dalian 116026, Liaoning, China)
  • Received: 2023-07-04  Revised: 2023-12-23  Published: 2024-05-20

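  • About the authors: GONG Xiaoxing (b. 1983), female, from Dalian, Liaoning; associate professor, Ph.D.; research area: transportation planning and management. E-mail: gong_xiaoxing@dlmu.edu.cn. Corresponding author: DONG Peixin (b. 1999), male, from Binzhou, Shandong; master's student; research area: transportation planning and management. E-mail: dpx6669@dlmu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (71974023)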

Abstract: To address real-time detection of anomalous traffic flow data, a detection model combining an improved isolation forest algorithm with the K-Means++ algorithm is proposed. First, traffic flow sequences are constructed from traffic volume and traffic speed data. Then, an anomaly scoring model for the traffic flow data is built with the improved isolation forest algorithm, and a sliding window is constructed with the K-Means++ algorithm to compute the threshold of the anomaly score, enabling real-time detection of anomalous values in traffic flow data. Finally, the rationality and feasibility of the proposed model are verified through a case study. The results show that combining the improved isolation forest with K-Means++ can accurately determine the anomaly score threshold and thereby detect anomalous data. The AUC of the proposed model is 29.7% and 5.3% higher than that of a model considering only traffic volume and that of the traditional isolation forest model, respectively. The proposed model also achieves a higher AUC than the commonly used LOF, ABOD, and OCSVM methods. The proposed model shows a marked improvement in accuracy and better applicability to traffic flow anomaly detection; it can support traffic management departments in monitoring traffic conditions and improve traffic management efficiency.

Key words: traffic engineering; anomaly detection model; improved isolation forest algorithm; traffic flow data; K-Means++ algorithm
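
The pipeline outlined in the abstract (score each observation with an isolation forest, then derive a threshold from a sliding window of scores with K-Means++) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not say how the isolation forest is improved, so the standard isolation forest scoring is used instead. The names `anomaly_scores`, `kmeanspp_threshold`, and `detect`, the `window` size, and the choice of the midpoint between two cluster centres as the threshold are all assumptions made for illustration.

```python
import math
import random

def c(n):
    """Average path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    """Isolation tree: split on a random feature at a random cut value."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # simplification: stop if the chosen feature is constant
        return ("leaf", len(points))
    cut = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    return ("node", dim, cut,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))

def path_length(tree, x, depth=0):
    """Depth at which x reaches a leaf, plus c(leaf size) for unbuilt levels."""
    if tree[0] == "leaf":
        return depth + c(tree[1])
    _, dim, cut, left, right = tree
    return path_length(left if x[dim] < cut else right, x, depth + 1)

def anomaly_scores(points, n_trees=100, sample_size=256, seed=0):
    """Standard isolation forest score in (0, 1); higher is more anomalous.

    Assumes at least two points. NOT the paper's improved variant.
    """
    rng = random.Random(seed)
    psi = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(points, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    return [2.0 ** (-sum(path_length(t, x) for t in trees) / (n_trees * c(psi)))
            for x in points]

def kmeanspp_threshold(scores, rng, n_iter=20):
    """Two-cluster 1-D k-means with k-means++ seeding.

    Returns the midpoint between the two cluster centres (an assumed rule).
    """
    first = rng.choice(scores)
    d2 = [(s - first) ** 2 for s in scores]
    if sum(d2) == 0:  # all scores identical: nothing to separate
        return first
    # k-means++: pick the second seed with probability proportional to d^2
    second = rng.choices(scores, weights=d2, k=1)[0]
    lo, hi = sorted((first, second))
    for _ in range(n_iter):
        low = [s for s in scores if abs(s - lo) <= abs(s - hi)]
        high = [s for s in scores if abs(s - lo) > abs(s - hi)]
        if not low or not high:
            break
        lo, hi = sum(low) / len(low), sum(high) / len(high)
    return (lo + hi) / 2.0

def detect(points, window=50, seed=0):
    """Flag each (volume, speed) point whose score exceeds the window threshold."""
    rng = random.Random(seed)
    scores = anomaly_scores(points, seed=seed)
    flags = []
    for i, s in enumerate(scores):
        recent = scores[max(0, i - window + 1): i + 1]
        flags.append(s > kmeanspp_threshold(recent, rng))
    return scores, flags
```

Calling `detect(points)` on a list of `(volume, speed)` tuples returns the per-point scores and boolean flags. Two caveats apply to this sketch: it scores the whole series in one batch rather than incrementally, and a plain two-cluster split always labels part of each window as high-scoring, so a window containing no real anomalies would still produce flags; a practical detector would need an additional rule for that case.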
