Chinese Core Journal
CSCD Source Journal
China Science and Technology Core Journal
RCCSE China Core Academic Journal

Journal of Chongqing Jiaotong University (Natural Science) ›› 2026, Vol. 45 ›› Issue (3): 73-80. DOI: 10.3969/j.issn.1674-0696.2026.03.09

• Traffic & Transportation + Artificial Intelligence •

Short-Term Traffic State Prediction Based on Traffic Flow Parameters and Image Latent Features

ZHANG Mengmeng, WANG Haonan, LIN Qinghai, YU Lei   

  1. (School of Traffic and Logistics Engineering, Shandong Jiaotong University, Jinan 250357, Shandong, China)
  • Received:2025-06-06 Revised:2025-12-05 Published:2026-03-24

  • About the authors: ZHANG Mengmeng (1981—), female, from Ningyang, Shandong, professor, Ph.D., mainly engaged in research on intelligent transportation and traffic big data. E-mail: zhangmengmeng@sdjtu.edu.cn. Corresponding author: WANG Haonan (1999—), male, from Yantai, Shandong, master's student, mainly engaged in research on intelligent transportation. E-mail: czx43957@163.com
  • Supported by:
    National Natural Science Foundation of China (52502405); Shandong Provincial Natural Science Foundation Innovation and Development Joint Fund (ZR2024LZN008); Shandong Provincial Natural Science Foundation (ZR2024QG058); Shandong Provincial Department of Transport Project (2023B74); Jinan Philosophy and Social Science Planning Research Project (JNSK2025C080)

Abstract: Accurate traffic state prediction is crucial for alleviating traffic congestion and advancing intelligent traffic management. Existing methods rely primarily on cross-sectional traffic flow parameters obtained from ground sensors, neglecting state differences among multiple lanes and the dynamic changes in the overall traffic state within an area, which limits prediction accuracy in complex traffic environments such as weaving areas. To address this, TraP-VisNet, a short-term traffic state prediction model based on traffic flow parameters and image latent features, was proposed from a UAV perspective. First, by analyzing vehicle trajectories in the video, the spatial mean speed and spatial occupancy were extracted to characterize the traffic state within the area. Simultaneously, an improved variational autoencoder (VAE) with temporal modeling and semantic guidance mechanisms was used to encode image frame sequences and extract latent features reflecting traffic flow evolution. Then, to achieve multimodal feature fusion, attention mechanisms were introduced in both the temporal and feature dimensions, and the data scales were unified through temporal alignment and Z-score normalization. Finally, the fused feature vectors were fed into a multi-layer perceptron (MLP) for regression prediction of the traffic state, and state-level classification was performed with a random forest (RF). Experimental results demonstrate that the proposed model exhibits superior prediction capability and stability across various scenarios. Compared with models based solely on traffic flow parameters, models relying solely on image latent features, and mainstream baselines such as Transformer, the proposed model achieves average improvements of 19.96%, 15.15%, and 17.70% in accuracy, F1 score, and Matthews correlation coefficient (MCC), respectively, demonstrating its robustness and practical application potential in complex traffic environments.
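The fusion stage described in the abstract (Z-score normalization of the two modalities, attention over both the temporal and feature dimensions, then a fused vector for downstream prediction) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, dimensions, and the simple softmax-style attention weights are hypothetical, not the authors' TraP-VisNet implementation.

```python
import numpy as np

def zscore(x, axis=0):
    # Z-score normalization to unify the scales of the two modalities
    mu = x.mean(axis=axis, keepdims=True)
    sigma = x.std(axis=axis, keepdims=True)
    return (x - mu) / (sigma + 1e-8)

def attention_fuse(params, latents):
    # params:  (T, Dp) traffic flow parameters (e.g., spatial mean speed, occupancy)
    # latents: (T, Dl) image latent features from the (assumed) improved VAE encoder
    x = np.concatenate([zscore(params), zscore(latents)], axis=1)  # (T, Dp+Dl)
    # Feature-dimension attention: softmax weights over channels (illustrative)
    w_feat = np.exp(x.mean(axis=0))
    w_feat /= w_feat.sum()
    x = x * w_feat
    # Temporal-dimension attention: softmax weights over time steps (illustrative)
    w_time = np.exp(x.mean(axis=1))
    w_time /= w_time.sum()
    # Weighted sum over time yields one fused feature vector per window
    return (x * w_time[:, None]).sum(axis=0)  # shape (Dp+Dl,)

T, Dp, Dl = 10, 2, 16  # assumed window length and feature sizes
rng = np.random.default_rng(0)
fused = attention_fuse(rng.normal(size=(T, Dp)), rng.normal(size=(T, Dl)))
print(fused.shape)  # (18,)
```

In the paper's pipeline, a vector like `fused` would then go to the MLP for regression and to the RF classifier for state-level classification.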

Key words: traffic engineering; intelligent transportation; short-term traffic state prediction; traffic flow parameters; variational autoencoder; image latent features; UAV video


