Chinese Core Journal
CSCD Source Journal
China Science and Technology Core Journal
RCCSE Chinese Core Academic Journal

Journal of Chongqing Jiaotong University (Natural Science), 2026, Vol. 45, Issue (5): 97-104. DOI: 10.3969/j.issn.1674-0696.2026.05.11

• Transportation + Artificial Intelligence •

Three-Stage Pedestrian Trajectory Prediction Model Integrating ITransformer

GAO Shangbing1,2, WANG Hao1,2, FENG Yixin1,2

  1. College of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian 223003, Jiangsu, China; 2. Key Laboratory of Trusted Firmware and Intelligent Software of Jiangsu Province, Huaian 223003, Jiangsu, China
  • Received: 2025-08-26 Revised: 2025-12-11 Published: 2026-06-08
  • About the authors: GAO Shangbing (born 1981), male, from Huaian, Jiangsu; professor, PhD; research interests: computer vision and intelligent transportation. E-mail: luxiaofen_2002@126.com. Corresponding author: WANG Hao (born 2001), male, from Zhumadian, Henan; master's degree; research interest: intelligent transportation. E-mail: 3028069749@qq.com
  • Funding: Key Program of the Joint Funds of the National Natural Science Foundation of China (U24A20330); Postgraduate Practice Innovation Program of Jiangsu Province (SJCX25_2189, SJCX25_2191)

Abstract: In complex and dynamic traffic environments, accurate pedestrian trajectory prediction is crucial for autonomous driving, service robots, and smart cities. However, trajectory distributions are highly uncertain, and short-term and long-term predictions differ in their spatial distributions, which makes modeling difficult. To address this, a three-stage pedestrian trajectory prediction framework that incorporates the idea of ITransformer, named ITER (inverse trajectory transformer), is proposed. A staged strategy captures motion patterns at different time scales hierarchically. First, short-term prediction is used to learn local dynamic patterns. Second, the latent distribution of trajectory endpoints is modeled to capture the global trend of long-term trajectories. Finally, information from the first two stages is fused to model the complete future trajectory in detail. Experimental results show that, across the five scenes of the ETH/UCY dataset, ITER reduces the average displacement error (ADE) by 13.04% and the final displacement error (FDE) by 20.51% compared with the Agentformer model. On the SDD dataset, the proposed method reduces ADE by 8.28% and FDE by 8.02% compared with the Y-Net model. These results indicate that the proposed model achieves a significant performance improvement and generates accurate, stable future trajectories, meeting the need for precise pedestrian trajectory prediction.

Key words: traffic and transportation engineering; pedestrian trajectory prediction; multimodal prediction; attention mechanism; multi-stage learning
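The abstract reports its results as relative reductions in average displacement error and final displacement error. As an illustration only, the following minimal Python sketch shows how these two metrics are commonly computed for a single predicted trajectory, and how a percentage reduction against a baseline can be derived. The function names and the toy coordinates are hypothetical and are not taken from the paper, and the paper's exact evaluation protocol (for example, how multiple sampled trajectories are scored) is not stated on this page.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory.

    pred and truth are equal-length lists of (x, y) points.
    ADE is the mean Euclidean distance over all time steps;
    FDE is the Euclidean distance at the last time step.
    """
    if not pred or len(pred) != len(truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    return sum(dists) / len(dists), dists[-1]


def relative_reduction(baseline, proposed):
    """Percentage by which the proposed error is lower than the baseline error."""
    return (baseline - proposed) / baseline * 100.0


# Hypothetical toy trajectories (not data from the paper).
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]

ade, fde = displacement_errors(pred, truth)
print(f"ADE = {ade:.2f}, FDE = {fde:.2f}")
```

Under this reading, a statement such as "reduces ADE by 13.04%" would correspond to `relative_reduction(baseline_ade, proposed_ade)` evaluated on the averaged errors of the two models.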
