Chinese Core Journal
CSCD Source Journal
Chinese Science and Technology Core Journal
RCCSE Chinese Core Academic Journal

Journal of Chongqing Jiaotong University (Natural Science), 2025, Vol. 44, Issue (10): 43-50. DOI: 10.3969/j.issn.1674-0696.2025.10.06

• Transportation + Big Data and Artificial Intelligence •

Semantic Segmentation for Roads under Extended Pyramid and Axial Cross Attention

WU Kaijun1, ZHANG Zhirui1, WU Xiaoqiang2

  1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, Gansu, China; 2. School of Mechanical and Traffic Engineering, Ordos Institute of Technology, Ordos 017000, Inner Mongolia, China
  • Received: 2024-09-19 Revised: 2025-09-10 Published: 2025-11-06
  • About the author: WU Kaijun (born 1978), male, from Junan, Shandong, professor; his research focuses on deep learning and computer vision. E-mail: wkj@mail.lzjtu.cn
  • Funding:
    Natural Science Foundation of Gansu Province (23JRRA913); Key R&D and Achievement Transformation Program of Inner Mongolia Autonomous Region (2023YFSH0043, 2023YFDZ0043)

Abstract: Existing semantic segmentation networks for urban roads suffer from blurred segmentation boundaries and insufficient segmentation accuracy on target images. To address these problems, a road semantic segmentation network based on an extended pyramid and axial cross attention is proposed. First, the backbone adopts an improved re-parameterized convolutional neural network (RepVGG+), which computes faster, uses less memory, and is more flexible. Second, to strengthen the model's ability to represent global information, an extended feature pyramid network (EFPN) is proposed, and a multi-branch extended convolution acceleration module (ECAM) is designed to improve segmentation quality. In addition, to increase the attention the network pays to segmentation boundaries, a multi-scale axial cross attention (MSACA) module is designed, and a spatial channel block (SCBlock) replaces ordinary convolution to remove redundant spatial information. The results show that the proposed model alleviates blurred segmentation boundaries and improves segmentation accuracy on target images. On the Cityscapes dataset it reaches an mIoU of 81.3%, which is 4.8% higher than the baseline model (DeepLabv3plus), a good level among current semantic segmentation methods on this dataset.
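
The mIoU figure quoted in the abstract is the mean, over classes, of the intersection-over-union between predicted and ground-truth label maps. The following is a minimal sketch of how such a score is typically computed. The function names `confusion_matrix` and `mean_iou`, the toy labels, and the `ignore_index` value of 255 are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_index:  # assumed convention: skip unlabeled pixels
            continue
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average IoU = TP / (TP + FP + FN) over classes that occur at all."""
    ious = []
    n = len(cm)
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:  # classes absent from both maps are left out of the mean
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy flattened label maps with two classes (0 = road, 1 = other; hypothetical).
truth = [0, 0, 0, 1, 1, 255]
pred = [0, 0, 1, 1, 1, 0]
cm = confusion_matrix(truth, pred, num_classes=2)
print(cm)
print(mean_iou(cm))
```

Accumulating one confusion matrix over the whole evaluation set and computing IoU once at the end, rather than averaging per-image scores, keeps rare classes from being dominated by images in which they do not appear.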

Key words: traffic and transportation engineering; deep learning; semantic segmentation; feature pyramid; attention mechanism
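
The backbone named in the abstract belongs to the RepVGG family, whose central idea is structural re-parameterization: parallel branches used during training are merged into a single convolution for inference. The sketch below illustrates only the kernel-merging step, for a single channel and without batch normalization. It is a simplified illustration of the general technique, not the authors' RepVGG+ implementation, and the helper names `conv3x3` and `merge_branches` and all toy values are hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def conv3x3(image: Matrix, kernel: Matrix) -> Matrix:
    """Single-channel 3x3 cross-correlation with zero padding of 1, stride 1."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0
            for di in range(3):
                for dj in range(3):
                    y, x = i + di - 1, j + dj - 1
                    if 0 <= y < h and 0 <= x < w:  # outside the image counts as zero
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def merge_branches(k3: Matrix, k1: int) -> Matrix:
    """Fold a 1x1 branch and an identity branch into the 3x3 kernel.

    A 1x1 kernel equals a 3x3 kernel that is zero except at its center,
    and the identity map equals a 3x3 kernel with 1 at its center.
    """
    merged = [row[:] for row in k3]
    merged[1][1] += k1 + 1
    return merged


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
k3 = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]  # toy 3x3 branch weights
k1 = 3                                     # toy 1x1 branch weight

# Training-time form: 3x3 branch + 1x1 branch + identity branch, summed.
y3 = conv3x3(image, k3)
multi_branch = [[y3[i][j] + k1 * image[i][j] + image[i][j]
                 for j in range(3)] for i in range(3)]

# Inference-time form: one 3x3 convolution with the merged kernel.
single_branch = conv3x3(image, merge_branches(k3, k1))
print(multi_branch == single_branch)
```

The two forms agree because convolution is linear in its kernel. In RepVGG as published, each branch's batch-normalization statistics are first folded into that branch's kernel and bias, and the resulting kernels and biases are then summed in the same way.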

CLC number: