Chinese Core Journal
CSCD Source Journal
China Science and Technology Core Journal
RCCSE Chinese Core Academic Journal

Journal of Chongqing Jiaotong University (Natural Science) ›› 2025, Vol. 44 ›› Issue (9): 84-92. DOI: 10.3969/j.issn.1674-0696.2025.09.11

• Transportation+Big Data & Artificial Intelligence •

Mobile ViT Network Road Defect Detection Model Integrating Multi-task Learning

LIU Yunfei¹, LI Shuang¹, MA Jianxiao²

  1. School of Information Science and Technology, Nanjing Forestry University, Nanjing 210037, Jiangsu, China; 2. School of Automobile and Traffic Engineering, Nanjing Forestry University, Nanjing 210037, Jiangsu, China
  • Received: 2024-10-02 Revised: 2025-02-19 Published: 2025-09-29

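  • About the authors: LIU Yunfei (born 1962), male, from Nanjing, Jiangsu, PhD, professor, mainly engaged in research on signal processing and nondestructive testing. E-mail: lyf@njfu.edu.cn. Corresponding author: LI Shuang (born 2001), male, from Anqing, Anhui, master's degree, mainly engaged in research on computer vision. E-mail: lishuang@njfu.edu.cn
  • Supported by:
    2022 Jiangsu Provincial Transportation Science and Technology and Achievement Transformation Project (2022Y10)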

Abstract: With the widespread application of deep learning in computer vision, road defect detection based on deep learning has made significant progress. Existing methods still suffer from insufficient detection accuracy, high missed-detection rates, and difficulty detecting small objects in complex road scenes. To address these issues, a multi-task learning road defect detection model (MTL-RDD) is proposed, which improves detection performance by jointly optimizing object detection and semantic segmentation. The model adopts the lightweight, Transformer-based MobileViT architecture as the backbone network for efficient feature extraction and uses the GELAN structure to fuse multi-scale information, which reduces inference time. Fine-grained supervision from the segmentation task improves the robustness and generalization ability of MTL-RDD, particularly in complex scenes. Experimental results show that, compared with YOLOv8-s, MTL-RDD improves mAP@0.5-0.95 and mAP@0.5 by 2.9% and 3.5%, respectively, and outperforms existing mainstream methods in accuracy, speed, and small-object detection. The proposed model provides a more precise and efficient solution for road defect detection.

Key words: traffic and transportation engineering; defect detection algorithm; multi-task learning; neural networks; GELAN fusion
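
The two metrics quoted in the abstract differ only in the overlap threshold used to accept a detection. The sketch below is an illustration, not the authors' evaluation code: it assumes the common COCO-style convention, in which mAP@0.5 uses a single IoU threshold of 0.5 and mAP@0.5-0.95 averages average precision over the ten thresholds 0.50, 0.55, ..., 0.95. The box coordinates and the helper names `iou`, `coco_thresholds`, and `matched_thresholds` are hypothetical.

```python
from typing import List, Tuple

# Axis-aligned box as (x_min, y_min, x_max, y_max); layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def coco_thresholds() -> List[float]:
    """IoU thresholds 0.50, 0.55, ..., 0.95 (ten values)."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred: Box, truth: Box) -> List[float]:
    """Thresholds at which the prediction would count as a true positive."""
    overlap = iou(pred, truth)
    return [t for t in coco_thresholds() if overlap >= t]


# Hypothetical ground-truth and predicted boxes for one road defect.
truth = (10.0, 10.0, 50.0, 50.0)
pred = (14.0, 10.0, 54.0, 50.0)

print(f"IoU = {iou(pred, truth):.3f}")
print("accepted at thresholds:", matched_thresholds(pred, truth))
```

A prediction shifted by a few pixels is still accepted at the 0.5 threshold but rejected at the strictest ones, which is why mAP@0.5-0.95 is the more demanding of the two figures, especially for small objects.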

