Mobile ViT Network Road Defect Detection Model Integrating Multi-task Learning

doi:10.3969/j.issn.1674-0696.2025.09.11

Journal of Chongqing Jiaotong University (Natural Science), 2025, Vol. 44, Issue 9: 84-92. DOI: 10.3969/j.issn.1674-0696.2025.09.11

• Transportation+Big Data & Artificial Intelligence •

LIU Yunfei¹, LI Shuang¹, MA Jianxiao²

(1. School of Information Science and Technology, Nanjing Forestry University, Nanjing 210037, Jiangsu, China; 2. School of Automobile and Traffic Engineering, Nanjing Forestry University, Nanjing 210037, Jiangsu, China)

Received: 2024-10-02; Revised: 2025-02-19; Published: 2025-09-29

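About the authors: LIU Yunfei (born 1962), male, from Nanjing, Jiangsu; PhD, professor; research interests: signal processing and nondestructive testing. E-mail: lyf@njfu.edu.cn. Corresponding author: LI Shuang (born 2001), male, from Anqing, Anhui; master's degree; research interest: computer vision. E-mail: lishuang@njfu.edu.cn
Supported by:
Jiangsu Provincial Transportation Science and Technology and Achievement Transformation Project, 2022 (2022Y10)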

Abstract

Abstract: Deep learning is now widely applied in computer vision and has brought significant progress to road defect detection. Existing methods, however, still suffer from insufficient detection accuracy, high missed-detection rates, and difficulty detecting small objects in complex road scenes. To address these problems, a multi-task learning road defect detection model (MTL-RDD) is proposed, which improves detection performance by jointly optimizing object detection and semantic segmentation. The model adopts the lightweight, Transformer-based MobileViT architecture as its backbone for efficient feature extraction and uses the GELAN structure to fuse multi-scale information, which reduces inference time. Fine-grained supervision from the segmentation task improves the robustness and generalization ability of MTL-RDD, particularly in complex scenes. Experimental results show that, compared with YOLOv8-s, MTL-RDD improves mAP@0.5-0.95 by 2.9% and mAP@0.5 by 3.5%, and that it outperforms existing mainstream methods in accuracy, speed, and small-object detection. The proposed model offers a more precise and efficient solution for road defect detection.
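The joint optimization described in the abstract can be pictured as a single training objective that combines a detection loss and a segmentation loss computed from one shared backbone. The sketch below is illustrative only: the weighted-sum form, the weights `w_det` and `w_seg`, and the function name `multitask_loss` are assumptions made for this example, not details taken from the paper.

```python
def multitask_loss(det_loss: float, seg_loss: float,
                   w_det: float = 1.0, w_seg: float = 0.5) -> float:
    """Combine two per-task losses into one scalar objective.

    The weighted sum and the default weights are hypothetical; the
    abstract does not state how MTL-RDD balances its two tasks.
    """
    if w_det < 0 or w_seg < 0:
        raise ValueError("loss weights must be non-negative")
    return w_det * det_loss + w_seg * seg_loss


# Example: a detection loss of 2.0 and a segmentation loss of 1.0
total = multitask_loss(2.0, 1.0)
```

With such an objective, gradients from both tasks flow into the shared backbone, which is how segmentation supervision could influence the features used for detection.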

Key words: traffic and transportation engineering; defect detection algorithm; multi-task learning; neural networks; GELAN fusion
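The two metrics quoted in the abstract differ only in the intersection-over-union (IoU) threshold at which a predicted box counts as a correct detection. Assuming the paper follows the common COCO convention, mAP@0.5 uses a single threshold of 0.50, while mAP@0.5-0.95 averages over ten thresholds from 0.50 to 0.95 in steps of 0.05. The minimal sketch below shows the IoU computation and the threshold sweep; the helper names `box_iou` and `thresholds_passed` are hypothetical, and a full mAP computation (ranking by confidence, precision-recall integration, averaging over classes) is omitted.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# COCO-style thresholds: 0.50, 0.55, ..., 0.95 (assumed convention)
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, truth):
    """Thresholds at which `pred` would count as a match for `truth`."""
    iou = box_iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if iou >= t]


# A prediction that overlaps the ground truth with IoU of about 0.81
passed = thresholds_passed((0, 0, 10, 10), (1, 1, 10, 10))
```

In this example the prediction counts as correct at the thresholds from 0.50 through 0.80 but not at 0.85 and above, which is why mAP@0.5-0.95 rewards tighter localization than mAP@0.5.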

CLC Number: U416.2

LIU Yunfei, LI Shuang, MA Jianxiao. Mobile ViT Network Road Defect Detection Model Integrating Multi-task Learning[J]. Journal of Chongqing Jiaotong University (Natural Science), 2025, 44(9): 84-92.

