
Journal of Chongqing Jiaotong University (Natural Science) ›› 2024, Vol. 43 ›› Issue (2): 57-64. DOI: 10.3969/j.issn.1674-0696.2024.02.08

• Transportation + Big Data and Artificial Intelligence •

A Lightweight Driver Distraction Behavior Detection Method Based on BiViTNet

GAO Shangbing1,2, ZHANG Yingying1,2, WANG Teng1,2, ZHANG Qintao1, LIU Yu1

  1. College of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian 223003, Jiangsu, China
  2. Engineering Laboratory for Internet of Things and Mobile Internet Technology of Jiangsu Province, Huaian 223001, Jiangsu, China
  • Received: 2023-03-05 Revised: 2023-11-03 Published: 2024-03-01
  • About the authors: GAO Shangbing (b. 1981), male, from Huaian, Jiangsu, professor, Ph.D.; research interests include computer vision and intelligent transportation. E-mail: luxiaofen_2002@126.com. Corresponding author: ZHANG Yingying (b. 1998), female, from Xuchang, Henan, master's degree; research interests include intelligent transportation. E-mail: 2742158787@qq.com
  • Funding:
    General Program of the National Natural Science Foundation of China (62076107); National Key Research and Development Program of China (2018YFB1004904); Major Project of Natural Science Research of Jiangsu Higher Education Institutions (18KJA520001)

Abstract: Driver distraction detection methods based on convolutional neural networks (CNNs) suffer from complex models, low detection efficiency, and a lack of global visual representation. To address these issues, a two-branch parallel bidirectional interaction neural network based on the vision transformer, BiViTNet, was proposed to recognize driver behavior. The vision transformer (ViT) was introduced into the network to encode global information, which improved detection accuracy to a certain extent. The network consisted of two parallel branches: the first was based on a lightweight CNN structure, and the second on the ViT structure. A bidirectional feature interaction module (BiFIM) was used to resolve the feature asymmetry between the CNN branch and the ViT branch. Finally, the features of the two branches were fused to detect driver behavior. Experiments were carried out on a self-built multi-view driver dataset. The validation-set accuracy reached 97.18%, with a parameter size of 38.22 MB and 271.20×10^6 multiply-adds (MAdds). The results show that the lightweight BiViTNet improves the accuracy of driver distraction recognition and can support driving safety to a certain extent.

Keywords: traffic and transportation engineering; intelligent transportation; distracted behavior detection; BiViTNet; vision transformer; lightweight model
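
The two-branch data flow summarized in the abstract can be illustrated with a toy sketch in plain Python. The abstract does not describe the internals of BiFIM, so everything here is an assumption made for illustration: the function names (cnn_to_tokens, tokens_to_cnn, bifim, fuse) are hypothetical, average pooling and nearest-neighbour upsampling stand in for whatever learned projections the authors use, and both branches are given the same channel width so that no projection layer is needed. It shows only how a CNN feature map and a set of ViT tokens might exchange information in both directions before being fused.

```python
# Toy illustration only: NOT the authors' BiFIM. Pooling and upsampling are
# assumed stand-ins for learned alignment layers; all names are hypothetical.

def cnn_to_tokens(fmap, patch):
    """Average-pool a C x H x W feature map into one token per patch."""
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    tokens = []
    for i in range(0, H, patch):
        for j in range(0, W, patch):
            tokens.append([
                sum(fmap[c][y][x]
                    for y in range(i, i + patch)
                    for x in range(j, j + patch)) / (patch * patch)
                for c in range(C)
            ])
    return tokens

def tokens_to_cnn(tokens, H, W, patch):
    """Nearest-neighbour upsample tokens back to a C x H x W feature map."""
    C = len(tokens[0])
    cols = W // patch  # number of patches per row
    return [[[tokens[(y // patch) * cols + (x // patch)][c]
              for x in range(W)]
             for y in range(H)]
            for c in range(C)]

def bifim(fmap, tokens, patch):
    """Exchange features in both directions and add them residually."""
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    from_cnn = cnn_to_tokens(fmap, patch)          # CNN -> ViT direction
    from_vit = tokens_to_cnn(tokens, H, W, patch)  # ViT -> CNN direction
    new_fmap = [[[fmap[c][y][x] + from_vit[c][y][x] for x in range(W)]
                 for y in range(H)]
                for c in range(C)]
    new_tokens = [[t + s for t, s in zip(tok, src)]
                  for tok, src in zip(tokens, from_cnn)]
    return new_fmap, new_tokens

def fuse(fmap, tokens):
    """Global-average-pool each branch and concatenate the two vectors."""
    cnn_vec = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
               for ch in fmap]
    vit_vec = [sum(tok[c] for tok in tokens) / len(tokens)
               for c in range(len(tokens[0]))]
    return cnn_vec + vit_vec

# Tiny example: 2 channels, a 4 x 4 map, 2 x 2 patches, hence 4 tokens.
fmap = [[[1.0] * 4 for _ in range(4)],
        [[float(y)] * 4 for y in range(4)]]
tokens = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]

new_fmap, new_tokens = bifim(fmap, tokens, patch=2)
fused = fuse(new_fmap, new_tokens)
print(fused)
```

In a real model the fused vector would feed a classification head over the driver behavior classes, and the pooling and upsampling steps would typically be replaced by learned layers in a deep learning framework.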

CLC number: