Information Extraction Method for Bridge Inspection Reports Based on NuNER

doi:10.3969/j.issn.1674-0696.2026.03.07

Abstract

Abstract: Bridge inspection reports are typically stored as electronic documents, and the information they contain, such as defect descriptions, damage causes, and technical indicators, is underused. Existing methods often rely on general-purpose pre-trained language models such as BERT. Because the training corpora of these models lack bridge-domain content, professional terminology is often recognized incompletely or incorrectly. To address this issue, an information extraction method based on the NuNER model is proposed. The method uses large language models for automatic data annotation and integrates multi-level semantic features with concept encoders, thereby enhancing the modeling of professional entities and long-range dependencies. A bridge inspection corpus was constructed, containing 9 types of information, 1 624 samples, and 11 450 key information items in total, and NuNER was fine-tuned on this corpus. The results show that the proposed method significantly outperforms baseline models in domain entity recognition, with the F1 score increased to 0.920 6. The model achieves high accuracy and recall in extracting key information such as the quantity and distribution of defects, which verifies its effectiveness for professional information extraction from bridge inspection reports. The method can improve the efficiency of extracting bridge management and maintenance information, lays a foundation for the subsequent construction of knowledge graphs, decision support systems, and intelligent question-answering platforms, and has broad application prospects.

Key words: bridge engineering; information extraction; pre-trained language model; bridge inspection report; deep learning; natural language processing
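
The F1 score quoted in the abstract can be illustrated with a minimal sketch. It assumes the usual entity-level definition (harmonic mean of precision and recall under exact span matching), which the abstract does not spell out. The function name `prf1`, the span format, and the entity labels are hypothetical and are not taken from the paper.

```python
from typing import Set, Tuple

# A span is (start, end, label). The labels used below are illustrative only;
# the paper's 9 information types are not listed in the abstract.
Span = Tuple[int, int, str]


def prf1(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match, entity-level precision, recall and F1 (assumed definition)."""
    tp = len(gold & pred)  # spans whose boundaries and label both match
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example: the third predicted span has a wrong right boundary.
gold = {(0, 2, "COMPONENT"), (3, 5, "DEFECT"), (6, 8, "QUANTITY")}
pred = {(0, 2, "COMPONENT"), (3, 5, "DEFECT"), (6, 9, "QUANTITY")}
p, r, f = prf1(gold, pred)
```

Under exact matching, a partially recognized term counts as both a missed entity and a spurious one, which is why incomplete recognition of professional terminology lowers precision and recall together.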

CLC Number: U446.3

LIU Ning, DAI Xinjun, WANG Yuchen, ZHU Yanjie. Information Extraction Method for Bridge Inspection Reports Based on NuNER[J]. Journal of Chongqing Jiaotong University (Natural Science), 2026, 45(3): 57-64.
