Detecting Anomalies in National Bridge Inventory Databases Using Machine Learning Methods
Author(s): | Ehsan Fereshtehnejad, Gianluca Gazzola, Pratik Parekh, Chirag Nakrani, Hooman Parvardeh |
---|---|
Medium: | journal article |
Language(s): | English |
Published in: | Transportation Research Record: Journal of the Transportation Research Board, 3 March 2022, n. 6, v. 2676 |
Page(s): | 453-467 |
DOI: | 10.1177/03611981221075028 |
Abstract: |
National Bridge Inventory (NBI) data is regularly collected for 617,000+ national bridges in the U.S. These data, which consist of 100+ fields related to bridges and culverts, have been shown to contain errors. These errors could reduce the effectiveness of the decisions made based on this data, and cause safety issues. For this reason, an anomaly detection platform is developed to identify data anomalies in NBI datasets more effectively than existing rule-based error-check tools can. First, the user provides groups of correlated NBI fields as input to the platform. Then, for each group, it utilizes two tools to detect anomalous data and determine errors. The first tool uses three machine learning algorithms to identify anomalous data points and categorizes them based on their degree of anomaly. The second tool visualizes the distributions of the NBI fields in the group with histograms, scatter plots, and so forth. These plots are used to analyze the data points that are identified from the first tool as anomalies. The results of these two tools, together with expert knowledge about the data fields, are then used to distinguish data errors from outliers. The proposed platform is applied to a state’s NBI dataset that was submitted to the Federal Highway Administration (FHWA) in 2020. For this dataset, two groups of correlated fields are considered. The results showed the platform could effectively pinpoint anomalous values of NBI fields that individually, or in conjunction with other fields, do not follow the patterns that characterize most of the data, prompting the identification of potential inconsistencies and errors. |
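The abstract's first tool, in which several detectors flag data points and grade them by degree of anomaly, can be illustrated with a minimal sketch. The abstract does not name the three machine learning algorithms, so three simple statistical detectors (z-score, interquartile-range fences, and median absolute deviation) are swapped in here purely as stand-ins. The function names, thresholds, and sample values are all hypothetical, and the sketch scores a single field, whereas the paper works on groups of correlated fields.

```python
from statistics import mean, median, pstdev, quantiles


def zscore_flags(values, threshold=3.0):
    """Flag values more than `threshold` population std devs from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]


def iqr_flags(values, k=1.5):
    """Flag values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v < low or v > high for v in values]


def mad_flags(values, threshold=3.5):
    """Flag values whose modified z-score (based on the MAD) exceeds `threshold`."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def anomaly_degree(values):
    """Return, per value, how many of the three stand-in detectors flagged it (0-3)."""
    votes = zip(zscore_flags(values), iqr_flags(values), mad_flags(values))
    return [sum(v) for v in votes]


# Hypothetical values for one numeric field; the last entry mimics a data-entry slip.
sample = [9.8, 10.1, 10.4, 9.9, 10.0, 10.2, 10.3, 9.7, 10.1, 10.0, 10.2, 9.9, 98.0]
for value, degree in zip(sample, anomaly_degree(sample)):
    print(f"{value:6.1f}  flagged by {degree} of 3 detectors")
```

A higher vote count suggests a stronger anomaly, but, as the abstract notes, a flagged value still has to be checked against the field's distribution and expert knowledge before it is treated as an error rather than a legitimate outlier.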
- About this data sheet
- Reference-ID: 10777889
- Published on: 12/05/2024
- Last updated on: 12/05/2024