Article

Role of Data Analytics in Infrastructure Asset Management: Overcoming Data Size and Quality Problems

Publisher

ASCE-AMER SOC CIVIL ENGINEERS
DOI: 10.1061/JPEODX.0000175

Keywords

Machine learning; Ensemble learning; Transportation asset management; Pavement condition index; Highway maintenance; Data preparation

This study explores the performance of different classification algorithms applied to the analysis of asphalt pavement deterioration data. The aim is to examine how different algorithms cope with the typically limited and low-quality data sets in the infrastructure asset management domain, and whether better configurations of the relevant algorithms help overcome these limitations. The emphasis on choosing the most affordable attributes (e.g., temperature and precipitation levels) makes the results reproducible for smaller municipalities. The analysis used more than 3,000 examples of road sections retrieved from the Long-Term Pavement Performance (LTPP) database. The algorithms examined include two types of decision trees, the naive Bayes classifier, naive Bayes coupled with kernels, logistic regression, k-nearest neighbors (k-NN), random forest, and gradient boosted trees. All were applied to predict the deterioration of the pavement condition index (PCI); their performance is compared, and their strengths and weaknesses are discussed. Of specific importance is the positive role of ensemble learning: the study shows how the higher efficiency of ensemble learning can compensate for data shortcomings. The accuracy of some of the models in predicting the PCI after 3 years exceeded 90%. Suggestions are made to improve the performance of some algorithms. For instance, coupling the naive Bayes classifier with kernel estimates is shown to increase its accuracy dramatically. The study also examines the impact of data segmentation. The data were divided into four climatic regions. Prediction accuracy remained sufficiently high after segmentation, with the highest accuracy in the dry, nonfreeze zone and the lowest in the wet, freezing zone.
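The gain from coupling naive Bayes with kernel estimates can be illustrated with a small sketch. This is not the authors' implementation and does not use LTPP data. It assumes a single synthetic feature, two hypothetical class labels ("deteriorated" and "stable"), and a fixed, hand-picked kernel bandwidth of 0.3. The two classes are built to share roughly the same mean and spread but to differ in shape, a case where a single fitted Gaussian per class carries little information while a kernel density estimate can still separate them.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def gaussian_loglik(train_values, x):
    """Class-conditional log-density from a single fitted Gaussian."""
    mean = sum(train_values) / len(train_values)
    var = sum((v - mean) ** 2 for v in train_values) / len(train_values)
    std = max(math.sqrt(var), 1e-9)
    return math.log(gaussian_pdf(x, mean, std) + 1e-300)


def kernel_loglik(train_values, x, bandwidth=0.3):
    """Class-conditional log-density from a Gaussian kernel density estimate.

    The bandwidth is an assumed, hand-picked value for this synthetic data.
    """
    density = sum(gaussian_pdf(x, v, bandwidth) for v in train_values)
    return math.log(density / len(train_values) + 1e-300)


def predict(train, row, loglik):
    """Naive Bayes: log prior plus the sum of per-feature log-likelihoods."""
    n_total = sum(len(rows) for rows in train.values())
    scores = {}
    for label, rows in train.items():
        score = math.log(len(rows) / n_total)
        for j, x in enumerate(row):
            score += loglik([r[j] for r in rows], x)
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(train, test, loglik):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(predict(train, row, loglik) == label for row, label in test)
    return hits / len(test)


def make_data(rng, n):
    """Synthetic one-feature data: a bimodal class and a unimodal class
    with roughly the same mean and standard deviation."""
    bimodal = [[rng.gauss(rng.choice([-3.0, 3.0]), 0.3)] for _ in range(n)]
    unimodal = [[rng.gauss(0.0, 3.0)] for _ in range(n)]
    return bimodal, unimodal


rng = random.Random(0)  # fixed seed so the run is repeatable
train_a, train_b = make_data(rng, 300)
test_a, test_b = make_data(rng, 300)
train = {"deteriorated": train_a, "stable": train_b}
test = [(r, "deteriorated") for r in test_a] + [(r, "stable") for r in test_b]

gauss_acc = accuracy(train, test, gaussian_loglik)
kernel_acc = accuracy(train, test, kernel_loglik)
print(f"Gaussian naive Bayes accuracy: {gauss_acc:.2f}")
print(f"Kernel naive Bayes accuracy:   {kernel_acc:.2f}")
```

On data like this the kernel variant should score well above the Gaussian one, because only the kernel estimate captures the two-peaked shape. How large the gain is on real pavement data depends on how far the attributes depart from normality and on the bandwidth chosen.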
