Journal
ACCIDENT ANALYSIS AND PREVENTION
Volume 162, Issue -, Pages -
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.aap.2021.106422
Keywords
Automated Enforcement System; Traffic Violation; Random Forest; Imbalance Ratio
Funding
- National Key R&D Program of China [2018YFB1601600]
The study analyzed the factors influencing traffic violations with a logistic regression model and predicted the probability of violations with a random forest algorithm. Results showed that factors such as time period, location, and traffic conditions can increase the likelihood of traffic violations. Additionally, using the ProWSyn method to handle imbalanced data improved the random forest algorithm's performance in predicting traffic violations.
The automated enforcement system (AES) is an effective supplement to traditional traffic enforcement, and the traffic violation data it collects can also be used for safety research. In this study, traffic violation data were used to analyze the factors associated with traffic violations and to predict the probability of violations at intersections. The potential influencing factors comprise 24 independent variables related to time, space, traffic, and weather. Results from a logistic model showed that the midday period, weekends, residential districts, collector roads, congested traffic conditions, high traffic flow, lower wind speed, and low temperature increase the probability of traffic violations. The probability of violations was predicted with the random forest algorithm, which outperformed logistic regression, Gaussian naive Bayes, and support vector machine models in predicting traffic violations. Moreover, the proximity weighted synthetic oversampling technique (ProWSyn) was applied to reduce the impact of the imbalance ratio (IR) and improve the model's prediction performance. The receiver operating characteristic (ROC) curves and precision-recall (PR) curves showed that the random forest algorithm performed better with oversampled data than with undersampled data. With IR = 1, the area under the curve (AUC) and the out-of-bag (OOB) error reached 0.914 and 0.0787, respectively, showing the improved performance of the random forest algorithm when ProWSyn is used to handle imbalanced traffic violation data.
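To make the role of the imbalance ratio concrete, the following is a minimal sketch of rebalancing a binary data set to a target IR. It assumes IR is defined as the majority-class count divided by the minority-class count, which the abstract does not state explicitly. It also uses plain random duplication of minority rows as a stand-in; it is not ProWSyn, which generates synthetic samples weighted by proximity to the class boundary. The function names `imbalance_ratio` and `random_oversample` are hypothetical, and the toy labels are illustrative only, not the study's data.

```python
import math
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (assumed IR definition)."""
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    return max(counts.values()) / min(counts.values())


def random_oversample(rows, labels, target_ir=1.0, seed=0):
    """Duplicate random minority rows until IR <= target_ir.

    Stand-in for ProWSyn: this copies existing rows and creates
    no synthetic samples.
    """
    if target_ir < 1.0:
        raise ValueError("target_ir must be at least 1.0")
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    minority = min(counts, key=counts.get)
    majority_count = max(counts.values())
    minority_rows = [r for r, y in zip(rows, labels) if y == minority]
    # Number of minority rows required so that majority / minority <= target_ir.
    needed = math.ceil(majority_count / target_ir) - len(minority_rows)
    rng = random.Random(seed)
    extra = [rng.choice(minority_rows) for _ in range(max(needed, 0))]
    return list(rows) + extra, list(labels) + [minority] * len(extra)


# Toy data: 90 non-violations (0) and 10 violations (1).
rows = [[i] for i in range(100)]
labels = [1] * 10 + [0] * 90

balanced_rows, balanced_labels = random_oversample(rows, labels, target_ir=1.0)
print(imbalance_ratio(labels))           # 9.0
print(imbalance_ratio(balanced_labels))  # 1.0
```

In practice, oversampling of any kind should be applied only to the training split, so that evaluation metrics such as AUC are computed on data with the original class distribution.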