Article

Finite mixture Negative Binomial-Lindley for modeling heterogeneous crash data with many zero observations

Journal

ACCIDENT ANALYSIS AND PREVENTION
Volume 175

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.aap.2022.106765

Keywords

Negative Binomial-Lindley; Finite mixture; Bayesian analysis; Crash data; Negative binomial

Crash data often have characteristics such as high dispersion, numerous zero observations, and a long tail, which cannot be properly modeled by the traditional Negative Binomial (NB) model. The Negative Binomial-Lindley (NB-L) model has been proposed as an alternative, and research studies have shown its superior performance in analyzing such data. Additionally, crash data are often collected from different subpopulations, and finite mixture models can be used to capture population heterogeneity. The Finite mixture NB-L model (FMNB-L) is introduced to analyze crash data from heterogeneous subpopulations with many zero observations and a long tail, and it has been found to provide a significantly better fit than the other models considered.
Crash data are often highly dispersed; they may also include a large number of zero observations or have a long tail. The traditional Negative Binomial (NB) model cannot model these data properly. To overcome this issue, the Negative Binomial-Lindley (NB-L) model has been proposed as an alternative to the NB for analyzing data with these characteristics. Research studies have shown that the NB-L model outperforms the NB when data include numerous zero observations or have a long tail. In addition, crash data are often collected from sites with different spatial or temporal characteristics. Therefore, it is reasonable to assume that crash data are drawn from multiple subpopulations. Finite mixture models are powerful tools that can account for underlying subpopulations and capture population heterogeneity. This research documents the derivations and characteristics of the Finite mixture NB-L model (FMNB-L) for analyzing data generated from heterogeneous subpopulations with many zero observations and a long tail. We demonstrated the application of the model to identify subpopulations with a simulation study. We then used the FMNB-L model to estimate statistical models for Texas four-lane freeway crashes. These data have unique characteristics: they are highly dispersed, with many locations that have a very large number of crashes as well as a significant number of locations with zero crashes. We used multiple goodness-of-fit metrics to compare the FMNB-L model with the NB, NB-L, and finite mixture NB models. The FMNB-L identified two subpopulations in the datasets. The results show a significantly better fit by the FMNB-L compared to the other analyzed models.
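
To make the model structure described above concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes the commonly used hierarchical form of the NB-L, in which the NB mean is multiplied by a Lindley-distributed term, and it relies on the fact that a Lindley(theta) variate is a mixture of Gamma(1, theta) and Gamma(2, theta) with mixing weight theta/(1+theta). The function names, the parameterization (phi as the NB inverse dispersion), and all parameter values are illustrative assumptions; the paper's exact specification, including any covariates and priors, may differ.

```python
import numpy as np


def sample_lindley(theta, size, rng):
    """Draw Lindley(theta) variates via the two-component gamma mixture."""
    # With probability theta/(1+theta) use shape 1, otherwise shape 2;
    # both components share the same rate theta (scale 1/theta).
    shape = np.where(rng.random(size) < theta / (1.0 + theta), 1.0, 2.0)
    return rng.gamma(shape, 1.0 / theta)


def sample_nbl(mu, phi, theta, size, rng):
    """Draw NB-L counts: NB with mean mu * eps, where eps ~ Lindley(theta)."""
    eps = sample_lindley(theta, size, rng)
    # NB written as a Poisson-gamma mixture; the gamma term has mean 1
    # and shape phi (assumed here to be the inverse dispersion parameter).
    lam = mu * eps * rng.gamma(phi, 1.0 / phi, size)
    return rng.poisson(lam)


def sample_fmnbl(weights, mus, phis, thetas, size, seed=0):
    """Draw counts from a finite mixture of NB-L components.

    Returns the counts and the (normally unobserved) component labels.
    """
    rng = np.random.default_rng(seed)
    z = rng.choice(len(weights), size=size, p=weights)
    y = np.empty(size, dtype=np.int64)
    for k in range(len(weights)):
        idx = z == k
        y[idx] = sample_nbl(mus[k], phis[k], thetas[k], int(idx.sum()), rng)
    return y, z


if __name__ == "__main__":
    # Hypothetical two-subpopulation example: a low-mean component that
    # produces mostly zeros and a high-mean component with a long tail.
    y, z = sample_fmnbl([0.7, 0.3], [0.5, 8.0], [1.0, 2.0], [2.0, 1.5], 5000)
    print("share of zeros:", (y == 0).mean())
    print("largest count:", y.max())
```

Under these assumptions the mean of the Lindley term is (theta + 2) / (theta (theta + 1)), so the marginal mean of each component is mu times that quantity rather than mu itself. Fitting the model to data, as opposed to simulating from it, would require a Bayesian or likelihood-based estimation step that this sketch does not attempt.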
