Article

Estimation of Poverty Using Random Forest Regression with Multi-Source Data: A Case Study in Bangladesh

Journal

REMOTE SENSING
Volume 11, Issue 4

Publisher

MDPI
DOI: 10.3390/rs11040375

Keywords

poverty; random forest regression; Bangladesh; nighttime light; Google satellite imagery

Funding

  1. National Natural Science Foundation of China [41871331, 41801343]
  2. Australian Research Council [DP170104235]
  3. China Scholarship Council [201706140143]

Spatially explicit and reliable data on poverty are critical for both policy makers and researchers. However, such data remain scarce, particularly in developing countries. Current research is limited in that it uses environmental data from different sources in isolation to estimate poverty, even though poverty is a complex phenomenon that cannot be quantified, either theoretically or practically, by a single data type. This study proposes a random forest regression (RFR) model to estimate poverty at 10 km × 10 km spatial resolution by combining features extracted from multiple data sources, including the National Polar-orbiting Partnership Visible Infrared Imaging Radiometer Suite (NPP-VIIRS) Day/Night Band (DNB) nighttime light (NTL) data, Google satellite imagery, a land cover map, a road map and division headquarter location data. The household wealth index (WI) drawn from the Demographic and Health Surveys (DHS) program was used to reflect poverty level. We trained the RFR model using data from Bangladesh and applied the model to both Bangladesh and Nepal to evaluate its accuracy. The results show that the R² between the actual and estimated WI in Bangladesh is 0.70, indicating good predictive power of our model in WI estimation. The R² of 0.61 between the actual and estimated WI in Nepal also indicates good generalization ability. Furthermore, a negative correlation is observed between the district average WI and the poverty head count ratio (HCR) in Bangladesh, with a Pearson correlation coefficient of -0.6. Using Gini importance, we identify proximity to urban areas as the most important variable for explaining poverty, contributing 37.9% of the explanatory power. Compared to the study that used NTL and Google satellite imagery in isolation to estimate poverty, our method increases the accuracy of estimation. Given that the data we use are globally and publicly available, the methodology reported in this study would also be applicable in other countries or regions to estimate the extent of poverty.
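
The two accuracy measures quoted in the abstract can be illustrated with a minimal Python sketch. It assumes that R² means the coefficient of determination (1 minus the ratio of residual to total sum of squares); the abstract does not say which definition the authors used. The function names `r_squared` and `pearson_r` and all numbers below are hypothetical, chosen only for illustration, and are not data from the study.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative values, NOT results from the paper.
actual_wi = [-1.2, -0.5, 0.0, 0.6, 1.4]          # survey-based wealth index
estimated_wi = [-0.9, -0.6, 0.2, 0.4, 1.1]       # model estimates
district_wi = [-0.8, -0.3, 0.1, 0.5, 0.9]        # district average WI
district_hcr = [45.0, 38.0, 30.0, 31.0, 18.0]    # head count ratio, percent

r2 = r_squared(actual_wi, estimated_wi)
r = pearson_r(district_wi, district_hcr)
print(f"R^2 = {r2:.2f}, Pearson r = {r:.2f}")
```

In this toy data, wealthier districts have a lower head count ratio, so the Pearson coefficient comes out negative, which is the same sign as the relationship the abstract reports.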
