Predicting the rental value of houses in household surveys in Tanzania, Uganda and Malawi: Evaluations of hedonic pricing and machine learning approaches

Journal

PLOS ONE
Volume 16, Issue 2

Publisher

Public Library of Science
DOI: 10.1371/journal.pone.0244953

This study examined how well Machine Learning (ML) methods predict housing values in household surveys. Boosting, Bagging, Random Forest, Ridge, and LASSO were the best models for predicting rental values across all countries and years, whereas the Tree regression model underperformed relative to the various OLS models. With abundant data and computing power, ML methods offer a viable alternative for predicting housing values in household surveys.
Housing value is a major component of the aggregate expenditure used to analyze the welfare status of households in the development economics literature. An accurate estimate of housing services is therefore important for obtaining the value of housing in household surveys. Data show that a significant proportion of the houses covered by a typical Living Standard Measurement Survey (LSMS), adopted by the World Bank and others, are owner-occupied. The standard approach to predicting the housing value in such surveys is based on the rental cost of the house. A hedonic pricing model estimated by Ordinary Least Squares (OLS) is normally used to predict rental values. The literature shows that Machine Learning (ML) methods, which have been shown to uncover generalizable patterns in the given data, have better predictive power than OLS in other valuation exercises. We examined whether a class of ML methods (e.g. Ridge, LASSO, Tree, Bagging, Random Forest, and Boosting) provided better predictions of the rental value of housing than OLS methods that account for spatial autocorrelation, using household-level survey data from Uganda, Tanzania, and Malawi across multiple years. Our results showed that the ML methods (Boosting, Bagging, Random Forest, Ridge, and LASSO) are the best models for predicting house values on out-of-sample data sets for all the countries and all the years. On the other hand, Tree regression underperformed relative to the various OLS models on the same data sets. With the availability of abundant data and better computing power, ML methods provide a viable alternative for predicting housing values in household surveys.
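The out-of-sample comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data: the survey below is synthetic, the variable names (`rooms`, `has_electricity`, `improved_roof`), the coefficients, and the penalty `alpha=10.0` are all hypothetical, and only OLS and Ridge are shown, with no spatial autocorrelation terms. The sketch fits a hedonic regression of log rent on dwelling attributes using a training split and reports prediction error on held-out households.

```python
import numpy as np


def make_synthetic_survey(n=500, seed=0):
    """Hypothetical data: log rent driven by a few dwelling attributes."""
    rng = np.random.default_rng(seed)
    rooms = rng.integers(1, 7, size=n).astype(float)
    has_electricity = rng.integers(0, 2, size=n).astype(float)
    improved_roof = rng.integers(0, 2, size=n).astype(float)
    X = np.column_stack([rooms, has_electricity, improved_roof])
    # Assumed hedonic relationship plus noise (illustrative values only).
    log_rent = (2.0 + 0.25 * rooms + 0.6 * has_electricity
                + 0.3 * improved_roof + rng.normal(0.0, 0.4, size=n))
    return X, log_rent


def fit_linear(X, y, alpha=0.0):
    """Closed-form ridge regression; alpha=0 reduces to OLS.

    The data are centered so that the intercept is not penalized.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    penalty = alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(Xc.T @ Xc + penalty, Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return intercept, coef


def predict(model, X):
    intercept, coef = model
    return intercept + X @ coef


def rmse(y_true, y_pred):
    """Root mean squared error of the predictions."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


X, y = make_synthetic_survey()

# Hold out 30% of households; the rows are already in random order.
split = int(0.7 * len(y))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

ols = fit_linear(X_train, y_train, alpha=0.0)
ridge = fit_linear(X_train, y_train, alpha=10.0)

print("OLS   out-of-sample RMSE:", rmse(y_test, predict(ols, X_test)))
print("Ridge out-of-sample RMSE:", rmse(y_test, predict(ridge, X_test)))
```

Ranking models by held-out error rather than in-sample fit is what makes the comparison meaningful when the goal is to impute rents for owner-occupied houses. The other methods named in the abstract (LASSO, Tree, Bagging, Random Forest, Boosting) would be evaluated with the same split and the same error metric.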
