Journal
ANNALS OF THE AMERICAN ACADEMY OF POLITICAL AND SOCIAL SCIENCE
Volume 659, Issue 1, Pages 48-62
Publisher
SAGE PUBLICATIONS INC
DOI: 10.1177/0002716215570279
Keywords
big data; machine learning; predictive modeling; data science; penalized regression; ensemble learning; the Lasso
Analytic techniques developed for big data have much broader applications in the social sciences, outperforming standard regression models even (or rather, especially) in smaller datasets. This article offers an overview of machine learning methods well-suited to social science problems, including decision trees, dimension reduction methods, nearest neighbor algorithms, support vector models, and penalized regression. In addition to novel algorithms, machine learning places great emphasis on model checking (through holdout samples and cross-validation) and model shrinkage (adjusting predictions toward the mean to reduce overfitting). This article advocates replacing typical regression analyses with two different sorts of models used in concert. A multi-algorithm ensemble approach should be used to determine the noise floor of a given dataset, while simpler methods such as penalized regression or decision trees should be used for theory building and hypothesis testing.
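The two ideas the abstract stresses, model checking on held-out data and shrinkage of predictions toward the mean, can be illustrated with a minimal sketch. This is not the article's code: it assumes a single predictor, uses a ridge-style penalty as a stand-in for penalized regression generally, and runs on a synthetic dataset invented for illustration. The names `fit_ridge`, `predict`, and `cv_mse` are hypothetical helpers.

```python
import random
from statistics import mean

def fit_ridge(xs, ys, lam):
    """One-predictor ridge fit; returns (x_mean, y_mean, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    # The penalty lam inflates the denominator, pulling the slope toward 0
    # and therefore every prediction toward y_bar (shrinkage).
    return x_bar, y_bar, sxy / (sxx + lam)

def predict(model, x):
    x_bar, y_bar, slope = model
    return y_bar + slope * (x - x_bar)

def cv_mse(xs, ys, lam, k=5):
    """Mean squared error on held-out folds, averaged over k folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Every k-th observation is held out; the model never sees it.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit_ridge(train_x, train_y, lam)
        fold_errors.append(
            mean((ys[i] - predict(model, xs[i])) ** 2 for i in test_idx)
        )
    return mean(fold_errors)

# Synthetic small, noisy dataset (an assumption, not data from the article).
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(30)]
ys = [0.5 * x + rng.gauss(0, 1) for x in xs]

for lam in (0.0, 1.0, 10.0, 100.0):
    print(f"lambda={lam:>6}: held-out MSE = {cv_mse(xs, ys, lam):.3f}")
```

Setting `lam` to 0 recovers ordinary least squares, so the loop compares an unpenalized fit against increasingly shrunken ones using only held-out error. Which penalty performs best depends on the data; the sketch makes no claim about the result.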