Article

Mining Big Data to Extract Patterns and Predict Real-Life Outcomes

PSYCHOLOGICAL METHODS (2016)

Journal

PSYCHOLOGICAL METHODS

Volume 21, Issue 4, Pages 493-506

Publisher

AMER PSYCHOLOGICAL ASSOC

DOI: 10.1037/met0000105

Keywords

computational social science; big data; digital footprints; R; personality

Category

Psychology, Multidisciplinary

Funding

Robert Bosch Stanford Graduate Fellowship
Google Faculty Research Award
National Science Foundation
Defense Advanced Research Projects Agency (DARPA)
Stanford Center for the Study of Language and Information

Abstract

This article aims to introduce the reader to essential tools that can be used to obtain insights and build predictive models using large data sets. Recent user proliferation in the digital environment has led to the emergence of large samples containing a wealth of traces of human behaviors, communication, and social interactions. Such samples offer the opportunity to greatly improve our understanding of individuals, groups, and societies, but their analysis presents unique methodological challenges. In this tutorial, we discuss potential sources of such data and explain how to efficiently store them. Then, we introduce two methods that are often employed to extract patterns and reduce the dimensionality of large data sets: singular value decomposition and latent Dirichlet allocation. Finally, we demonstrate how to use dimensions or clusters extracted from data to build predictive models in a cross-validated way. The text is accompanied by examples of R code and a sample data set, allowing the reader to practice the methods discussed here. A companion website (http://dataminingtutorial.com) provides additional learning resources.
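The workflow summarized in the abstract (reduce a large user-by-item matrix to a few dimensions, then predict an outcome from those dimensions under cross-validation) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' R code: it is written in Python, uses a tiny made-up matrix of users and Likes with a made-up trait score, extracts only the first singular dimension by power iteration rather than a full singular value decomposition, and centers the columns first (an assumption made here so that the leading dimension captures the contrast between item groups). The names `likes`, `trait`, `top_singular_triplet`, and `cross_validated_predictions` are hypothetical.

```python
import random


def top_singular_triplet(matrix, iterations=200, seed=0):
    """Approximate the leading singular triplet (u, s, v) by power iteration on A^T A."""
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [rng.random() + 0.1 for _ in range(n_cols)]
    for _ in range(iterations):
        av = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
        w = [sum(matrix[i][j] * av[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    av = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
    s = sum(x * x for x in av) ** 0.5
    u = [x / s for x in av]
    return u, s, v


def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def cross_validated_predictions(matrix, outcome, k=5):
    """Predict each user's outcome from a model fitted without that user's fold."""
    n, n_cols = len(matrix), len(matrix[0])
    predictions = [None] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        # Column means and the singular vector come from the training users only,
        # so no information from held-out users leaks into the model.
        means = [sum(matrix[i][j] for i in train) / len(train) for j in range(n_cols)]
        centered = [[matrix[i][j] - means[j] for j in range(n_cols)] for i in train]
        _, _, v = top_singular_triplet(centered)

        def score(i):
            # Project a user onto the first dimension learned from the training fold.
            return sum((matrix[i][j] - means[j]) * v[j] for j in range(n_cols))

        xs = [score(i) for i in train]
        ys = [outcome[i] for i in train]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
        intercept = my - slope * mx
        for i in test:
            predictions[i] = intercept + slope * score(i)
    return predictions


# Toy data (invented for illustration): 10 users x 6 items, 1 = user liked the item.
likes = [
    [1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 0], [1, 0, 1, 0, 0, 1], [0, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 1, 0], [0, 0, 0, 1, 1, 1], [0, 0, 0, 1, 1, 0], [0, 0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1, 1], [1, 0, 0, 1, 1, 1],
]
trait = [5.0, 4.5, 4.0, 4.4, 4.6, 1.0, 1.5, 2.0, 1.4, 2.2]  # invented outcome scores

held_out = cross_validated_predictions(likes, trait, k=5)
print("cross-validated correlation:", pearson(held_out, trait))
```

The point of the design is that both the dimension reduction and the regression are refitted inside each fold, so the reported correlation reflects predictions for users the model has not seen. A practical analysis would likely keep several dimensions and use a dedicated sparse-matrix SVD routine instead of this hand-rolled power iteration.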
