Review

Challenges of Big Data analysis

NATIONAL SCIENCE REVIEW (2014)

Journal

NATIONAL SCIENCE REVIEW

Volume 1, Issue 2, Pages 293-314

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/nsr/nwt032

Keywords

Big Data; noise accumulation; spurious correlation; incidental endogeneity; data storage; scalability

Category

Multidisciplinary Sciences

Funding

National Science Foundation [DMS-1206464, III-1116730, III-1332109]
National Institutes of Health [R01-GM100474, R01-GM072611]
Division Of Mathematical Sciences; Direct For Mathematical & Physical Scien [1206464] (funding source: National Science Foundation)
Div Of Information & Intelligent Systems; Direct For Computer & Info Scie & Enginr [1332109] (funding source: National Science Foundation)

Abstract

Big Data bring new opportunities to modern society and challenges to data scientists. On the one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that cannot be detected with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity and measurement errors. These challenges are distinctive and require a new computational and statistical paradigm. This paper gives an overview of the salient features of Big Data and of how these features drive a paradigm change in statistical and computational methods as well as in computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that the exogeneity assumptions in most statistical methods for Big Data cannot be validated because of incidental endogeneity. Relying on such unvalidated assumptions can lead to wrong statistical inferences and, consequently, wrong scientific conclusions.
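
The spurious correlation mentioned in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: the function names, the sample size of 50 and the feature counts are all assumptions chosen for illustration. It draws a response and a set of noise features that are independent of it by construction, then reports the largest absolute sample correlation found among them.

```python
import math
import random


def sample_correlation(x, y):
    """Pearson sample correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def max_spurious_correlation(n_samples, n_features, rng):
    """Largest |correlation| between a response and independent noise features.

    Every feature is generated independently of the response, so any large
    correlation found here is spurious by construction.
    """
    response = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        best = max(best, abs(sample_correlation(response, feature)))
    return best


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that runs are repeatable
    for n_features in (10, 100, 1000):
        value = max_spurious_correlation(50, n_features, rng)
        print(n_features, round(value, 3))
```

With the sample size held fixed, the reported maximum typically grows as the number of features increases, even though no feature is truly related to the response. This is one way to see why variable selection based on marginal correlations needs care in high dimensions.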
