Article

TESTING INDEPENDENCE WITH HIGH-DIMENSIONAL CORRELATED SAMPLES

Journal

ANNALS OF STATISTICS
Volume 46, Issue 2, Pages 866-894

Publisher

INST MATHEMATICAL STATISTICS-IMS
DOI: 10.1214/17-AOS1571

Keywords

Independence test; multiple testing of correlations; false discovery rate; matrix-variate normal; quadratic functional estimation; high-dimensional sample correlation matrix

Funding

  1. NSFC [11431006]
  2. Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning
  3. Shanghai Shuguang Program
  4. Youth Talent Support Program
  5. 973 Program [2015CB856004]
  6. Australian Research Council

Testing independence among a number of (ultra) high-dimensional random samples is a fundamental and challenging problem. By arranging n identically distributed p-dimensional random vectors into a p × n data matrix, we investigate the problem of testing independence among columns under the matrix-variate normal modeling of data. We propose a computationally simple and tuning-free test statistic, characterize its limiting null distribution, analyze its statistical power, and prove its minimax optimality. As an important by-product of the test statistic, a ratio-consistent estimator for the quadratic functional of a covariance matrix from correlated samples is developed. We further study the effect of correlation among samples on an important high-dimensional inference problem: large-scale multiple testing of Pearson's correlation coefficients. Indeed, blindly using classical inference results based on the assumed independence of samples will lead to many false discoveries, which suggests the need for conducting independence testing before applying existing methods. To address the challenge arising from correlation among samples, we propose a sandwich estimator of Pearson's correlation coefficient by de-correlating the samples. Based on this approach, the resulting multiple testing procedure asymptotically controls the overall false discovery rate at the nominal level while maintaining good statistical power. Both simulated and real data experiments are carried out to demonstrate the advantages of the proposed methods.
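
As a reading aid, the setup described in the abstract can be sketched in standard matrix-variate normal notation. The symbols M, A, B and b_ij below are notation chosen here for illustration and do not appear in the abstract; the exact form of the null hypothesis used in the paper is an assumption and may differ (for example, it may also fix the diagonal of B).

```latex
% Illustrative notation (assumed, not taken from the abstract):
%   A : p x p covariance among the p coordinates (rows)
%   B = (b_{ij}) : n x n covariance among the n samples (columns)
\[
X = (X_1, \dots, X_n) \in \mathbb{R}^{p \times n}, \qquad
\operatorname{vec}(X) \sim N\bigl(\operatorname{vec}(M),\; B \otimes A\bigr),
\]
\[
\operatorname{Cov}(X_i, X_j) = b_{ij}\, A, \qquad
H_0 :\ b_{ij} = 0 \ \text{for all } i \neq j .
\]
```

Under joint normality, the columns X_1, ..., X_n are mutually independent exactly when B is diagonal, which is why independence among samples can be phrased as a hypothesis about the off-diagonal entries of B.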
