Article

Comparing Large Covariance Matrices under Weak Conditions on the Dependence Structure and Its Application to Gene Clustering

BIOMETRICS (2017)

Journal

BIOMETRICS

Volume 73, Issue 1, Pages 31-41

Publisher

WILEY

DOI: 10.1111/biom.12552

Keywords

Differential expression analysis; Gene clustering; High dimension; Hypothesis testing; Parametric bootstrap; Sparsity

Categories

Biology; Mathematical & Computational Biology; Statistics & Probability

Funding

Fundamental Research Funds for the Central Universities [JBK160159, JBK150501, JBK140507, JBK120509]
NSFC [11501462]
Center of Statistical Research at SWUFE
Australian Research Council
NSF [IIS-1545994, DMS-1512267]
Direct For Mathematical & Physical Scien
Division Of Mathematical Sciences [1512267] Funding Source: National Science Foundation
Div Of Information & Intelligent Systems
Direct For Computer & Info Scie & Enginr [1545994] Funding Source: National Science Foundation

Abstract

Comparing large covariance matrices has important applications in modern genomics, where scientists are often interested in understanding whether relationships (e.g., dependencies or co-regulations) among a large number of genes vary between different biological states. We propose a computationally fast procedure for testing the equality of two large covariance matrices when the dimensions of the covariance matrices are much larger than the sample sizes. A distinguishing feature of the new procedure is that it imposes no structural assumptions on the unknown covariance matrices. Hence, the test is robust with respect to various complex dependence structures that frequently arise in genomics. We prove that the proposed procedure is asymptotically valid under weak moment conditions. As an interesting application, we derive a new gene clustering algorithm which shares the same nice property of avoiding restrictive structural assumptions for high-dimensional genomics data. Using an asthma gene expression dataset, we illustrate how the new test helps compare the covariance matrices of the genes across different gene sets/pathways between the disease group and the control group, and how the gene clustering algorithm provides new insights on the way gene clustering patterns differ between the two groups. The proposed methods have been implemented in an R-package HDtest and are available on CRAN.
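To make the idea of testing equality of two covariance matrices concrete, the following is a minimal Python sketch. It is not the HDtest implementation and does not reproduce the paper's procedure: the function names `max_cov_diff` and `permutation_pvalue` are hypothetical, the statistic is a generic maximum of standardized entrywise differences between the two sample covariance matrices, and the null distribution is approximated by permuting group labels rather than by the parametric bootstrap named in the keywords. It assumes rows are samples and columns are genes.

```python
import numpy as np


def max_cov_diff(x, y):
    """Largest standardized squared difference between sample covariance entries.

    Illustrative statistic only; rows are samples, columns are variables.
    """
    n1, n2 = x.shape[0], y.shape[0]
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    s1 = xc.T @ xc / n1  # sample covariance of group 1
    s2 = yc.T @ yc / n2  # sample covariance of group 2
    # Plug-in variance of each product x_i * x_j, used to standardize entries
    v1 = (xc**2).T @ (xc**2) / n1 - s1**2
    v2 = (yc**2).T @ (yc**2) / n2 - s2**2
    return float(np.max((s1 - s2) ** 2 / (v1 / n1 + v2 / n2)))


def permutation_pvalue(x, y, n_perm=200, seed=0):
    """Approximate p-value by permuting group labels (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    observed = max_cov_diff(x, y)
    # Center each group first so that a mean shift is not mistaken
    # for a covariance difference after pooling.
    pooled = np.vstack([x - x.mean(axis=0), y - y.mean(axis=0)])
    n1 = x.shape[0]
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        stat = max_cov_diff(pooled[idx[:n1]], pooled[idx[n1:]])
        exceed += int(stat >= observed)
    return (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    control = rng.standard_normal((200, 10))
    disease = 3.0 * rng.standard_normal((200, 10))  # inflated variances
    print("p-value:", permutation_pvalue(control, disease))
```

A max-type statistic is used here because it needs no structural assumption on either covariance matrix, which matches the spirit of the abstract; the permutation calibration is chosen only for simplicity and costs `n_perm` recomputations of the statistic.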
