Article

Dimension-wise sparse low-rank approximation of a matrix with application to variable selection in high-dimensional integrative analyzes of association

Journal

JOURNAL OF APPLIED STATISTICS
Volume 49, Issue 15, Pages 3889-3907

Publisher

TAYLOR & FRANCIS LTD
DOI: 10.1080/02664763.2021.1967892

Keywords

High dimension low sample size; multimodal data; nuclear norm; sparse canonical correlation analysis

The proposed method characterizes the dominant modes of co-variation between the variables in two datasets while simultaneously performing variable selection. It relies on a sparse, low-rank approximation of a matrix of pairwise association measures between the two sets of variables, and is closely related to sparse canonical correlation analysis (CCA) methods. In simulations, the proposed method achieves better variable selection accuracy than two state-of-the-art sparse CCA algorithms.
Many research proposals involve collecting multiple sources of information from a set of common samples, with the goal of performing an integrative analysis describing the associations between sources. We propose a method that characterizes the dominant modes of co-variation between the variables in two datasets while simultaneously performing variable selection. Our method relies on a sparse, low-rank approximation of a matrix containing pairwise measures of association between the two sets of variables. We show that the proposed method shares a close connection with another group of methods for integrative data analysis: sparse canonical correlation analysis (CCA). Under some assumptions, the proposed method and sparse CCA aim to select the same subsets of variables. We show through simulation that the proposed method can achieve better variable selection accuracy than two state-of-the-art sparse CCA algorithms. Empirically, we demonstrate through the analysis of DNA methylation and gene expression data that the proposed method selects variables whose canonical correlation is as high as or higher than that of the variables selected by sparse CCA methods. This is a rather surprising finding, given that the objective function of the proposed method does not actually maximize the canonical correlation.
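
To make the idea of a sparse, low-rank approximation of a cross-association matrix concrete, the following is a minimal sketch and not the authors' algorithm. It assumes Pearson correlation as the pairwise association measure and swaps in a generic rank-one approximation fitted by alternating soft-thresholded power iterations, rather than the nuclear-norm formulation suggested by the keywords. The function names (`soft_threshold`, `cross_correlation`, `sparse_rank_one`) and the penalty parameters `lam_u` and `lam_v` are hypothetical, chosen only for illustration.

```python
import numpy as np


def soft_threshold(v, lam):
    """Shrink each entry of v toward zero by lam; entries within lam of zero become 0."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def cross_correlation(X, Y):
    """Pearson correlations between columns of X (n x p) and columns of Y (n x q).

    Assumes no column is constant (otherwise its standard deviation is zero).
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    Ys = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    return Xs.T @ Ys / X.shape[0]


def sparse_rank_one(A, lam_u, lam_v, n_iter=100):
    """Approximate A by d * outer(u, v) with sparse unit vectors u and v.

    Illustrative stand-in: alternating soft-thresholded power iterations,
    initialised at the leading right singular vector of A.
    """
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    v = Vt[0]
    for _ in range(n_iter):
        u = soft_threshold(A @ v, lam_u)
        if not u.any():
            raise ValueError("lam_u is too large: every entry of u was zeroed")
        u = u / np.linalg.norm(u)
        v = soft_threshold(A.T @ u, lam_v)
        if not v.any():
            raise ValueError("lam_v is too large: every entry of v was zeroed")
        v = v / np.linalg.norm(v)
    d = u @ A @ v
    return d, u, v


if __name__ == "__main__":
    # Simulated data (assumption): one latent factor drives the first 5 columns
    # of X and the first 4 columns of Y; all other columns are pure noise.
    rng = np.random.default_rng(0)
    n, p, q = 200, 30, 20
    z = rng.standard_normal(n)
    X = rng.standard_normal((n, p))
    Y = rng.standard_normal((n, q))
    X[:, :5] = z[:, None] + 0.5 * rng.standard_normal((n, 5))
    Y[:, :4] = z[:, None] + 0.5 * rng.standard_normal((n, 4))

    A = cross_correlation(X, Y)
    d, u, v = sparse_rank_one(A, lam_u=0.5, lam_v=0.5)
    print("selected X variables:", np.flatnonzero(u))
    print("selected Y variables:", np.flatnonzero(v))
```

In this sketch, the nonzero entries of `u` and `v` play the role of the selected variables in each dataset. The penalties would in practice need tuning (for example by cross-validation), which is not shown here.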
