Integrative sparse partial least squares

Journal

STATISTICS IN MEDICINE
Volume 40, Issue 9, Pages 2239-2256

Publisher

WILEY
DOI: 10.1002/sim.8900

Keywords

contrasted penalization; integrative analysis; partial least squares

Funding

  1. 111 Project [B13028]
  2. Fundamental Research Funds for the Central Universities [20720181003]
  3. National Institutes of Health [CA121974, CA196530]
  4. National Natural Science Foundation of China [11971404, 71988101]

Partial least squares is important for dimension reduction in problems with many variables. The sparse partial least squares technique helps identify important variables and produces more interpretable results. Integrative analysis pools raw data from multiple independent datasets and analyzes them jointly to improve performance.
Partial least squares, as a dimension reduction technique, has become increasingly important for its ability to handle problems with a large number of variables. Because noisy variables may weaken estimation performance, the sparse partial least squares (SPLS) technique has been proposed to identify important variables and generate more interpretable results. However, the small sample size of a single dataset limits the performance of conventional methods. An effective solution is to gather information from multiple comparable studies. Integrative analysis plays an essential role in multi-dataset analysis. The main idea is to improve performance by assembling raw data from multiple independent datasets and analyzing them jointly. In this article, we develop an integrative SPLS (iSPLS) method using penalization based on the SPLS technique. The proposed approach consists of two penalties. The first penalty conducts variable selection in the context of integrative analysis. The second penalty, a contrasted penalty, is imposed to encourage similarity of estimates across datasets and to generate more sensible and accurate results. Computational algorithms are developed. Simulation experiments are conducted to compare iSPLS with alternative approaches. The practical utility of iSPLS is shown in the analysis of two TCGA gene expression datasets.
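
The two-penalty idea described above can be illustrated with a minimal sketch. It is not the authors' algorithm, because the abstract does not give the exact penalty forms or the optimization procedure. As stand-in assumptions, the sketch uses a group soft-threshold across datasets for the selection step and a sum of squared pairwise differences between direction vectors for the contrasted penalty. All function names (`group_soft_threshold`, `contrast_penalty`, `first_directions`) and the tuning parameter `lam` are hypothetical.

```python
import numpy as np

def group_soft_threshold(Z, lam):
    """Shrink each row of Z (one variable across M datasets) as a group.

    Assumption: a group-lasso-style operator stands in for the paper's
    first penalty. Rows with Euclidean norm <= lam become exactly zero,
    so a variable is kept or dropped in all datasets together.
    """
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return Z * scale

def contrast_penalty(W):
    """Sum over dataset pairs of squared differences between directions.

    Assumption: a magnitude-based contrast, used here only to measure how
    dissimilar the per-dataset direction vectors (columns of W) are.
    """
    M = W.shape[1]
    total = 0.0
    for a in range(M):
        for b in range(a + 1, M):
            total += float(np.sum((W[:, a] - W[:, b]) ** 2))
    return total

def first_directions(datasets, lam):
    """Return a (p, M) matrix of sparse first PLS directions.

    datasets: list of (X, y) pairs sharing the same p columns.
    Each column starts as the sample covariance between the centered
    predictors and the centered response, then is group-thresholded
    and rescaled to unit length (all-zero columns are left as zeros).
    """
    Z = np.column_stack([
        (X - X.mean(axis=0)).T @ (y - y.mean()) / len(y)
        for X, y in datasets
    ])
    W = group_soft_threshold(Z, lam)
    col_norms = np.linalg.norm(W, axis=0)
    col_norms[col_norms == 0] = 1.0
    return W / col_norms
```

In this sketch a larger `lam` removes more variables jointly across datasets, and `contrast_penalty` could be added to an objective with its own weight to pull the per-dataset estimates toward each other.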
