Article

All sparse PCA models are wrong, but some are useful. Part II: Limitations and problems of deflation

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS (2021)

Journal

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS

Volume 208, Issue -, Pages -

Publisher

ELSEVIER

DOI: 10.1016/j.chemolab.2020.104212

Keywords

Artifacts; Data interpretation; Exploratory data analysis; Model interpretation; Sparse principal component analysis; Sparsity

Categories

Automation & Control Systems; Chemistry, Analytical; Computer Science, Artificial Intelligence; Instruments & Instrumentation; Mathematics, Interdisciplinary Applications; Statistics & Probability

Funding

Spanish Ministry of Economy and Competitiveness
ERDF (European Regional Development Fund) [TIN2017-83494-R]
Plan Propio de la Universidad de Granada
Netherlands Organisation for Health Research and Development (ZonMW) [456008002]


Summary

Sparse Principal Component Analysis (sPCA) is a matrix factorization approach based on Principal Component Analysis (PCA) that aims to improve data interpretation, particularly for high-dimensional data such as biological omics data. Part I of this series highlighted limitations of state-of-the-art sPCA algorithms when modeling noise-free data. Part II analyzes the drawbacks of sPCA methods that use deflation to calculate subsequent components, and shows that problems in model interpretation can arise even for noise-free data. New diagnostics are proposed to identify modeling issues in real-data analysis.

Abstract

Sparse Principal Component Analysis (sPCA) is a popular matrix factorization approach based on Principal Component Analysis (PCA). It combines variance maximization and sparsity with the ultimate goal of improving data interpretation. A main application of sPCA is handling high-dimensional data, for example biological omics data. In Part I of this series, we illustrated limitations of several state-of-the-art sPCA algorithms when modeling noise-free data simulated following an exact sPCA model. In this Part II, we provide a thorough analysis of the limitations of sPCA methods that use deflation to calculate subsequent, higher-order components. We show, both theoretically and numerically, that deflation can lead to problems in model interpretation, even for noise-free data. In addition, we contribute diagnostics to identify modeling problems in real-data analysis.
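
As a rough illustration of why deflation can behave differently once loadings are sparse, the sketch below compares one deflation step with an ordinary PCA loading and with a sparsified one. It is not the authors' analysis or their diagnostics. The helper names `deflate` and `sparsify` are hypothetical, the data are random, and hard thresholding of the first PCA loading is only an assumed stand-in for a real sPCA algorithm. The quantity checked is whether the scores are orthogonal to the deflated residual, which holds for a PCA loading but generally not for a sparse one.

```python
import numpy as np

def deflate(X, p):
    """Projection deflation: compute scores t = X p and remove t p^T from X."""
    p = p / np.linalg.norm(p)
    t = X @ p                      # scores along the loading p
    return t, X - np.outer(t, p)   # residual after removing the component

def sparsify(p, keep):
    """Hypothetical stand-in for sPCA: keep the `keep` largest-magnitude
    entries of p, zero the rest, and renormalize to unit length."""
    q = np.zeros_like(p)
    idx = np.argsort(np.abs(p))[-keep:]
    q[idx] = p[idx]
    return q / np.linalg.norm(q)

# Random, column-centered data (an assumption made for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 6)) @ rng.standard_normal((6, 6))
X -= X.mean(axis=0)

# First PCA loading from the SVD, and a sparsified version of it.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
p_dense = Vt[0]
p_sparse = sparsify(p_dense, keep=2)

t_dense, E_dense = deflate(X, p_dense)
t_sparse, E_sparse = deflate(X, p_sparse)

# "Leak": how far the scores are from being orthogonal to the residual.
leak_dense = np.linalg.norm(t_dense @ E_dense)
leak_sparse = np.linalg.norm(t_sparse @ E_sparse)
print("PCA loading leak:   ", leak_dense)
print("sparse loading leak:", leak_sparse)
```

In this sketch the residual is orthogonal to the loading in both cases, but only the PCA loading leaves a residual that is also orthogonal to the scores. With the sparse loading, part of what the first component describes remains in the residual, so variance attributed to later components computed by deflation is no longer cleanly separated from the first.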
