Article

Principal Component Analysis: A Natural Approach to Data Exploration

Journal

ACM Computing Surveys
Volume 54, Issue 4, Pages -

Publisher

Association for Computing Machinery
DOI: 10.1145/3447755

Keywords

Statistical methods; principal component analysis; dimensionality reduction; data visualization; covariance and correlation

Funding

  1. CAPES [001]
  2. CNPq [307333/2013-2, 140442/2019-7, 158128/2017-6]
  3. Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) [11/50761-2, 2015/22308-2, 2018/10489-0, 2019/16223-5, 18/09125-4, 16/19069-9, 17/13464-6]
  4. NAP-PRP-USP

PCA is widely used for data analysis across many fields. This work presents its theoretical and practical aspects in an accessible manner, covering the basic principles, data standardization, visualization of results, and outlier detection, along with the use of PCA for dimensionality reduction. It also summarizes PCA-related approaches and aims to help researchers from diverse areas use and interpret PCA effectively.

Principal component analysis (PCA) is often applied to analyze data in a wide range of areas. This work reports, in an accessible and integrated manner, several theoretical and practical aspects of PCA. The basic principles underlying PCA, data standardization, possible visualizations of the PCA results, and outlier detection are addressed in turn. Next, the potential of using PCA for dimensionality reduction is illustrated on several real-world datasets. Finally, we summarize PCA-related approaches and other dimensionality reduction techniques. Overall, the objective of this work is to assist researchers from a wide range of areas in using and interpreting PCA.
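
For readers who want a concrete picture of the steps the abstract names (standardization, principal components, dimensionality reduction), the following is a minimal sketch in Python with NumPy. It is not taken from the paper: the function name pca, its parameters, and the synthetic data are assumptions made purely for illustration, and it uses the textbook route of eigendecomposing the covariance matrix of standardized features.

```python
import numpy as np

def pca(X, n_components=2, standardize=True):
    """Illustrative PCA sketch (hypothetical helper, not the paper's code)."""
    X = np.asarray(X, dtype=float)
    # Center each feature; optionally scale it to unit variance,
    # which makes the analysis correlation-based rather than covariance-based.
    Z = X - X.mean(axis=0)
    if standardize:
        std = X.std(axis=0, ddof=1)
        Z = Z / np.where(std > 0, std, 1.0)  # leave constant features unscaled
    # Covariance matrix of the (standardized) features.
    C = np.cov(Z, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]  # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Fraction of the total variance carried by each retained component.
    explained = eigvals[:n_components] / eigvals.sum()
    # Dimensionality reduction: project the data onto the leading components.
    scores = Z @ eigvecs[:, :n_components]
    return scores, explained

# Synthetic data (an assumption for illustration): three features,
# two of which are strongly correlated.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = np.column_stack([a, 2 * a + 0.1 * rng.normal(size=200), rng.normal(size=200)])

scores, explained = pca(X, n_components=2)
print(scores.shape)
print(explained)
```

In this sketch the two correlated features collapse largely onto one component, so two components retain nearly all of the variance of the three standardized features. Whether to standardize depends on whether the features share comparable units and scales.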
