Article

Teaching Principal Component Analysis Using a Free and Open Source Software Program and Exercises Applying PCA to Real-World Examples

Journal

JOURNAL OF CHEMICAL EDUCATION
Volume 97, Issue 6, Pages 1666-1676

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jchemed.9b00924

Keywords

Upper-Division Undergraduate; Analytical Chemistry; Chemoinformatics; Interdisciplinary/Multidisciplinary; Calculator-Based Learning; Chemometrics; Computational Chemistry

Funding

  1. FAPESC (Fundação de Amparo à Pesquisa do Estado de Santa Catarina)
  2. CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico) [399402226/2016-0]
  3. CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior)

Principal component analysis (PCA) is one of the most important and powerful methods in chemometrics as well as in a wealth of other areas. Running a PCA results in two main elements, the score plot and the loading plot; the score plot provides the location of the samples, and the loading plot indicates correlations among variables, the trends in the grouping of samples, and the most important variables. Over the past 10 years of teaching chemometrics, we have struggled with the lack of free software with an easy-to-use graphical user interface for data handling and calculations. In this paper, we provide a series of examples that students used to carry out PCA in R-Project, a free and open source software program. In the first example, students used PCA to find correlations among chemical properties of chemical elements and relate these properties to the periodic distribution of the elements. In the second example, meat samples were grouped using 14 variables, and students could observe how outlier samples might influence the PCA model; in this case, they were also taught how to use the t-test to choose the variables that were significant to the PCA model. In the third example, healthy patients were differentiated from diabetic patients using 163 lipid concentrations. In the fourth example, Atlantic salmon samples were differentiated from catfish samples. In the fifth and sixth examples, students were able to observe how data treatment affects the classification of natural products and edible oils, respectively.
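As a rough illustration of how scores and loadings come out of a PCA, the sketch below autoscales a small data table and extracts two components with the NIPALS algorithm. It is written in plain Python and is not the R workflow used in the paper. The choice of NIPALS, the function names `autoscale` and `nipals_pca`, and the toy data (six samples, three variables) are all assumptions made for illustration only.

```python
import math

def autoscale(rows):
    """Mean-center each column and scale it to unit sample standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def nipals_pca(X, n_components=2, tol=1e-12, max_iter=1000):
    """Return (scores, loadings) for an already-scaled matrix X using NIPALS.

    scores[k] holds one value per sample (row) for component k;
    loadings[k] holds one value per variable (column) for component k.
    """
    E = [row[:] for row in X]  # residual matrix, deflated after each component
    n_rows, n_cols = len(E), len(E[0])
    scores, loadings = [], []
    for _component in range(n_components):
        # Initial score vector: the residual column with the largest sum of squares.
        start = max(range(n_cols), key=lambda j: sum(row[j] ** 2 for row in E))
        t = [row[start] for row in E]
        for _iteration in range(max_iter):
            tt = sum(v * v for v in t)
            # Loading vector: E^T t / (t^T t), then normalised to unit length.
            load = [sum(E[i][j] * t[i] for i in range(n_rows)) / tt
                    for j in range(n_cols)]
            norm = math.sqrt(sum(v * v for v in load))
            load = [v / norm for v in load]
            # Updated score vector: E times the loading vector.
            t_new = [sum(E[i][j] * load[j] for j in range(n_cols))
                     for i in range(n_rows)]
            change = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if change < tol * sum(v * v for v in t):
                break
        scores.append(t)
        loadings.append(load)
        # Deflation: remove the part of E explained by this component.
        E = [[E[i][j] - t[i] * load[j] for j in range(n_cols)]
             for i in range(n_rows)]
    return scores, loadings

# Hypothetical toy data (invented for this sketch): rows are samples,
# columns are three measured variables.
samples = [
    [1.0, 2.1, 9.0],
    [2.0, 3.9, 7.5],
    [3.0, 6.2, 8.1],
    [4.0, 8.1, 3.0],
    [5.0, 9.8, 4.2],
    [6.0, 12.1, 1.1],
]

scores, loadings = nipals_pca(autoscale(samples), n_components=2)

# Fraction of total variance per component (total is 3 after autoscaling).
n_samples = len(samples)
explained = [sum(v * v for v in t) / (n_samples - 1) / 3 for t in scores]

for k in range(2):
    print(f"PC{k + 1} loadings:", [round(v, 3) for v in loadings[k]])
    print(f"PC{k + 1} scores:  ", [round(v, 3) for v in scores[k]])
    print(f"PC{k + 1} explained variance fraction:", round(explained[k], 3))
```

Plotting `scores[0]` against `scores[1]` would give a score plot (one point per sample), and plotting `loadings[0]` against `loadings[1]` would give a loading plot (one point per variable). In this toy table the first two variables rise together and the third falls, so their PC1 loadings should carry opposite signs. The overall sign of a component is arbitrary.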
