4.6 Article

Understanding the molecular information contained in principal component analysis of vibrational spectra of biological systems

Journal

ANALYST
Volume 137, Issue 2, Pages 322-332

Publisher

ROYAL SOC CHEMISTRY
DOI: 10.1039/c1an15821j

Funding

  1. National Biophotonics and Imaging Platform (NBIP) Ireland
  2. Higher Education Authority
  3. Irish Government
  4. European Union

K-means clustering followed by Principal Component Analysis (PCA) is employed to analyse Raman spectroscopic maps of single biological cells. K-means clustering successfully identifies regions of cellular cytoplasm, nucleus and nucleoli, but the mean spectra do not differentiate their biochemical composition. The loadings of the principal components identified by PCA shed further light on the spectral basis for differentiation, but they are complex and, because the number of spectra per cluster is imbalanced, particularly in the case of the nucleoli, the loadings under-represent the basis for differentiation of some cellular regions. Analysis of pure bio-molecules, both structurally and spectrally distinct, in the case of histone, ceramide and RNA, and similarly in the case of the proteins albumin, collagen and histone, shows the relatively strong representation of spectrally sharp features in the spectral loadings, and the systematic variation of the loadings as one cluster is reduced in number. The more complex cellular environment is simulated by weighted sums of spectra, illustrating that although the loadings become increasingly complex, their origin in a weighted sum of the constituent molecular components is still evident. Returning to the cellular analysis, the number of spectra per cluster is artificially balanced by increasing the weighting of the spectra of the smaller clusters. While this renders the PCA loadings more complex for the three-way analysis, a pairwise analysis illustrates clear differences between the identified subcellular regions, and notably the molecular differences between nuclear and nucleolar regions are elucidated. Overall, the study demonstrates how appropriate consideration of the data available can improve the understanding of the information delivered by PCA.
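
The effect of cluster imbalance on the loadings can be illustrated with a minimal synthetic sketch. It is not the authors' procedure: the band positions, band heights, noise level, cluster sizes and the helper names `band` and `first_loading` are all assumptions made for illustration, and the balancing is implemented here as a weighted covariance in which each spectrum is weighted by the inverse of its cluster size.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(400, 1800, 281)  # hypothetical wavenumber axis (cm^-1), 5 cm^-1 steps

def band(center, width=15.0, height=1.0):
    """Gaussian band standing in for a Raman peak (illustrative only)."""
    return height * np.exp(-0.5 * ((x - center) / width) ** 2)

# Assumed mean spectra: a shared broad background plus one distinctive band each.
background = band(1100, width=300.0, height=0.5)
means = {
    "cytoplasm": background + band(1450),
    "nucleus": background + band(1000),
    "nucleoli": background + band(785, height=2.0),  # sharp, strong band
}
counts = {"cytoplasm": 400, "nucleus": 200, "nucleoli": 10}  # deliberately imbalanced

# Simulated map: each cluster's mean spectrum plus small Gaussian noise.
spectra = np.vstack([
    means[name] + 0.02 * rng.standard_normal((n, x.size))
    for name, n in counts.items()
])
labels = np.repeat(list(counts), list(counts.values()))

def first_loading(data, weights):
    """Loading of the first principal component of a weighted covariance matrix."""
    w = weights / weights.sum()
    centred = data - w @ data                   # subtract the weighted mean spectrum
    cov = (centred * w[:, None]).T @ centred    # weighted covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    return eigvecs[:, -1]                       # eigenvector of the largest eigenvalue

uniform = np.ones(len(labels))                              # every spectrum counts equally
balanced = np.array([1.0 / counts[name] for name in labels])  # every cluster counts equally

pc1_raw = first_loading(spectra, uniform)
pc1_bal = first_loading(spectra, balanced)

i785 = int(np.argmin(np.abs(x - 785)))  # channel of the assumed nucleolar band
print("PC1 loading at 785 cm^-1, unweighted:", abs(pc1_raw[i785]))
print("PC1 loading at 785 cm^-1, balanced:  ", abs(pc1_bal[i785]))
```

In this toy case the unweighted first loading is dominated by the difference between the two large clusters, so the band assigned to the smallest cluster barely appears. Once each cluster carries equal total weight, that band contributes strongly. Replicating the spectra of the smaller clusters until the counts match would be an alternative way to apply the same weighting.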
