4.6 Article

Visualizing Population Structures by Multidimensional Scaling of Smoothed PCA-Transformed Data

Journal

IEEE ACCESS
Volume 11, Issue -, Pages 13594-13604

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2023.3243573

Keywords

Data visualization; Principal component analysis; Sociology; Dimensionality reduction; Smoothing methods; Topology; Periodic structures; multidimensional scaling; PCA; population structure; single nucleotide polymorphisms

Population structure can be revealed using Single Nucleotide Polymorphisms (SNPs). Principal Component Analysis (PCA) has been widely used for visualizing SNP data, but other dimensionality reduction methods can be more successful at revealing complex population structures. However, these techniques often struggle to preserve the global structure in SNP data or have a high computational cost. In this study, a method called Multidimensional Scaling (MDS) of smoothed PCA-transformed data (MSSPD) is proposed. It reveals population structures in 2D maps, preserves global structure better than other techniques, and is comparable in computational efficiency to the fastest SNP visualization methods.
Population structure can be revealed using Single Nucleotide Polymorphisms (SNPs), which are genetic variations found in the DNA sequences of individuals. Because the number of SNPs is large, SNP data is usually visualized through dimensionality reduction. Although Principal Component Analysis (PCA) has been used extensively for SNP data visualization, some other dimensionality reduction methods have been shown to be more successful in revealing complex population structures. Nevertheless, these techniques often suffer either from a reduced ability to preserve the global structure in the SNP data, namely the relative genetic distance between subpopulations, or from a high computational cost. In this work, a method which uses Multidimensional Scaling (MDS) of smoothed PCA-transformed data (MSSPD) is proposed. MSSPD successfully reveals population structures in 2D maps, while being more effective than other techniques in preserving the global structure. In terms of computational efficiency, MSSPD is comparable to the fastest SNP visualization methods.
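
The three stages named in the abstract (PCA transform, smoothing, MDS) can be illustrated with a minimal NumPy sketch. The abstract does not specify the smoothing operator, the MDS variant, or any parameter values, so everything below is an assumption made for illustration: smoothing is taken to be averaging each sample with its k nearest neighbours in PC space, MDS is taken to be classical (Torgerson) MDS, and the function names, `n_components=10`, `k=5`, and the synthetic genotype matrix are all hypothetical. It should not be read as the authors' implementation.

```python
import numpy as np

def pca_transform(X, n_components):
    """Project mean-centred rows of X onto the top principal components."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def knn_smooth(Z, k):
    """ASSUMED smoothing: replace each point by the mean of itself and its
    k nearest neighbours. The paper's actual smoothing step may differ."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T  # squared distances
    idx = np.argsort(d2, axis=1)[:, : k + 1]        # self plus k neighbours
    return Z[idx].mean(axis=1)

def classical_mds(Z, n_dims=2):
    """ASSUMED MDS variant: classical (Torgerson) MDS on Euclidean distances."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    n = d2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n             # centring matrix
    B = -0.5 * J @ d2 @ J                           # double-centred Gram matrix
    w, V = np.linalg.eigh(B)
    order = np.argsort(w)[::-1][:n_dims]            # largest eigenvalues first
    return V[:, order] * np.sqrt(np.maximum(w[order], 0.0))

def mds_of_smoothed_pca(X, n_components=10, k=5, n_dims=2):
    """Hypothetical pipeline: PCA, then smoothing, then MDS to n_dims."""
    return classical_mds(knn_smooth(pca_transform(X, n_components), k), n_dims)

# Synthetic stand-in for SNP data: 3 populations of 30 individuals each,
# 200 SNPs coded as 0/1/2 allele counts with population-specific frequencies.
rng = np.random.default_rng(0)
freqs = rng.uniform(0.1, 0.9, size=(3, 200))
labels = np.repeat(np.arange(3), 30)
X = rng.binomial(2, freqs[labels]).astype(float)

Y = mds_of_smoothed_pca(X)  # one 2D coordinate pair per individual
```

With real data, `X` would be an individuals-by-SNPs genotype matrix, and `Y` could be drawn as a scatter plot coloured by population label. Note that classical MDS applied to Euclidean distances gives the same embedding as PCA of its input, so a different MDS variant (for example, a stress-minimizing one) would behave differently from this sketch.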
