Proceedings Paper

Visualizing Population Substructures Using Multidimensional Scaling and Data Smoothing

Publisher

IEEE
DOI: 10.1109/CSCI54926.2021.00127

Keywords

multidimensional scaling; principal component analysis; t-distributed stochastic neighbor embedding; single nucleotide polymorphisms; population structure analysis

This study compares two visualization techniques, MDS and t-SNE, for detecting population substructures in genetic data. While both methods successfully reveal substructures in 2D, the MDS-based method is better at preserving relative similarity between populations.
Single Nucleotide Polymorphisms (SNPs) are an important component of a genome's information and have been used extensively in genetics for population structure analysis. Visualizing SNP data helps detect population substructures. However, SNP sequences contain thousands or millions of data points, so one way to visualize them is through dimensionality reduction. Principal Component Analysis (PCA) has traditionally been used to reduce dimensionality to 2D or 3D with reasonably acceptable outcomes, but visualizing complex population structures requires more advanced techniques. Recently, t-Distributed Stochastic Neighbor Embedding (t-SNE) has been used for SNP visualization. In this work, a Multidimensional Scaling (MDS)-based method is presented and compared with t-SNE. Although both techniques successfully reveal population substructures in 2D, the MDS-based method better preserves the relative similarity between populations.
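
To make the idea of reducing SNP data to 2D more concrete, the following is a minimal sketch of classical (Torgerson) MDS applied to simulated genotypes. It is not the authors' method: the abstract does not specify their MDS variant or the data smoothing step, so neither is reproduced here. The toy data, the Euclidean distance on 0/1/2 genotype codes, and the helper names `simulate_genotypes`, `pairwise_euclidean`, and `classical_mds` are all assumptions made for illustration.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS on a square matrix of pairwise distances."""
    n = dist.shape[0]
    # Double-center the squared distances: B = -1/2 * J * D^2 * J
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the largest ones
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip tiny negative eigenvalues caused by floating-point error
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


def pairwise_euclidean(x):
    """Euclidean distance between all pairs of rows of x."""
    x = np.asarray(x, dtype=float)
    sq = (x ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.sqrt(np.clip(d2, 0.0, None))


def simulate_genotypes(rng, n_per_pop=20, n_snps=500):
    """Hypothetical toy data: two populations with shifted allele frequencies."""
    freqs_a = rng.uniform(0.1, 0.9, n_snps)
    freqs_b = np.clip(freqs_a + rng.normal(0.0, 0.3, n_snps), 0.05, 0.95)
    # Each genotype is the count (0, 1, or 2) of the alternate allele
    pop_a = rng.binomial(2, freqs_a, size=(n_per_pop, n_snps))
    pop_b = rng.binomial(2, freqs_b, size=(n_per_pop, n_snps))
    labels = np.array([0] * n_per_pop + [1] * n_per_pop)
    return np.vstack([pop_a, pop_b]), labels


rng = np.random.default_rng(0)
genotypes, labels = simulate_genotypes(rng)
dist = pairwise_euclidean(genotypes)
embedding = classical_mds(dist, n_components=2)  # one 2D point per individual
```

Each row of `embedding` can then be drawn as a point in a scatter plot colored by `labels`. Because classical MDS tries to reproduce the input distances directly, distances between groups in the plot reflect distances in the original data. A neighbor-based method such as t-SNE does not aim to preserve that property.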
