4.7 Article

Chromosome-scale inference of hybrid speciation and admixture with convolutional neural networks

Journal

MOLECULAR ECOLOGY RESOURCES
Volume 21, Issue 8, Pages 2676-2688

Publisher

WILEY
DOI: 10.1111/1755-0998.13355

Keywords

admixture; convolutional neural networks; deep learning; gene flow; hybridization; model selection

Funding

  1. National Institute of General Medical Sciences [R01GM127348]
  2. National Science Foundation [IOS-1811784]

Summary: To understand the process of speciation and uncover reticulate phylogenetic patterns, the authors use convolutional neural networks (CNNs), a deep learning method, to infer the frequency and mode of hybridization among closely related organisms. The approach selects among candidate hybridization scenarios and is aimed at closely linked data, where nonindependence among sites must be taken into account.
Inferring the frequency and mode of hybridization among closely related organisms is an important step for understanding the process of speciation and can help to uncover reticulated patterns of phylogeny more generally. Phylogenomic methods to test for the presence of hybridization come in many varieties and typically operate by leveraging expected patterns of genealogical discordance in the absence of hybridization. An important assumption made by these tests is that the data (genes or SNPs) are independent given the species tree. However, when the data are closely linked, it is especially important to consider their nonindependence. Recently, deep learning techniques such as convolutional neural networks (CNNs) have been used to perform population genetic inferences with linked SNPs coded as binary images. Here, we use CNNs for selecting among candidate hybridization scenarios using the tree topology (((P1, P2), P3), Out) and a matrix of pairwise nucleotide divergence (d_XY) calculated in windows across the genome. Training and independently testing a neural network on coalescent simulations showed that our method, HyDe-CNN, was able to accurately perform model selection for hybridization scenarios across a wide breadth of parameter space. We then used HyDe-CNN to test models of admixture in Heliconius butterflies and compared it to phylogeny-based introgression statistics. Given the flexibility of our approach, the dropping cost of long-read sequencing and the continued improvement of CNN architectures, we anticipate that inferences of hybridization using deep learning methods like ours will help researchers to better understand patterns of admixture in their study organisms.
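
As an illustration of the kind of input the abstract describes, the following is a minimal sketch of computing pairwise nucleotide divergence (d_XY) between populations in non-overlapping windows. It is not the HyDe-CNN implementation: the function names `dxy` and `windowed_dxy`, the coding of haplotypes as equal-length 0/1 strings, the window size, and the toy data are all assumptions made for this example.

```python
from itertools import combinations


def dxy(pop_x, pop_y):
    """Mean per-site difference over all haplotype pairs, one from each population."""
    n_sites = len(pop_x[0])
    total = 0
    for hx in pop_x:
        for hy in pop_y:
            # Count sites at which the two haplotypes differ.
            total += sum(a != b for a, b in zip(hx, hy))
    return total / (len(pop_x) * len(pop_y) * n_sites)


def windowed_dxy(pops, window_size):
    """Return {(name_x, name_y): [d_XY per window]} for non-overlapping windows.

    `pops` maps a population name to a list of equal-length haplotype strings
    (hypothetical input format chosen for this sketch).
    """
    names = sorted(pops)
    n_sites = len(pops[names[0]][0])
    result = {pair: [] for pair in combinations(names, 2)}
    # Trailing sites that do not fill a whole window are ignored.
    for start in range(0, n_sites - window_size + 1, window_size):
        end = start + window_size
        for x, y in result:
            wx = [h[start:end] for h in pops[x]]
            wy = [h[start:end] for h in pops[y]]
            result[(x, y)].append(dxy(wx, wy))
    return result


# Toy data: two haplotypes per population, eight sites, two windows of four.
pops = {
    "P1": ["00000000", "00010000"],
    "P2": ["00000000", "00000001"],
    "P3": ["11110000", "11110011"],
}
matrix = windowed_dxy(pops, window_size=4)
for pair, values in matrix.items():
    print(pair, values)
```

In this toy example, P3 is highly divergent from P1 and P2 in the first window but much closer in the second; window-to-window variation of this sort is the signal a window-based d_XY matrix is meant to expose. How such values are arranged into an image for the network is not specified in the abstract and is left out here.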
