Article

Phylogeographic model selection using convolutional neural networks

Journal

MOLECULAR ECOLOGY RESOURCES
Volume 21, Issue 8, Pages 2661-2675

Publisher

WILEY
DOI: 10.1111/1755-0998.13427

Keywords

convolutional neural networks; deep learning; machine learning; Norops spp; phylogeography

Funding

  1. Conselho Nacional de Desenvolvimento Científico e Tecnológico [305535/2017-0]
  2. National Science Foundation [DBI 1661029, DEB 1831319]
  3. Ohio Supercomputer Center [PAA0202]
  4. USAID's PEER program [AID-OAA-A-11-00012]
  5. Coordenação de Aperfeiçoamento de Pessoal de Nível Superior [88881.170016/2018]

Phylogeography has rapidly expanded the range of analytical tools it uses to analyse large genomic data sets. This study shows that convolutional neural networks (CNNs) can discriminate among demographic models in South American lizards of the genus Norops: when three demographic scenarios were compared within each of four lineages, overall model accuracy exceeded 98% for every lineage, and a single model was identified from a broader set of 26 models with an estimated overall accuracy of 87%. These results highlight the potential of CNNs as an addition to the phylogeographer's toolkit.

The discipline of phylogeography has evolved rapidly in terms of the analytical toolkit used to analyse large genomic data sets. Despite substantial advances, analytical tools that could potentially address the challenges posed by increased model complexity have not been fully explored. For example, deep learning techniques are underutilized for phylogeographic model selection. In non-model organisms, the lack of information about their ecology and evolution can lead to uncertainty about which demographic models are appropriate. Here, we assess the utility of convolutional neural networks (CNNs) for assessing demographic models in South American lizards in the genus Norops. Three demographic scenarios (constant, expansion, and bottleneck) were considered for each of four inferred population-level lineages, and we found that the overall model accuracy was higher than 98% for all lineages. We then evaluated a set of 26 models that accounted for evolutionary relationships, gene flow, and changes in effective population size among the four lineages, identifying a single model with an estimated overall accuracy of 87% when using CNNs. The inferred demography of the lizard system suggests that gene flow between non-sister populations and changes in effective population sizes through time, probably in response to Pleistocene climatic oscillations, have shaped genetic diversity in this system. Approximate Bayesian computation (ABC) was applied to provide a comparison to the performance of CNNs. ABC was unable to identify a single model among the larger set of 26 models in the subsequent analysis. Our results demonstrate that CNNs can be easily and usefully incorporated into the phylogeographer's toolkit.
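
The ABC comparison mentioned above rests on a simple idea: simulate data under each candidate model, keep the simulations whose summary statistics fall close to the observed ones, and read approximate posterior model probabilities from the share of accepted simulations per model. The sketch below illustrates only that rejection step, under loud assumptions. The function names (`simulate_stat`, `abc_model_choice`), the single toy summary statistic, the per-model means, and the tolerance are all hypothetical; a real analysis would use a coalescent simulator and many summary statistics. It is not the authors' pipeline.

```python
import random


def simulate_stat(model, rng):
    """Toy stand-in for a demographic simulator (hypothetical).

    Returns a single summary statistic drawn from a normal distribution
    whose mean depends on the model. The means are invented for
    illustration and carry no biological meaning.
    """
    centers = {"constant": 0.0, "expansion": -1.5, "bottleneck": 1.5}
    return rng.gauss(centers[model], 0.5)


def abc_model_choice(observed, models, n_sims=3000, tolerance=0.1, seed=1):
    """Rejection ABC for model choice with a uniform prior over models."""
    rng = random.Random(seed)
    accepted = {m: 0 for m in models}
    for _ in range(n_sims):
        model = rng.choice(models)  # draw a model from the uniform prior
        # Accept the simulation if its statistic is close to the observed one.
        if abs(simulate_stat(model, rng) - observed) <= tolerance:
            accepted[model] += 1
    total = sum(accepted.values())
    if total == 0:
        raise ValueError("no simulations accepted; raise tolerance or n_sims")
    # Share of accepted simulations per model approximates its posterior.
    return {m: accepted[m] / total for m in models}


models = ["constant", "expansion", "bottleneck"]
posterior = abc_model_choice(observed=-1.4, models=models)
best = max(posterior, key=posterior.get)
print(best, posterior)
```

With many candidate models, the accepted simulations are spread across more alternatives, so posterior support can be split among several similar models. That is one intuition for why rejection-based model choice may fail to single out one model in a large model set.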
