Article

An assembly and alignment-free method of phylogeny reconstruction from next-generation sequencing data

Journal

BMC GENOMICS
Volume 16

Publisher

BMC
DOI: 10.1186/s12864-015-1647-5

Keywords

k-mers; Phylogenomics; Homoplasy; Alignment-free; Assembly-free

Funding

  1. Science Foundation of the Chinese Academy of Sciences 135 program [XTBG-T01]
  2. HPC Center, Kunming Institute of Botany, CAS, China
  3. UW-Madison Graduate School
  4. Bascom-Plaenert fund
  5. US-NSF [DEB-0816613]
  6. Yunnan Provincial Government High Level Talent Introduction grant [09SK051B01]
  7. Chinese Academy of Sciences [151C53WJQNKXJ20110008]

Background: Next-generation sequencing technologies are rapidly generating whole-genome datasets for an increasing number of organisms. However, phylogenetic reconstruction of genomic data remains difficult because de novo assembly for non-model genomes and multi-genome alignment are challenging.

Results: To greatly simplify the analysis, we present an Assembly and Alignment-Free (AAF) method (https://sourceforge.net/projects/aaf-phylogeny) that constructs phylogenies directly from unassembled genome sequence data, bypassing both genome assembly and alignment. Using mathematical calculations, models of sequence evolution, and simulated sequencing of published genomes, we address both evolutionary and sampling issues caused by direct reconstruction, including homoplasy, sequencing errors, and incomplete sequencing coverage. From these results, we calculate the statistical properties of the pairwise distances between genomes, allowing us to optimize parameter selection and perform bootstrapping. As a test case with real data, we successfully reconstructed the phylogeny of 12 mammals using raw sequencing reads. We also applied AAF to 21 tropical tree genome datasets with low coverage to demonstrate its effectiveness on non-model organisms.

Conclusion: Our AAF method opens up phylogenomics for species without an appropriate reference genome or high sequence coverage, and rapidly creates a phylogenetic framework for further analysis of genome structure and diversity among non-model organisms.
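To illustrate the general idea of computing pairwise distances from unassembled reads, here is a minimal Python sketch based on shared k-mers. It is not the AAF implementation: the function names (`read_kmers`, `sample_kmers`, `kmer_distance`), the toy reads, and the distance formula (negative log of the shared k-mer fraction, divided by k) are assumptions chosen for illustration. The abstract does not state the exact formula AAF uses, and the sketch ignores sequencing errors, reverse complements, and coverage effects.

```python
import math


def read_kmers(read, k):
    """Return the set of all length-k substrings of one read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def sample_kmers(reads, k):
    """Pool the k-mers of every read in one sample; no assembly is needed."""
    pooled = set()
    for read in reads:
        pooled |= read_kmers(read.upper(), k)
    return pooled


def kmer_distance(kmers_a, kmers_b, k):
    """Hypothetical distance: -ln(shared / smaller set size) / k.

    Assumption for illustration only; returns infinity when the two
    samples share no k-mers.
    """
    shared = len(kmers_a & kmers_b)
    total = min(len(kmers_a), len(kmers_b))
    if shared == 0 or total == 0:
        return math.inf
    return -math.log(shared / total) / k


# Toy example with made-up reads for two samples.
k = 3
sample_a = sample_kmers(["ACGTACGT"], k)
sample_b = sample_kmers(["ACGTTCGT"], k)
d_ab = kmer_distance(sample_a, sample_b, k)
```

A matrix of such pairwise distances could then be passed to any standard distance-based tree-building method, such as neighbor joining, to obtain a phylogeny.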
