Journal
GENOME
Volume 60, Issue 9, Pages 743-755
Publisher
CANADIAN SCIENCE PUBLISHING
DOI: 10.1139/gen-2016-0202
Keywords
de novo clustering; GATK; phylogenetics; PyRAD; restriction-site associated DNA sequencing; variant discovery
Funding
- National Science Foundation [DEB-1146488, DEB-1146102, IOS-1444611]
The emergence of next-generation sequencing has increased the amount of data available for phylogenetics by several orders of magnitude. Reduced-representation approaches, such as restriction-site associated DNA sequencing (RADseq), have proven useful for phylogenetic studies of non-model species at a wide range of phylogenetic depths. However, analysis of these datasets is not uniform, and we know little about the potential benefits and drawbacks of de novo assembly versus assembly by mapping to a reference genome. Using RADseq data for 83 oak samples representing 16 taxa, we identified variants via three pipelines: mapping sequence reads to a recently published draft genome of Quercus lobata, and de novo assembly under two sets of locus filters. For each pipeline, we inferred the maximum likelihood phylogeny. All pipelines produced similar trees, with minor shifts in relationships within well-supported clades, even though they yielded different numbers of loci (68 000 to 111 000 loci) and different degrees of overlap with the reference genome. We conclude that both the reference-aligned and de novo assembly pipelines yield reliable results, and that the advantages and disadvantages of these approaches pertain mainly to downstream uses of RADseq data, not to phylogenetic inference per se.