4.3 Article

McClintock: An Integrated Pipeline for Detecting Transposable Element Insertions in Whole-Genome Shotgun Sequencing Data

期刊

G3-GENES GENOMES GENETICS
卷 7, 期 8, 页码 2763-2778

出版社

OXFORD UNIV PRESS INC
DOI: 10.1534/g3.117.043893

关键词

transposable elements; bioinformatics; genomics; yeast

资金

  1. Wellcome Trust [096602/B/11/Z]
  2. Human Frontier Science Program grant [RGY0093/2012]

向作者/读者索取更多资源

Transposable element (TE) insertions are among the most challenging types of variants to detect in genomic data because of their repetitive nature and complex mechanisms of replication. Nevertheless, the recent availability of large resequencing data sets has spurred the development of many new methods to detect TE insertions in whole-genome shotgun sequences. Here we report an integrated bioinformatics pipeline for the detection of TE insertions in whole-genome shotgun data, called McClintock (https://github.com/bergmanlab/mcclintock), which automatically runs and standardizes output for multiple TE detection methods. We demonstrate the utility of McClintock by evaluating six TE detection methods using simulated and real genome data from the model microbial eukaryote, Saccharomyces cerevisiae. We find substantial variation among McClintock component methods in their ability to detect nonreference TEs in the yeast genome, but show that nonreference TEs at nearly all biologically realistic locations can be detected in simulated data by combining multiple methods that use split-read and read-pair evidence. In general, our results reveal that split-read methods detect fewer nonreference TE insertions than read-pair methods, but generally have much higher positional accuracy. Analysis of a large sample of real yeast genomes reveals that most McClintock component methods can recover known aspects of TE biology in yeast such as the transpositional activity status of families, target preferences, and target site duplication structure, albeit with varying levels of accuracy. Our work provides a general framework for integrating and analyzing results from multiple TE detection methods, as well as useful guidance for researchers studying TEs in yeast resequencing data.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.3
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据