Article

Comprehensive identification of transposable element insertions using multiple sequencing technologies

Journal

NATURE COMMUNICATIONS
Volume 12, Issue 1

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41467-021-24041-8

Funding

  1. National Institute of Mental Health [U01MH106883]
  2. National Cancer Institute [R03CA249364]
  3. National Library of Medicine [T15LM007092]

Identification of transposable element (TE) insertions from whole genome sequencing data remains challenging. However, the xTea tool developed by the authors provides a comprehensive solution for both short-read and long-read sequencing data, outperforming other methods and enabling creation of a catalogue of TE insertions.
Transposable elements (TEs) help shape the structure and function of the human genome. When inserted into some locations, TEs may disrupt gene regulation and cause diseases. Here, we present xTea (x-Transposable element analyzer), a tool for identifying TE insertions in whole-genome sequencing data. Whereas existing methods are mostly designed for short-read data, xTea can be applied to both short-read and long-read data. Our analysis shows that xTea outperforms other short read-based methods for both germline and somatic TE insertion discovery. With long-read data, we created a catalogue of polymorphic insertions with full assembly and annotation of insertional sequences for various types of retroelements, including pseudogenes and endogenous retroviruses. Notably, we find that individual genomes have an average of nine groups of full-length L1s in centromeres, suggesting that centromeres and other highly repetitive regions such as telomeres are a significant yet unexplored source of active L1s. xTea is available at https://github.com/parklab/xTea.
