Article

Complete vertebrate mitogenomes reveal widespread repeats and gene duplications

Journal

GENOME BIOLOGY
Volume 22, Issue 1

Publisher

BMC
DOI: 10.1186/s13059-021-02336-9

Keywords

Mitochondrial DNA; Vertebrate; Assembly; Long reads; Sequencing; Duplications; Repeats

Funding

  1. Intramural Research Program of the National Human Genome Research Institute, National Institutes of Health
  2. Korea Health Technology R&D Project through KHIDI - Ministry of Health & Welfare, Republic of Korea [HI17C2098]
  3. Al-Gannas Qatari Society
  4. Cultural Village Foundation-Katara, Doha, State of Qatar
  5. Monash University Malaysia
  6. Rockefeller University start-up funds
  7. Howard Hughes Medical Institute
  8. Fondazione Cariplo [2018-2045]
  9. Italian Ministry of Education, University and Research (MIUR) [PRIN2017 20174BTC4R]
  10. Wellcome grant [WT207492]

This study developed a fully automated pipeline that successfully assembled complete mitochondrial genomes of 100 vertebrate species, and found that tissue type and library size selection significantly impact mitochondrial genome sequencing and assembly. Comparison with reference mitochondrial genomes based on short-read sequencing revealed errors, missing sequences, and incomplete genes in the references.

Background: Modern sequencing technologies should make the assembly of the relatively small mitochondrial genomes an easy undertaking. However, few tools exist that address mitochondrial assembly directly.

Results: As part of the Vertebrate Genomes Project (VGP), we develop mitoVGP, a fully automated pipeline for similarity-based identification of mitochondrial reads and de novo assembly of mitochondrial genomes that incorporates both long (> 10 kbp, PacBio or Nanopore) and short (100-300 bp, Illumina) reads. Our pipeline leads to successful complete mitogenome assemblies of 100 vertebrate species of the VGP. We observe that tissue type and library size selection have considerable impact on mitogenome sequencing and assembly. Comparing our assemblies to purportedly complete reference mitogenomes based on short-read sequencing, we identify errors, missing sequences, and incomplete genes in those references, particularly in repetitive regions. Our assemblies also identify novel gene region duplications. The presence of repeats and duplications in over half of the species herein assembled indicates that their occurrence is a principle of mitochondrial structure rather than an exception, shedding new light on mitochondrial genome evolution and organization.

Conclusions: Our results indicate that, even in the simple case of vertebrate mitogenomes, the completeness of many currently available reference sequences can be further improved, and caution should be exercised before claiming the complete assembly of a mitogenome, particularly from short reads alone.
