Article

Long-read sequencing reveals a 4.4kb tandem repeat region in the mitogenome of Echinococcus granulosus (sensu stricto) genotype G1

Journal

PARASITES & VECTORS
Volume 12

Publisher

BMC
DOI: 10.1186/s13071-019-3492-x

Keywords

Echinococcus; Genotype G1; Complete mitochondrial (mt) genome; Repetitive DNA; PacBio sequencing

Funding

  1. Australian Research Council
  2. Australian National Health and Medical Research Council [GTN1105448]

Background

Echinococcus tapeworms cause a severe helminthic zoonosis called echinococcosis. The genus comprises various species and genotypes, of which E. granulosus (sensu stricto) represents a significant global public health and socioeconomic burden. Mitochondrial (mt) genomes have provided useful genetic markers to explore the nature and extent of genetic diversity within Echinococcus and have underpinned phylogenetic and population structure analyses of this genus. Our recent work indicated a sequence gap (>1 kb) in the mt genomes of E. granulosus genotype G1, which could not be determined by PCR-based Sanger sequencing. The aim of the present study was to define the complete mt genome, irrespective of structural complexities, using a long-read sequencing method.

Methods

We extracted high molecular weight genomic DNA from protoscoleces from a single cyst of E. granulosus genotype G1 from a sheep from Australia using a conventional method and sequenced it using PacBio Sequel (long-read) technology, complemented by BGISEQ-500 short-read sequencing. Sequence data obtained were assembled using a recently-developed workflow.

Results

We assembled a complete mt genome sequence of 17,675 bp, which is >4 kb larger than the complete mt genomes known for E. granulosus genotype G1. This assembly includes a previously-elusive tandem repeat region, which is 4417 bp long and consists of ten near-identical 441-445 bp repeat units, each harbouring a 184 bp non-coding region and adjacent regions. We also identified a short non-coding region of 183 bp, which includes an inverted repeat.

Conclusions

We report what we consider to be the first complete mt genome of E. granulosus genotype G1 and characterise all repeat regions in this genome. The numbers, sizes, sequences and functions of tandem repeat regions remain to be studied in different isolates of genotype G1 and in other genotypes and species. The discovery of such 'new' repeat elements in the mt genome of genotype G1 by PacBio sequencing raises a question about the completeness of some published genomes of taeniid cestodes assembled from conventional or short-read sequence datasets. This study shows that long-read sequencing readily overcomes the challenges of assembling repeat elements to achieve improved genomes.
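
To make the idea of a tandem repeat region concrete, the following is a minimal Python sketch of how the unit length of such a region could be estimated from an assembled sequence by self-comparison. It is not the authors' assembly workflow, which the abstract does not describe in detail. The function names (`self_match_fraction`, `best_period`), the search window of 400-500 bp and the synthetic sequence (ten exact copies of a random 443 bp unit) are all assumptions made for illustration; real repeat units are only near-identical and vary in length, so the match fraction would be high rather than exactly 1.

```python
import random


def self_match_fraction(seq: str, period: int) -> float:
    """Fraction of positions whose base equals the base `period` positions downstream."""
    n = len(seq) - period
    if n <= 0:
        return 0.0
    matches = sum(1 for i in range(n) if seq[i] == seq[i + period])
    return matches / n


def best_period(seq: str, min_period: int, max_period: int) -> tuple[int, float]:
    """Return the smallest period in [min_period, max_period] with the highest self-match fraction."""
    best_p, best_f = min_period, -1.0
    for p in range(min_period, max_period + 1):
        f = self_match_fraction(seq, p)
        if f > best_f:  # strict comparison keeps the smallest period on ties
            best_p, best_f = p, f
    return best_p, best_f


# Synthetic stand-in for a repeat region (assumption: exact copies, fixed unit length).
rng = random.Random(0)
unit = "".join(rng.choice("ACGT") for _ in range(443))
region = unit * 10

period, fraction = best_period(region, 400, 500)
copies = len(region) / period
print(f"estimated unit length: {period} bp, self-match: {fraction:.2f}, copies: {copies:.1f}")
```

On a real assembly, a dedicated tool or an alignment-based comparison would be more appropriate than this position-by-position check, which does not tolerate insertions or deletions between units.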
