Article

Evaluation of an ensemble-based distance statistic for clustering MLST datasets using epidemiologically defined clusters of cyclosporiasis

Journal

EPIDEMIOLOGY AND INFECTION
Volume 148

Publisher

CAMBRIDGE UNIV PRESS
DOI: 10.1017/S0950268820001697

Keywords

Cyclospora cayetanensis; clustering; cyclosporiasis; deep sequencing; distance-statistic; genotype; genotyping; machine learning; MLST

Funding

  1. Centers for Disease Control and Prevention Office of Advanced Molecular Detection

Outbreaks of cyclosporiasis, a food-borne illness caused by the coccidian parasite Cyclospora cayetanensis, have increased in the USA in recent years, with approximately 2300 laboratory-confirmed cases reported in 2018. Genotyping tools are needed to inform epidemiological investigations, yet genotyping Cyclospora has proven challenging due to its sexual reproductive cycle, which produces complex infections characterized by high genetic heterogeneity. We used targeted amplicon deep sequencing and a recently described ensemble-based distance statistic that accommodates heterogeneous (mixed) genotypes and specimens with partial genotyping data to genotype and cluster 648 C. cayetanensis samples submitted to CDC in 2018. The performance of the ensemble was assessed by comparing ensemble-identified genetic clusters to analogous clusters identified independently based on common food exposures. Using these epidemiologic clusters as a gold standard, the ensemble facilitated genetic clustering with 93.8% sensitivity and 99.7% specificity. Hence, we anticipate that this procedure will greatly complement epidemiologic investigations of cyclosporiasis.
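
For readers unfamiliar with the two performance measures quoted in the abstract, the following is a minimal sketch of the standard definitions of sensitivity and specificity when one set of cluster assignments is scored against a gold standard. It is not the authors' evaluation code, and the abstract does not describe how their counts were tallied. The function name, the per-specimen boolean encoding, and the toy data are all assumptions made for illustration; the numbers are not taken from the study.

```python
# Illustrative sketch only: standard sensitivity/specificity definitions.
# The data layout (one boolean per specimen) is an assumption, not the
# paper's documented procedure.

def sensitivity_specificity(in_epi_cluster, in_genetic_cluster):
    """Score genetic cluster membership against an epidemiologic gold standard.

    Both arguments are equal-length sequences of booleans, one per specimen:
    whether the specimen belongs to the epidemiologic cluster (gold standard)
    and whether it was placed in the matching genetic cluster.
    """
    if len(in_epi_cluster) != len(in_genetic_cluster):
        raise ValueError("both sequences must describe the same specimens")

    pairs = list(zip(in_epi_cluster, in_genetic_cluster))
    tp = sum(1 for epi, gen in pairs if epi and gen)          # true positives
    fn = sum(1 for epi, gen in pairs if epi and not gen)      # false negatives
    tn = sum(1 for epi, gen in pairs if not epi and not gen)  # true negatives
    fp = sum(1 for epi, gen in pairs if not epi and gen)      # false positives

    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data for ten specimens (not from the study).
epi = [True, True, True, True, False, False, False, False, False, False]
gen = [True, True, True, False, False, False, False, False, False, True]

sens, spec = sensitivity_specificity(epi, gen)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In this toy case, three of the four gold-standard cluster members are recovered and five of the six non-members are correctly excluded.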
