Article

Using Machine Learning Techniques and National Tuberculosis Surveillance Data to Predict Excess Growth in Genotyped Tuberculosis Clusters

Journal

AMERICAN JOURNAL OF EPIDEMIOLOGY
Volume 191, Issue 11, Pages 1936-1943

Publisher

OXFORD UNIV PRESS INC
DOI: 10.1093/aje/kwac117

Keywords

cluster growth; machine learning; surveillance data; tuberculosis

Funding

  1. CDC's Division of Tuberculosis Elimination

This study demonstrates the use of surveillance data, statistical definitions, and machine learning to predict clusters of tuberculosis cases that are likely to grow and become outbreaks, providing an opportunity for intervention and prevention.
The early identification of clusters of persons with tuberculosis (TB) that will grow to become outbreaks creates an opportunity for intervention in preventing future TB cases. We used surveillance data (2009-2018) from the United States, statistically derived definitions of unexpected growth, and machine-learning techniques to predict which clusters of genotype-matched TB cases are most likely to continue accumulating cases above expected growth within a 1-year follow-up period. Using training and testing data sets, we developed a model to predict which clusters are likely to grow, and the model generalized to a separate validation data set. Our model showed that characteristics of clusters were more important than the social, demographic, and clinical characteristics of the patients in those clusters. For instance, the time between cases before unexpected growth was identified as the most important of our predictors. A faster accumulation of cases increased the probability of excess growth being predicted during the follow-up period. We have demonstrated that combining the characteristics of clusters and cases with machine learning can add to existing tools to help prioritize which clusters may benefit most from public health interventions. For example, consideration of an entire cluster, not only an individual patient, may assist in interrupting ongoing transmission.
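
The abstract does not state how unexpected growth was defined or which learning algorithm was used. As a rough illustration only, the sketch below assumes a simple Poisson tail test as a stand-in for a "statistically derived definition of unexpected growth" and computes a time-between-cases feature of the kind the abstract ranks as the most important predictor. The function names, the Poisson assumption, and the 0.05 threshold are all hypothetical and are not taken from the paper.

```python
import math
from datetime import date


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    # Sum the probability mass for counts 0 .. observed - 1, then take the complement.
    cdf = sum(
        math.exp(-expected) * expected**k / math.factorial(k)
        for k in range(observed)
    )
    return max(0.0, 1.0 - cdf)


def has_excess_growth(observed: int, expected: float, alpha: float = 0.05) -> bool:
    """Flag a cluster whose observed case count is unlikely under the expected count.

    Assumption: a Poisson model and alpha = 0.05; the paper's actual definition may differ.
    """
    return poisson_upper_tail(observed, expected) < alpha


def mean_days_between_cases(case_dates):
    """Average gap in days between consecutive cases, or None with fewer than two cases."""
    ordered = sorted(case_dates)
    if len(ordered) < 2:
        return None
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


# Hypothetical cluster: three genotype-matched cases reported in early 2018.
cluster_dates = [date(2018, 1, 1), date(2018, 1, 31), date(2018, 3, 2)]
gap_feature = mean_days_between_cases(cluster_dates)
flagged = has_excess_growth(observed=6, expected=1.5)
```

In a fuller pipeline, a feature such as `gap_feature` would be one input among cluster-level and patient-level characteristics passed to a classifier, with `flagged` playing the role of the outcome label for the follow-up period.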
