Article

Clustering by periodontitis-associated factors: A novel application to NHANES data

Journal

JOURNAL OF PERIODONTOLOGY
Volume 92, Issue 8, Pages 1136-1150

Publisher

WILEY
DOI: 10.1002/JPER.20-0489

Keywords

chronic periodontitis; cluster analysis; dental health surveys; knowledge discovery; patient reported outcome measures

An unsupervised clustering method was employed to identify unique subgroups at high risk for periodontitis using NHANES data. The study detected characteristics statistically related to periodontitis status, highlighting subpopulations at risk without costly clinical examinations.
Background Unsupervised clustering is a method used to identify heterogeneity among groups and homogeneity within a group of patients. Without a prespecified outcome entry, the resulting model deciphers patterns that may not be disclosed using traditional methods. This is the first application of such clustering analysis to identify unique subgroups at high risk for periodontitis in the National Health and Nutrition Examination Survey (NHANES) 2009 to 2014 data sets, using >500 variables.
Methods Questionnaire, examination, and laboratory data (33 tables) for >1,000 variables were merged from 14,072 respondents who underwent clinical periodontal examination. Participants with >= 6 teeth and available data for all selected categories were included (N = 1,222). Data wrangling produced 519 variables. k-means/modes clustering was run for k = 2 to 14. The optimal k-value was determined through the elbow method, with the formula $\sum_{i} x_i^2 - \left(\sum_{i} x_i\right)^2 / n$. The 5-cluster model showing the highest variability (63.08%) was selected. The 2012 Centers for Disease Control and Prevention/American Academy of Periodontology (AAP) and 2018 European Federation of Periodontology/AAP periodontitis case definitions were applied.
Results Cluster 1 (n = 249) showed the highest prevalence of severe periodontitis (43%); 39% self-reported fair general health; 55% had household income <$35,000/year; and 48% were current smokers. Cluster 2 (n = 154) had one participant with periodontitis. Cluster 3 (n = 242) represented the greatest prevalence of moderate periodontitis (53%). In Cluster 4 (n = 35) only one participant had no periodontitis. Cluster 5 (n = 542) was the systemically healthiest, with 77% having no/mild periodontitis.
Conclusion Clustering of NHANES demographic, systemic health, and socioeconomic data effectively identifies characteristics that are statistically significantly related to periodontitis status and hence detects subpopulations at high risk for periodontitis without costly clinical examinations.
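
The elbow method named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes purely numeric, synthetic data and a plain Lloyd's k-means (the study used k-means/modes on mixed NHANES variables), and the helper names sum_of_squares, within_cluster_ss, and kmeans are hypothetical. The sum-of-squares shortcut from the abstract is applied per feature within each cluster; the printed percentage is one common reading of "variability" (1 minus within-cluster over total sum of squares), which the abstract does not define.

```python
import numpy as np


def sum_of_squares(x):
    """Sum of squared deviations via the shortcut sum(x^2) - (sum(x))^2 / n."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** 2) - np.sum(x) ** 2 / x.size)


def within_cluster_ss(X, labels):
    """Total within-cluster sum of squares, added over clusters and features."""
    total = 0.0
    for c in np.unique(labels):
        members = X[labels == c]
        for j in range(X.shape[1]):
            total += sum_of_squares(members[:, j])
    return total


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means on numeric data (illustrative helper only)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct observations chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels


# Synthetic stand-in data: three Gaussian blobs (an assumption, not NHANES).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc=m, scale=0.5, size=(50, 2)) for m in (0.0, 5.0, 10.0)])

# Total sum of squares is the within-cluster value when all points share one cluster.
total_ss = within_cluster_ss(X, np.zeros(len(X), dtype=int))

# Scan k = 2..14 as in the abstract; the "elbow" is where the drop in WSS flattens.
for k in range(2, 15):
    wss = within_cluster_ss(X, kmeans(X, k))
    print(k, round(wss, 2), round(100 * (1 - wss / total_ss), 2))
```

In practice the elbow is read from a plot of the within-cluster sum of squares against k; the loop above prints the same quantities as a table.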
