Article; Proceedings Paper

Development and Assessment of a Machine Learning Model to Help Predict Survival Among Patients With Oral Squamous Cell Carcinoma

Journal

JAMA OTOLARYNGOLOGY-HEAD & NECK SURGERY
Volume 145, Issue 12, Pages 1115-1120

Publisher

AMER MEDICAL ASSOC
DOI: 10.1001/jamaoto.2019.0981

Importance: Predicting survival of oral squamous cell carcinoma through the use of prediction modeling has been underused, and the development of prediction models would augment clinicians' ability to provide absolute risk estimates for individual patients.

Objectives: To develop a prediction model using machine learning for 5-year overall survival among patients with oral squamous cell carcinoma and compare this model with a prediction model created from the TNM (Tumor, Node, Metastasis) clinical and pathologic stage.

Design, Setting, and Participants: A retrospective cohort study was conducted of 33 065 patients with oral squamous cell carcinoma from the National Cancer Data Base between January 1, 2004, and December 31, 2011. Patients were excluded if the treatment was considered palliative, staging demonstrated T0 or Tis, or survival or staging data were missing. Patient, tumor, treatment, and outcome information were obtained from the National Cancer Data Base. The data were split into a distribution of 80% for training and 20% for testing. The model was created using 2-class decision forest architecture. Permutation feature importance scores were used to determine the variables that were used in the model's prediction and their order of significance. Statistical analysis was conducted from August 1, 2018, to January 10, 2019.

Main Outcomes and Measures: Ability to predict 5-year overall survival assessed through area under the curve, accuracy, precision, and recall.

Results: Among the 33 065 patients in the study, the mean (SD) age was 64.6 (14.0) years, 19 791 were men (59.9%), 13 274 were women (40.1%), and 29 783 (90.1%) were white. At 60 months, there were 16 745 deaths (50.6%). The median time of follow-up was 56.8 months (range, 0-155.6 months). Age, pathologic T stage, positive margins at the time of surgery, lymph node size, and institutional identification were identified among the most significant variables.
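As a rough illustration of the permutation feature importance idea mentioned above, the following Python sketch shuffles one input column at a time and records how much accuracy drops. The toy data, the `toy_model` threshold rule, and the use of accuracy as the score are all assumptions made for illustration; the abstract does not state which score or software the authors used.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which model(row) equals the true label."""
    correct = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return correct / len(rows)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other column untouched.
            column = [row[j] for row in rows]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Invented toy data: feature 0 stands in for age, feature 1 is pure noise.
data_rng = random.Random(1)
rows = [[data_rng.randint(30, 90), data_rng.random()] for _ in range(200)]
labels = [1 if row[0] >= 65 else 0 for row in rows]


def toy_model(row):
    """Hypothetical stand-in for a trained classifier; it only looks at feature 0."""
    return 1 if row[0] >= 65 else 0


scores = permutation_importance(toy_model, rows, labels)
print(scores)
```

In this sketch the noise feature receives an importance of exactly zero because the toy model never reads it, while shuffling the age-like feature lowers accuracy noticeably. Ranking features by these drops gives an ordering of the kind the abstract describes.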
The calculated area under the curve for this machine learning model was 0.80 (95% CI, 0.79-0.81), accuracy was 71%, precision was 71%, and recall was 68%. In comparison, the calculated area under the curve of the TNM staging system was 0.68 (95% CI, 0.67-0.70), accuracy was 65%, precision was 69%, and recall was 52%.

Conclusions and Relevance: Using machine learning algorithms, a prediction model was created based on patient social, demographic, clinical, and pathologic features. The developed prediction model proved to be better than a prediction model that exclusively used TNM pathologic and clinical stage according to all performance metrics. This study highlights the role that machine learning may play in individual patient risk estimation in the era of big data.

This cohort study describes a model using machine learning to help predict 5-year overall survival among patients with oral squamous cell carcinoma (OSCC) and compares this model with a prediction model created from the TNM (Tumor, Node, Metastasis) clinical and pathologic stage.

Question: How can machine learning be used to further our ability to create prediction models for survival of oral cancer?

Findings: In this cohort study of more than 30 000 patients, a prediction model using a variety of patients, tumors, treatment facilities, and treatment types predicted 5-year overall survival with an accuracy of 71%, precision of 71%, and recall of 68%.

Meaning: Novel machine learning forms of analysis may help in the creation of prediction models using large data registries, and the inclusion of several aspects of the health status of a patient with cancer can more accurately predict survival.
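For readers less familiar with the four performance measures reported above, here is a minimal Python sketch of how accuracy, precision, recall, and area under the curve can be computed for a 2-class prediction. The labels and scores are invented toy values, the 0.5 decision threshold is an assumption, and label 1 simply marks the positive class; the abstract does not say which outcome the authors treated as positive.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy example: true outcomes and model scores for eight patients.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)   # share of all predictions that are correct
precision = tp / (tp + fp)           # share of predicted positives that are correct
recall = tp / (tp + fn)              # share of actual positives that are found
auc = roc_auc(y_true, scores)

print(accuracy, precision, recall, auc)  # 0.75 0.75 0.75 0.9375
```

Accuracy, precision, and recall depend on the chosen threshold, whereas the area under the curve summarizes ranking quality across all thresholds, which is one reason the two models in the abstract can differ more on one measure than on another.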
