Article

Predicting suicide attempt or suicide death following a visit to psychiatric specialty care: A machine learning study using Swedish national registry data

Journal

PLOS MEDICINE
Volume 17, Issue 11

Publisher

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pmed.1003416

Funding

  1. Wellcome Trust [202836/Z/16/Z]
  2. European Union's Seventh Framework Programme for research, technological development and demonstration [602805]
  3. European Union's Horizon 2020 research and innovation programme [667302]

Abstract

Background

Suicide is a major public health concern globally. Accurately predicting suicidal behavior remains challenging. This study aimed to use machine learning approaches to examine the potential of the Swedish national registry data for prediction of suicidal behavior.

Methods and findings

The study sample consisted of 541,300 inpatient and outpatient visits by 126,205 Sweden-born patients (54% female and 46% male) aged 18 to 39 years (mean age at the visit: 27.3 years) to psychiatric specialty care in Sweden between January 1, 2011 and December 31, 2012. The most common psychiatric diagnoses at the visit were anxiety disorders (20.0%), major depressive disorder (16.9%), and substance use disorders (13.6%). A total of 425 candidate predictors covering demographic characteristics, socioeconomic status (SES), electronic medical records, criminality, as well as family history of disease and crime were extracted from the Swedish registry data. The sample was randomly split into an 80% training set containing 433,024 visits and a 20% test set containing 108,276 visits. Models were trained separately for suicide attempt/death within 90 and 30 days following a visit using multiple machine learning algorithms. Model discrimination and calibration were both evaluated.

Among all eligible visits, 3.5% (18,682) were followed by a suicide attempt/death within 90 days and 1.7% (9,099) within 30 days. The final models were based on ensemble learning that combined predictions from elastic net penalized logistic regression, random forest, gradient boosting, and a neural network. The areas under the receiver operating characteristic (ROC) curve (AUCs) on the test set were 0.88 (95% confidence interval [CI] = 0.87-0.89) and 0.89 (95% CI = 0.88-0.90) for the outcome within 90 days and 30 days, respectively, both being significantly better than chance (i.e., AUC = 0.50) (p < 0.01). Sensitivity, specificity, and predictive values were reported at different risk thresholds. A limitation of our study is that our models have not yet been externally validated, and thus, the generalizability of the models to other populations remains unknown.

Conclusions

By combining the ensemble method of multiple machine learning algorithms and high-quality data solely from the Swedish registers, we developed prognostic models to predict short-term suicide attempt/death with good discrimination and calibration. Whether novel predictors can improve predictive performance requires further investigation.

Author summary

Why was this study done?

Suicidal behavior is overrepresented in people with mental illness and contributes to the substantial public health burden of psychiatric conditions. Accurately predicting suicidal behavior has long been challenging. The potential of applying machine learning to linked national datasets to predict suicidal behavior remains unknown.

What did the researchers do and find?

We identified a sample of 541,300 inpatient and outpatient visits to psychiatric specialty care in Sweden during 2011 and 2012. The sample was then divided into a training dataset and a test dataset. We first trained prediction models separately for suicide attempt/death within 90 days and 30 days following a visit to psychiatric specialty care, using 4 different machine learning algorithms. We then used an ensemble method to combine the trained models, with the intention of achieving an overall performance superior to that of each individual model. The final model based on the ensemble method achieved the best predictive performance. This model was applied to the test dataset and showed good model discrimination and calibration for both the 90-day and 30-day outcomes.

What do these findings mean?

Our findings suggest that combining machine learning with registry data has the potential to accurately predict short-term suicidal behavior. An approach combining 4 machine learning methods showed an overall predictive performance slightly better than that of each individual model.
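The abstract reports discrimination as an AUC and gives sensitivity, specificity, and predictive values at different risk thresholds. As a rough illustration of how such quantities can be computed from predicted risks, here is a minimal sketch in plain Python. The function names `auc` and `threshold_metrics` and the toy data are hypothetical and do not come from the study. The AUC here uses the pairwise (Mann-Whitney) definition, which may differ from the authors' actual procedure.

```python
from typing import Dict, Sequence


def auc(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """Probability that a randomly chosen positive visit receives a higher
    predicted risk than a randomly chosen negative visit (ties count half)."""
    positives = [r for r, y in zip(risks, outcomes) if y == 1]
    negatives = [r for r, y in zip(risks, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def threshold_metrics(
    risks: Sequence[float], outcomes: Sequence[int], threshold: float
) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV when every visit with
    predicted risk >= threshold is flagged as high risk."""
    tp = fp = tn = fn = 0
    for risk, outcome in zip(risks, outcomes):
        flagged = risk >= threshold
        if flagged and outcome == 1:
            tp += 1
        elif flagged:
            fp += 1
        elif outcome == 1:
            fn += 1
        else:
            tn += 1

    def ratio(numerator: int, denominator: int) -> float:
        # An undefined ratio (empty denominator) is reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy predicted risks and outcomes (hypothetical, not study data).
toy_risks = [0.02, 0.05, 0.10, 0.20, 0.40, 0.60, 0.80, 0.90]
toy_outcomes = [0, 0, 0, 1, 0, 1, 0, 1]

toy_auc = auc(toy_risks, toy_outcomes)
toy_metrics = threshold_metrics(toy_risks, toy_outcomes, threshold=0.5)
```

Raising the threshold generally trades sensitivity for specificity, which is why reporting these metrics at several thresholds is informative when the outcome is rare, as here (3.5% within 90 days).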
