Article

Leveraging Natural Language Processing to Improve Electronic Health Record Suicide Risk Prediction for Veterans Health Administration Users

Journal

JOURNAL OF CLINICAL PSYCHIATRY
Volume 84, Issue 4

Publisher

PHYSICIANS POSTGRADUATE PRESS
DOI: 10.4088/JCP.22m14568

This study examined the benefits of including unstructured electronic health record (EHR) data in predicting suicide risk. The results showed that predictive models supplemented with natural language processing (NLP) provided considerable improvement in predictive accuracy compared to conventional structured EHR models.
Background: Suicide risk prediction models frequently rely on structured electronic health record (EHR) data, including patient demographics and health care usage variables. Unstructured EHR data, such as clinical notes, may improve predictive accuracy by allowing access to detailed information that does not exist in structured data fields. To assess comparative benefits of including unstructured data, we developed a large case-control dataset matched on a state-of-the-art structured EHR suicide risk algorithm, utilized natural language processing (NLP) to derive a clinical note predictive model, and evaluated to what extent this model provided predictive accuracy over and above existing predictive thresholds.
Methods: We developed a matched case-control sample of Veterans Health Administration (VHA) patients in 2017 and 2018. Each case (all patients that died by suicide in that interval, n = 4,584) was matched with 5 controls (patients who remained alive during treatment year) who shared the same suicide risk percentile. All sample EHR notes were selected and abstracted using NLP methods. We applied machine-learning classification algorithms to NLP output to develop predictive models. We calculated area under the curve (AUC) and suicide risk concentration to evaluate predictive accuracy overall and for high-risk patients.
Results: The best performing NLP-derived models provided 19% overall additional predictive accuracy (AUC = 0.69; 95% CI, 0.67, 0.72) and 6-fold additional risk concentration for patients at the highest risk tier (top 0.1%), relative to the structured EHR model.
Conclusions: The NLP-supplemented predictive models provided considerable benefit when compared to conventional structured EHR models. Results support future structured and unstructured EHR risk model integrations.
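
The two evaluation measures named in the Methods, AUC and risk concentration, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy scores and labels, and the definition of risk concentration used here (share of all cases that fall in the top-scoring tier, divided by the size of that tier as a fraction of the sample) are assumptions made for illustration, and the abstract does not state the exact formula used in the study.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise AUC: probability that a randomly chosen case (label 1)
    scores higher than a randomly chosen control (label 0); ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def risk_concentration(scores: Sequence[float], labels: Sequence[int],
                       top_fraction: float) -> float:
    """Assumed definition (hypothetical, see lead-in): share of all cases
    found in the top-scoring tier, divided by the tier's share of the sample.
    A value of 1.0 means the tier holds no more cases than chance."""
    n = len(scores)
    tier_size = max(1, round(n * top_fraction))
    total_cases = sum(labels)
    if total_cases == 0:
        raise ValueError("need at least one case")
    # Indices sorted from highest to lowest predicted risk.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    cases_in_tier = sum(labels[i] for i in ranked[:tier_size])
    return (cases_in_tier / total_cases) / (tier_size / n)


# Toy data, invented for illustration only (not study data).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]

toy_auc = auc(toy_scores, toy_labels)
toy_concentration = risk_concentration(toy_scores, toy_labels, top_fraction=0.2)
print(f"AUC = {toy_auc:.3f}, risk concentration (top 20%) = {toy_concentration:.2f}")
```

The pairwise loop is quadratic in sample size and is only meant to make the definition visible; on a sample the size of the one described in the Methods, a rank-based implementation from an established library would be the practical choice.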
