4.7 Article

Natural language processing to predict isocitrate dehydrogenase genotype in diffuse glioma using MR radiology reports

Journal

EUROPEAN RADIOLOGY
Volume -, Issue -, Pages -

Publisher

SPRINGER
DOI: 10.1007/s00330-023-10061-z

Keywords

Natural language processing; Isocitrate dehydrogenase; Glioma

This study evaluated NLP models for predicting IDH mutation status in diffuse glioma from routine MR radiology reports. In external validation, BERT GCN showed the highest performance, exceeding LSTM and BioBERT as well as the two human readers.

Objectives: To evaluate the performance of natural language processing (NLP) models to predict isocitrate dehydrogenase (IDH) mutation status in diffuse glioma using routine MR radiology reports.

Materials and methods: This retrospective, multi-center study included consecutive patients with diffuse glioma with known IDH mutation status from May 2009 to November 2021 whose initial MR radiology report was available prior to pathologic diagnosis. Five NLP models (long short-term memory [LSTM], bidirectional LSTM, bidirectional encoder representations from transformers [BERT], BERT graph convolutional network [GCN], BioBERT) were trained, and the area under the receiver operating characteristic curve (AUC) was assessed to validate prediction of IDH mutation status in the internal and external validation sets. The performance of the best-performing NLP model was compared with that of human readers.

Results: A total of 1427 patients (mean age ± standard deviation, 54 ± 15; 779 men, 54.6%) were included, with 720 patients in the training set, 180 patients in the internal validation set, and 527 patients in the external validation set. In the external validation set, BERT GCN showed the highest performance (AUC 0.85, 95% CI 0.81-0.89) in predicting IDH mutation status, which was higher than LSTM (AUC 0.77, 95% CI 0.72-0.81; p = .003) and BioBERT (AUC 0.81, 95% CI 0.76-0.85; p = .03). This was also higher than that of a neuroradiologist (AUC 0.80, 95% CI 0.76-0.84; p = .005) and a neurosurgeon (AUC 0.79, 95% CI 0.76-0.84; p = .04).

Conclusion: BERT GCN was externally validated to predict IDH mutation status in patients with diffuse glioma using routine MR radiology reports, with superior or at least comparable performance to human readers.

Clinical relevance statement: Natural language processing may be used to extract relevant information from routine radiology reports to predict cancer genotype and provide prognostic information that may aid in guiding treatment strategy and enabling personalized medicine.

Key Points
• A transformer-based natural language processing (NLP) model predicted isocitrate dehydrogenase mutation status in diffuse glioma with an AUC of 0.85 in the external validation set.
• The best NLP models were superior or at least comparable to human readers in both internal and external validation sets.
• Transformer-based models showed higher performance than a conventional NLP model such as long short-term memory.
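
The results above are reported as an AUC with a 95% confidence interval. As a minimal sketch of how such a figure can be computed, the Python below calculates the AUC as the fraction of positive/negative pairs that are ranked correctly and estimates an interval with a percentile bootstrap. The abstract does not say how the intervals were obtained, so the bootstrap is an assumption here, and the function names (`auc`, `bootstrap_ci`), the labels, and the scores are all hypothetical and synthetic, not study data.

```python
import random


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic example: 1 = IDH-mutant, 0 = IDH-wildtype (made-up labels and scores).
rng = random.Random(42)
labels = [1 if rng.random() < 0.3 else 0 for _ in range(120)]
scores = [rng.gauss(0.6 if y else 0.4, 0.15) for y in labels]

point = auc(labels, scores)
low, high = bootstrap_ci(labels, scores, n_boot=500)
print(f"AUC {point:.2f}, 95% CI {low:.2f}-{high:.2f}")
```

Comparing two models or a model against a human reader on the same patients, as in the p values above, would need a paired test, which this sketch does not cover.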
