Article

Highly accurate classification of chest radiographic reports using a deep learning natural language model pre-trained on 3.8 million text reports

BIOINFORMATICS (2020)

Journal

BIOINFORMATICS

Volume 36, Issue 21, Pages 5255-5261

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/bioinformatics/btaa668

Categories

Biochemical Research Methods; Biotechnology & Applied Microbiology; Computer Science, Interdisciplinary Applications; Mathematical & Computational Biology; Statistics & Probability

Funding

Charité - Universitätsmedizin Berlin
Berlin Institute of Health
Deutsche Forschungsgemeinschaft (DFG) [SFB 1340/1 2018, 5943/31/41/91]

Abstract

Motivation: The development of deep, bidirectional transformers such as Bidirectional Encoder Representations from Transformers (BERT) has led to models that outperform previous approaches on several Natural Language Processing (NLP) benchmarks. In radiology in particular, large amounts of free-text data are generated in the daily clinical workflow. These report texts could be of particular use for generating labels for machine learning, especially for image classification. However, as report texts are mostly unstructured, advanced NLP methods are needed to enable accurate text classification. While neural networks can be used for this purpose, they must first be trained on large amounts of manually labelled data to achieve good results. In contrast, BERT models can be pre-trained on unlabelled data and then require only fine-tuning on a small amount of manually labelled data to achieve even better results.

Results: Using BERT to identify the most important findings in intensive care chest radiograph reports, we achieve areas under the receiver operating characteristic curve of 0.98 for congestion, 0.97 for effusion, 0.97 for consolidation and 0.99 for pneumothorax, surpassing the accuracy of previous approaches with comparatively little annotation effort. Our approach could therefore help to improve information extraction from free-text medical reports.
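The results above are reported as one area under the ROC curve (AUC) per finding. As a rough illustration of what that metric measures, the following minimal sketch computes a per-finding AUC from binary labels and classifier scores using the pairwise ranking definition. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the paper; the authors' actual evaluation code is not shown in this listing.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Equivalent to the Mann-Whitney U statistic divided by the number of pairs.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A pair counts 1 if the positive scores higher, 0.5 if the scores tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data for two findings (illustration only, not study data).
toy_labels = {
    "effusion": [0, 0, 1, 1],
    "pneumothorax": [0, 1, 0, 1],
}
toy_scores = {
    "effusion": [0.1, 0.4, 0.35, 0.8],
    "pneumothorax": [0.2, 0.9, 0.3, 0.7],
}

# One AUC per finding, mirroring how the abstract reports its results.
auc_by_finding = {
    name: roc_auc(toy_labels[name], toy_scores[name]) for name in toy_labels
}
```

An AUC of 1.0 means every positive report is scored above every negative one, while 0.5 corresponds to random ranking. The pairwise loop is quadratic in the number of reports, so for large test sets a sort-based implementation would be the more practical choice.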
