Combining transformer-based model and GCN to predict ICD codes from clinical records

KNOWLEDGE-BASED SYSTEMS (2023)

Journal

KNOWLEDGE-BASED SYSTEMS

Volume 282, Issue -, Pages -

Publisher

ELSEVIER

DOI: 10.1016/j.knosys.2023.111113

Keywords

Automatic ICD coding; Multi-label text classification; Graph Convolutional Networks; Transformer-based models; Pseudo labeling attention

Automated Summary

Automatic International Classification of Diseases (ICD) coding is a method of classifying diseases through a computer program based on etiology and clinical presentation rules. This paper proposes an approach called TF-GCN to improve the accuracy of automatic ICD coding by feature extraction and relationship analysis.

Abstract

Automatic International Classification of Diseases (ICD) coding classifies diseases with a computer program, based on rules of etiology and clinical presentation, and represents them as codes. These codes are widely used to assist in medical reimbursement and in reporting patient health status. With the application of machine learning and deep learning, the accuracy of automatic ICD coding methods has improved considerably. However, this improvement has been accompanied by problems such as insufficient pre-training on text in the models and computational complexity that increases along with prediction accuracy. In this work we propose an approach called TF-GCN to address these problems. Firstly, a more accurate and concise feature representation is obtained by extracting features from both the clinical records and the ICD codes with a transformer-based model. Secondly, the node features, the document features, and the relationships between them in the clinical records are input to a GCN for training. Next, a pseudo labeling attention mechanism is added to eliminate the noise generated during feature extraction. Finally, the features of the clinical records are compared with the features of the ICD codes for similarity to obtain the classification results. This not only reduces computational redundancy but also yields more accurate classification features. On the real-world MIMIC-III dataset, we compare the proposed algorithm with 11 automatic ICD coding methods to validate the performance of TF-GCN. According to the experimental results, the proposed method outperforms the compared methods on the standard evaluation metrics Mif (0.589), MiAUC (0.989), and P@8 (0.758).
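The final step described in the abstract (comparing clinical-record features with ICD-code features by similarity) and two of the reported metrics can be illustrated with a minimal sketch. This is not the authors' implementation: cosine similarity is assumed as the similarity measure, the feature vectors are made-up toy values rather than transformer or GCN outputs, Mif is assumed to denote micro-averaged F1, and all function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_codes(doc_vec, code_vecs):
    """Return ICD codes sorted by descending similarity to the document vector."""
    scores = {code: cosine(doc_vec, vec) for code, vec in code_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


def precision_at_k(ranked, gold, k):
    """Fraction of the top-k ranked codes that appear in the gold set (P@k)."""
    return sum(1 for code in ranked[:k] if code in gold) / k


def micro_f1(predicted, gold):
    """Micro-averaged F1 over parallel lists of predicted and gold code sets."""
    tp = sum(len(p & g) for p, g in zip(predicted, gold))
    fp = sum(len(p - g) for p, g in zip(predicted, gold))
    fn = sum(len(g - p) for p, g in zip(predicted, gold))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy, made-up feature vectors for three ICD-9 codes and one clinical record.
code_vecs = {
    "401.9": [1.0, 0.0, 0.0],
    "250.00": [0.0, 1.0, 0.0],
    "486": [0.0, 0.0, 1.0],
}
doc_vec = [0.9, 0.4, 0.1]
gold = {"401.9", "250.00"}

ranked = rank_codes(doc_vec, code_vecs)
p_at_2 = precision_at_k(ranked, gold, 2)
# Predict only the single best code to show a case with one false negative.
f1 = micro_f1([set(ranked[:1])], [gold])
```

In this toy setup the record is scored against every code vector once, so the classification itself reduces to a ranking over similarities; P@8 in the abstract corresponds to `precision_at_k` with `k=8` on the full code set.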
