Journal
KNOWLEDGE-BASED SYSTEMS
Volume 282, Issue -, Pages -
Publisher
ELSEVIER
DOI: 10.1016/j.knosys.2023.111113
Keywords
Automatic ICD coding; Multi-label text classification; Graph Convolutional Networks; Transformer-based models; Pseudo labeling attention
Automatic International Classification of Diseases (ICD) coding classifies diseases with a computer program according to rules of etiology and clinical presentation. This paper proposes an approach called TF-GCN, which combines transformer-based feature extraction with graph convolutional modeling of the relationships between features to improve the accuracy of automatic ICD coding.
Automatic International Classification of Diseases (ICD) coding uses a computer program to classify diseases according to rules of etiology and clinical presentation and to represent them as codes, which are widely used to support medical reimbursement and the reporting of patient health status. With the application of machine learning and deep learning, the accuracy of automatic ICD coding methods has improved considerably. However, these gains have come with problems such as insufficient pre-training of the text in the models and computational complexity that grows as prediction accuracy improves. In this work we propose an approach called TF-GCN to counter these problems. First, a transformer-based model extracts features from both the clinical records and the ICD codes, yielding a more accurate and concise feature representation. Second, the node features, the document features, and the relationships between them in the clinical records are fed to a graph convolutional network (GCN) for training. Next, a pseudo labeling attention mechanism is added to eliminate the noise generated during feature extraction. Finally, the features of the clinical records are compared with the features of the ICD codes by similarity to obtain the classification results. This design not only reduces computational redundancy but also yields more accurate classification features. On the real-world MIMIC-III dataset, we compare the proposed algorithm with 11 automatic ICD coding methods to validate the performance of TF-GCN. According to the experimental findings, the proposed approach outperforms the compared methods on the standard evaluation metrics Mif (0.589), MiAUC (0.989), and P@8 (0.758).
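The last step of the pipeline, comparing clinical-record features with ICD-code features by similarity, together with the P@k metric reported above, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the similarity function, so cosine similarity is assumed here, and the function names (`cosine`, `rank_codes`, `precision_at_k`) and the three-dimensional toy vectors are hypothetical stand-ins for the features TF-GCN would learn.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_codes(doc_vec, code_vecs):
    """Return ICD codes sorted by descending similarity to the document vector."""
    scores = {code: cosine(doc_vec, vec) for code, vec in code_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


def precision_at_k(ranked, gold, k):
    """Fraction of the top-k ranked codes that appear in the gold label set."""
    return sum(1 for code in ranked[:k] if code in gold) / k


# Toy, made-up feature vectors; real features would come from the trained model.
code_vecs = {
    "401.9": [1.0, 0.0, 0.0],
    "428.0": [0.0, 1.0, 0.0],
    "250.00": [0.0, 0.0, 1.0],
}
doc_vec = [0.9, 0.4, 0.1]

ranked = rank_codes(doc_vec, code_vecs)
print(ranked)                                         # codes, most similar first
print(precision_at_k(ranked, {"401.9", "250.00"}, k=2))
```

In this toy case the document vector is closest to the first code, and only one of its two gold codes lands in the top two, so P@2 is 0.5. P@8 in the abstract is the same quantity computed over the eight highest-scoring codes per record and averaged over the test set.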