Article

A corpus-driven standardization framework for encoding clinical problems with HL7 FHIR

JOURNAL OF BIOMEDICAL INFORMATICS (2020)

Journal

JOURNAL OF BIOMEDICAL INFORMATICS

Volume 110, Issue -, Pages -

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE

DOI: 10.1016/j.jbi.2020.103541

Keywords

Health Information Interoperability (D000073892); Deep Learning (D000077321); Systematized Nomenclature of Medicine (D039061); Semantics (D012660); Natural Language Processing (D009323)

Categories

Computer Science, Interdisciplinary Applications; Medical Informatics

Funding

[NCATS U01TR02062]

Abstract

Free-text problem descriptions are brief explanations of patient diagnoses and issues, commonly found in problem lists and other prominent areas of the medical record. These compact representations often express complex and nuanced medical conditions, making their semantics challenging to fully capture and standardize. In this study, we describe a framework for transforming free-text problem descriptions into standardized Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) models. This approach leverages a combination of domain-specific dependency parsers, Bidirectional Encoder Representations from Transformers (BERT) natural language models, and cui2vec Unified Medical Language System (UMLS) concept vectors to align extracted concepts from free-text problem descriptions into structured FHIR models. A neural network classification model is used to classify thirteen relationship types between concepts, facilitating mapping to the FHIR Condition resource. We use data programming, a weak supervision approach, to eliminate the need for a manually annotated training corpus. Shapley values, a mechanism to quantify contribution, are used to interpret the impact of model features. We found that our methods identified the focus concept, or primary clinical concern of the problem description, with an F-1 score of 0.95. Relationships from the focus to other modifying concepts were extracted with an F-1 score of 0.90. When classifying relationships, our model achieved a 0.89 weighted average F-1 score, enabling accurate mapping of attributes into HL7 FHIR models. We also found that the BERT input representation predominantly contributed to the classifier decision as shown by the Shapley values analysis.
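To make the final mapping step concrete, here is a minimal sketch of how an extracted focus concept and its classified modifiers might be assembled into a FHIR Condition resource. It is not the authors' implementation. The relationship labels (`body_site`, `severity`), the function names, the placeholder code system, and the example codes are all assumptions for illustration; the abstract does not list the thirteen relationship types or the exact attribute mapping. Only the Condition element names (`code`, `bodySite`, `severity`, `subject`) come from the FHIR specification.

```python
import json

# Placeholder code system. A real pipeline would presumably use a standard
# terminology URI (the record's keywords mention SNOMED); this is an assumption.
PLACEHOLDER_SYSTEM = "http://example.org/placeholder-codes"

# Hypothetical relationship labels mapped to FHIR Condition elements.
# The paper's thirteen relationship types are not listed in the abstract.
RELATION_TO_FIELD = {
    "body_site": "bodySite",  # Condition.bodySite is a list of CodeableConcepts
    "severity": "severity",   # Condition.severity is a single CodeableConcept
}


def codeable_concept(code, display):
    """Build a FHIR CodeableConcept with one coding."""
    return {
        "coding": [{"system": PLACEHOLDER_SYSTEM, "code": code, "display": display}],
        "text": display,
    }


def to_condition(problem_text, focus, modifiers, patient_ref):
    """Assemble a Condition dict from a focus concept and related modifiers.

    focus: (code, display) for the primary clinical concern.
    modifiers: iterable of (relation_label, code, display).
    """
    code = codeable_concept(*focus)
    code["text"] = problem_text  # keep the original free text alongside the coding
    condition = {
        "resourceType": "Condition",
        "subject": {"reference": patient_ref},
        "code": code,
    }
    for relation, mod_code, display in modifiers:
        field = RELATION_TO_FIELD.get(relation)
        if field is None:
            continue  # relation types this sketch does not model are skipped
        concept = codeable_concept(mod_code, display)
        if field == "bodySite":
            condition.setdefault("bodySite", []).append(concept)
        else:
            condition[field] = concept
    return condition


if __name__ == "__main__":
    example = to_condition(
        "severe fracture of left femur",
        focus=("X-FOCUS", "Fracture"),           # placeholder code
        modifiers=[
            ("body_site", "X-SITE", "Femur"),    # placeholder code
            ("severity", "X-SEV", "Severe"),     # placeholder code
        ],
        patient_ref="Patient/example",
    )
    print(json.dumps(example, indent=2))
```

In this sketch the focus concept becomes `Condition.code`, and each classified relationship selects the element that receives the modifier, which mirrors the role the abstract assigns to the relationship classifier.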
