Article

Construction of cardiovascular information extraction corpus based on electronic medical records

Journal

MATHEMATICAL BIOSCIENCES AND ENGINEERING
Volume 20, Issue 7, Pages 13379-13397

Publisher

AMER INST MATHEMATICAL SCIENCES-AIMS
DOI: 10.3934/mbe.2023596

Keywords

cardiovascular disease; corpus construction; electronic medical record

Cardiovascular disease has a significant impact, making knowledge-based research necessary. However, research on corpus construction for this disease is limited, which hinders such work. This study collected electronic medical record data and developed a standard for labeling entities and entity relations in cardiovascular electronic medical records. The resulting cardiovascular electronic medical record entity and entity relationship labeling corpus (CVDEMRC) showed good annotation consistency and provides a database for information extraction research related to cardiovascular diseases.
Cardiovascular disease has a significant impact on both society and patients, which makes knowledge-based research necessary, for example research that uses knowledge graphs and automated question answering. However, existing research on corpus construction for cardiovascular disease is relatively limited, and this has hindered further knowledge-based research on the disease. Electronic medical records contain patient data that span the entire diagnosis and treatment process and include a large amount of reliable medical information. We therefore collected electronic medical record data related to cardiovascular disease and, drawing on relevant work experience, developed a standard for labeling entities and entity relations in cardiovascular electronic medical records. Using a rule-based semi-automatic method to build a dictionary of sentence-level labeling results, we constructed a cardiovascular electronic medical record entity and entity relationship labeling corpus (CVDEMRC). The CVDEMRC contains 7,691 entities and 11,185 entity relation triples. Consistency examination yielded 93.51% for entity annotations and 84.02% for entity-relation annotations, indicating good annotation consistency. The CVDEMRC is expected to provide a database for information extraction research related to cardiovascular diseases.
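
The abstract does not spell out how the sentence-level labeling result dictionary works, so the following is only a minimal sketch of one plausible reading: sentences that have already been labeled are stored with their entities and relation triples, and an identical sentence seen later is pre-labeled from the dictionary, leaving only unseen sentences for manual annotation. All names here (`SentenceLabelDictionary`, `make_entity`, the `Symptom`/`Disease` labels and the `has_symptom` relation) are hypothetical and are not taken from the CVDEMRC labeling standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Entity:
    start: int   # character offset (inclusive) within the stripped sentence
    end: int     # character offset (exclusive)
    label: str   # hypothetical entity type, e.g. "Symptom"


@dataclass
class SentenceAnnotation:
    entities: List[Entity] = field(default_factory=list)
    # Each triple is (head entity index, relation label, tail entity index).
    relations: List[Tuple[int, str, int]] = field(default_factory=list)


def make_entity(sentence: str, text: str, label: str) -> Entity:
    """Locate the first occurrence of `text` in `sentence` and wrap it as an Entity."""
    start = sentence.find(text)
    if start < 0:
        raise ValueError(f"{text!r} not found in sentence")
    return Entity(start, start + len(text), label)


class SentenceLabelDictionary:
    """Maps an already-labeled sentence to its annotation (exact match only)."""

    def __init__(self) -> None:
        self._store: Dict[str, SentenceAnnotation] = {}

    def add(self, sentence: str, annotation: SentenceAnnotation) -> None:
        # Assumption: surrounding whitespace is not significant, so the
        # stripped sentence is the key and offsets refer to the stripped text.
        self._store[sentence.strip()] = annotation

    def pre_annotate(self, sentence: str) -> Optional[SentenceAnnotation]:
        # Returns the stored annotation, or None if the sentence still
        # needs manual labeling.
        return self._store.get(sentence.strip())


if __name__ == "__main__":
    dictionary = SentenceLabelDictionary()
    seen = "Coronary heart disease with chest pain for 3 days."
    annotation = SentenceAnnotation(
        entities=[
            make_entity(seen, "Coronary heart disease", "Disease"),
            make_entity(seen, "chest pain", "Symptom"),
        ],
        relations=[(0, "has_symptom", 1)],
    )
    dictionary.add(seen, annotation)

    for sentence in [seen, "Blood pressure was stable on admission."]:
        hit = dictionary.pre_annotate(sentence)
        status = "pre-labeled" if hit is not None else "needs manual labeling"
        print(f"{status}: {sentence}")
```

Exact-match lookup is the simplest possible rule; a real pipeline would likely add further rules, and human review of the pre-labeled output, before sentences enter the corpus.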
