4.5 Article

A De-identification Method for Bilingual Clinical Texts of Various Note Types

Journal

JOURNAL OF KOREAN MEDICAL SCIENCE
Volume 30, Issue 1, Pages 7-15

Publisher

KOREAN ACAD MEDICAL SCIENCES
DOI: 10.3346/jkms.2015.30.1.7

Keywords

De-identification; Anonymization; Clinical Text; Bilingual Text; Patient Privacy; Medical Informatics; Text Mining

Funding

  1. Asan Institute for Life Sciences, Seoul, Korea [2013-7205]

De-identification of personal health information is essential if clinical records are to be used without written informed consent from patients. Previous de-identification methods used natural language processing to remove identifiers from clinical narrative text, but they focused only on text written in English. In this study, we propose a regular expression-based de-identification method for bilingual clinical records written in Korean and English. To develop and validate the regular expression rules, we obtained a development dataset of 6,039 clinical notes of 20 types and a validation dataset of 5,000 notes of 33 types. Fifteen regular expression rules were constructed using the development dataset, and these rules achieved 99.87% precision and 96.25% recall on the validation dataset. Our de-identification method successfully removed identifiers from diverse types of bilingual clinical narrative text and should thus help physicians perform retrospective research more easily.
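
To make the approach concrete, the following is a minimal sketch of rule-based masking and of how precision and recall could be computed against hand-annotated identifiers. The four rules, the sample note, and the names `RULES`, `deidentify`, and `precision_recall` are assumptions made for illustration; they are not the fifteen rules reported in the paper, which are not given in the abstract.

```python
import re

# Hypothetical rules for illustration only; NOT the paper's fifteen rules.
# Rules are applied in order, so more specific patterns come first.
RULES = [
    # Korean resident registration number: 6 digits, hyphen, 7 digits.
    ("RRN", re.compile(r"\d{6}-\d{7}")),
    # Korean mobile number such as 010-1234-5678.
    ("PHONE", re.compile(r"01[016789]-\d{3,4}-\d{4}")),
    # Date such as 2013-07-21 or 2013.7.21.
    ("DATE", re.compile(r"\d{4}[-./]\d{1,2}[-./]\d{1,2}")),
    # Korean name followed by a title (교수, 선생님), or an English name after "Dr. ".
    ("NAME", re.compile(r"[가-힣]{2,4}(?= ?(?:교수|선생님))|(?<=Dr\. )[A-Z][a-z]+")),
]


def deidentify(text):
    """Mask every rule match with a [LABEL] tag and return (masked_text, found)."""
    found = set()
    for label, pattern in RULES:
        def mask(match, label=label):
            # Record what was removed so it can be compared with annotations.
            found.add((label, match.group()))
            return f"[{label}]"
        text = pattern.sub(mask, text)
    return text, found


def precision_recall(predicted, gold):
    """Precision and recall of predicted identifiers against gold annotations."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Invented sample note mixing Korean and English; all identifiers are fake.
SAMPLE = ("환자 진료일 2013-07-21, 주민번호 850101-1234567, 보호자 박영희. "
          "Seen by Dr. Kim, 연락처 010-1234-5678.")

# Hand-written gold annotations for the sample note.
GOLD = {
    ("DATE", "2013-07-21"),
    ("RRN", "850101-1234567"),
    ("PHONE", "010-1234-5678"),
    ("NAME", "Kim"),
    ("NAME", "박영희"),
}

masked, found = deidentify(SAMPLE)
precision, recall = precision_recall(found, GOLD)
print(masked)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this sketch the guardian's name carries no title, so none of the hypothetical rules matches it and it stays in the masked text. That is the kind of miss that lowers recall while leaving precision untouched.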
