Article

Repurposing the Clinical Record: Can an Existing Natural Language Processing System De-identify Clinical Notes?

Journal

Publisher

OXFORD UNIV PRESS
DOI: 10.1197/jamia.M2862

Funding

  1. NLM NIH HHS [R01 LM007659, R01 LM006910, R01 LM06910, R01 LM008635] Funding Source: Medline
  2. PHITPO CDC HHS [P01 HK000029] Funding Source: Medline
  3. NATIONAL LIBRARY OF MEDICINE [R01LM007659, R01LM006910, R01LM008635] Funding Source: NIH RePORTER

Electronic clinical documentation can be useful for activities such as public health surveillance, quality improvement, and research, but existing methods of de-identification may not provide sufficient protection of patient data. The general-purpose natural language processor MedLEE retains medical concepts while excluding the remaining text, so in addition to processing text into structured data, it may be able to provide a secondary benefit of de-identification. Without modifying the system, the authors tested the ability of MedLEE to remove protected health information (PHI) by comparing 100 outpatient clinical notes with the corresponding XML-tagged output. Of 809 instances of PHI, 26 (3.2%) were detected in the output as a result of processing and identification errors. However, PHI in the output was highly transformed, much of it appearing as normalized terms for medical concepts, potentially making re-identification more difficult. The MedLEE processor may be a good enhancement to other de-identification systems, both removing PHI and providing coded data from clinical text.
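The evaluation described above amounts to checking which annotated PHI instances survive into the processed output and reporting the fraction that do. The following is a minimal sketch of that arithmetic, not the authors' procedure: the function names, the toy note, and the XML-like output string are all hypothetical, and the verbatim substring match is an assumption that would miss PHI appearing in the transformed, normalized form the abstract mentions.

```python
def leaked_phi(phi_instances, output_text):
    """Return the PHI strings that still appear verbatim in the output.

    Assumption: a case-insensitive substring match. Transformed or
    normalized PHI would not be caught by this check.
    """
    lowered = output_text.lower()
    return [phi for phi in phi_instances if phi.lower() in lowered]


def leak_rate(leaked_count, total_count):
    """Percentage of PHI instances that were found in the output."""
    return 100.0 * leaked_count / total_count


# Hypothetical toy example: three annotated PHI strings for one note
# and an invented XML-like output for that note.
phi = ["John Smith", "03/14/2005", "555-0142"]
output = "<problem v='cough'/><date v='03/14/2005'/>"
found = leaked_phi(phi, output)

# The figures reported in the abstract: 26 of 809 PHI instances.
rate_reported = round(leak_rate(26, 809), 1)
print(found, rate_reported)
```

With the abstract's counts, `leak_rate(26, 809)` rounds to the reported 3.2%. A real comparison would need review beyond string matching, since the abstract notes that surviving PHI was often heavily transformed.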
