Article

A fast, resource efficient, and reliable rule-based system for COVID-19 symptom identification

Journal

JAMIA OPEN
Volume 4, Issue 3

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/jamiaopen/ooab070

Keywords

natural language processing; clinical decision support systems; artificial intelligence; information extraction; signs and symptoms; follow-up studies

Funding

  1. Fairview Health Services
  2. National Institutes of Health's National Center for Advancing Translational Sciences [U01TR002062]
  3. National Institutes of Health's National Heart, Lung, and Blood Institute [T32HL07741]
  4. Agency for Healthcare Research and Quality (AHRQ) [R01HS026743]
  5. Patient-Centered Outcomes Research Institute (PCORI) [K12HS026379]
  6. University of Minnesota Office of Academic Clinical Affairs
  7. Division of Health Policy and Management, University of Minnesota School of Public Health
  8. University of Minnesota CTSA [UL1TR000114]

The rule-based gazetteer developed in this study showed superior speed, resource utilization, and performance, providing an effective solution for real-time symptom identification and integration of unstructured data elements into clinical decision support systems. Fine-tuning lexical rules and running on multiple compute nodes were identified as opportunities to further enhance its performance.
Objective: With COVID-19, there was a need for a rapidly scalable annotation system that facilitated real-time integration with clinical decision support systems (CDS). Current annotation systems suffer from high resource utilization and poor scalability, limiting real-world integration with CDS. A potential solution to mitigate these issues is to use the rule-based gazetteer developed at our institution.

Materials and Methods: Performance, resource utilization, and runtime of the rule-based gazetteer were compared with five annotation systems: BioMedICUS, cTAKES, MetaMap, CLAMP, and MedTagger.

Results: This rule-based gazetteer was the fastest, had a low resource footprint, and had similar performance for weighted microaverage and macroaverage measures of precision, recall, and F1-score compared to the other annotation systems.

Discussion: Opportunities to increase its performance include fine-tuning lexical rules for symptom identification. Additionally, it could run on multiple compute nodes for faster runtime.

Conclusion: This rule-based gazetteer overcame key technical limitations, facilitating real-time symptomatology identification for COVID-19 and integration of unstructured data elements into our CDS. It is ideal for large-scale deployment across a wide variety of healthcare settings for surveillance of acute COVID-19 symptoms and for integration into prognostic modeling. Such a system is currently being leveraged for monitoring the progression of postacute sequelae of COVID-19 (PASC) in COVID-19 survivors.

This study conducted the first in-depth analysis and developed a rule-based gazetteer for COVID-19 symptom extraction with the following key features: low processor and memory utilization, faster runtime, and similar weighted microaverage and macroaverage measures for precision, recall, and F1-score compared to industry-standard annotation systems.
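The abstract compares systems on microaveraged and macroaveraged F1-score. As a rough illustration of how the two averages differ, the following minimal sketch computes both from document-level symptom labels. It is not the authors' evaluation code: the function names, the per-document set representation, and the example labels are all assumptions made for illustration, and the weighting scheme used in the paper is not reproduced here.

```python
from collections import Counter

def per_label_counts(gold, pred):
    """Count true positives, false positives, and false negatives per label.

    gold, pred: lists of sets of symptom labels, one set per document
    (a hypothetical representation chosen for this sketch).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        for label in p & g:
            tp[label] += 1
        for label in p - g:
            fp[label] += 1
        for label in g - p:
            fn[label] += 1
    return tp, fp, fn

def f1(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def micro_f1(tp, fp, fn):
    """Pool counts over all labels, then compute a single F1."""
    return f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))

def macro_f1(tp, fp, fn):
    """Compute F1 per label, then take the unweighted mean."""
    labels = set(tp) | set(fp) | set(fn)
    if not labels:
        return 0.0
    return sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)

# Illustrative labels only, not data from the study.
gold = [{"cough", "fever"}, {"cough"}, {"dyspnea"}]
pred = [{"cough"}, {"cough", "fever"}, set()]
counts = per_label_counts(gold, pred)
print(f"micro F1 = {micro_f1(*counts):.3f}")
print(f"macro F1 = {macro_f1(*counts):.3f}")
```

Microaveraging is dominated by frequent labels because counts are pooled before scoring, while macroaveraging gives every label equal weight, so rare symptoms that are missed pull the macro score down more sharply.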
