4.6 Article

Non-negative matrix factorization temporal topic models and clinical text data identify COVID-19 pandemic effects on primary healthcare and community health in Toronto, Canada

Journal

JOURNAL OF BIOMEDICAL INFORMATICS
Volume 128, Issue -, Pages -

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jbi.2022.104034

Keywords

COVID-19; Electronic medical record; Text mining; Temporal topic modelling; Non-negative matrix factorization; Time series analysis

Funding

  1. Canadian Institutes of Health Research (CIHR) [FDN 143303]

This study uses non-negative matrix factorization to learn a temporal topic model that characterizes the diverse effects of the COVID-19 pandemic on the physical, mental, and social health of residents of Toronto, Canada. Analyzing a large collection of primary care clinical notes, the study uncovers many pandemic-related effects, including direct effects on patient health and indirect effects on mental health, sleep, social dynamics, and healthcare utilization. The study also identifies changes in primary care practice patterns resulting from the pandemic, such as changes in electronic medical record (EMR) documentation strategies and the uptake of telemedicine.
Objective: To demonstrate how non-negative matrix factorization can be used to learn a temporal topic model over a large collection of primary care clinical notes, characterizing diverse COVID-19 pandemic effects on the physical/mental/social health of residents of Toronto, Canada.
Materials and Methods: The study employs a retrospective open cohort design, consisting of 382,666 primary care progress notes from 44,828 patients, 54 physicians, and 12 clinics, collected from 1 January 2017 through 31 December 2020. Non-negative matrix factorization uncovers a meaningful latent topical structure permeating the corpus of primary care notes. The learned latent topical basis is transformed into a multivariate time series data structure. Time series methods and plots showcase the evolution/dynamics of learned topics over the study period and allow the identification of COVID-19 pandemic effects. We perform several post-hoc checks of model robustness to increase trust that descriptive/unsupervised inferences are stable over hyper-parameter configurations and/or data perturbations.
Results: Temporal topic modelling uncovers a myriad of pandemic-related effects from the expressive clinical text data. In terms of direct effects on patient health, topics encoding respiratory disease symptoms display altered dynamics during the pandemic year. Further, the pandemic was associated with a multitude of indirect patient-level effects on topical domains representing mental health, sleep, social and familial dynamics, measurement of vitals/labs, uptake of prevention/screening maneuvers, and referrals to medical specialists. Finally, topic models capture changes in primary care practice patterns resulting from the pandemic, including changes in EMR documentation strategies and the uptake of telemedicine.
Conclusion: Temporal topic modelling applied to a large corpus of rich primary care clinical text data can identify a meaningful topical/thematic summarization, which can provide policymakers and public health stakeholders with a passive, cost-effective technology for understanding holistic impacts of the COVID-19 pandemic on the primary healthcare system and community/public health.
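
The pipeline described in Materials and Methods (factorize a note-by-term matrix, then turn the note-level topic weights into a multivariate time series) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the toy notes and month indices are invented, the solver is the standard Lee-Seung multiplicative-update rule for the Frobenius objective (the abstract does not say which solver was used), and the time series is formed by averaging normalized topic weights per month (the abstract does not specify the aggregation).

```python
import numpy as np

def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate non-negative V (notes x terms) as W @ H with k topics,
    using Lee-Seung multiplicative updates (an assumed solver)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps   # note-by-topic weights
    H = rng.random((k, m)) + eps   # topic-by-term weights
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy corpus: (month index, note text). Not real clinical data.
notes = [
    (0, "cough fever cough"),
    (0, "anxiety sleep"),
    (1, "cough fever"),
    (1, "anxiety sleep anxiety"),
    (2, "anxiety sleep sleep"),
    (2, "anxiety anxiety sleep"),
]

# Build a simple term-count matrix V (notes x vocabulary).
vocab = sorted({w for _, text in notes for w in text.split()})
index = {w: j for j, w in enumerate(vocab)}
V = np.zeros((len(notes), len(vocab)))
for i, (_, text) in enumerate(notes):
    for w in text.split():
        V[i, index[w]] += 1.0

W, H = nmf(V, k=2)

# Normalize each note's topic weights to sum to 1, then average within each
# month to obtain one row per month and one column per topic.
W_norm = W / W.sum(axis=1, keepdims=True)
months = np.array([m for m, _ in notes])
n_months = int(months.max()) + 1
series = np.vstack([W_norm[months == t].mean(axis=0) for t in range(n_months)])

# Show the highest-weight terms per topic and the monthly topic series.
for topic_id, row in enumerate(H):
    top_terms = [vocab[j] for j in np.argsort(row)[::-1][:2]]
    print(f"topic {topic_id}: {top_terms}")
print(series.round(2))
```

Each column of `series` traces one topic over time, which is the kind of structure that time series plots and methods can then be applied to. On a real corpus one would more likely use a weighted term matrix and a library solver, and would repeat the fit across seeds and topic counts to check stability, in the spirit of the post-hoc robustness checks mentioned above.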
