Review

Quality indices for topic model selection and evaluation: a literature review and case study

BMC MEDICAL INFORMATICS AND DECISION MAKING (2023)

Journal

BMC MEDICAL INFORMATICS AND DECISION MAKING

Volume 23, Issue 1

Publisher

BMC

DOI: 10.1186/s12911-023-02216-1

Keywords

Non-negative matrix factorization; Topic model; Internal validation; Cross-validation; Stability analysis; Clinical text data; Electronic medical record

Category

Medical Informatics

Summary

This study reviews several methods for assessing the quality of unsupervised topic models and discusses their advantages and disadvantages. Using several metrics alongside human judgement, it finds that different quality indices favor models of different complexity and therefore lead to different model selections.

Abstract

Background: Topic models are a class of unsupervised machine learning models that facilitate summarization, browsing and retrieval from large unstructured document collections. This study reviews several methods for assessing the quality of unsupervised topic models estimated using non-negative matrix factorization. Techniques for topic model validation have been developed across disparate fields. We synthesize this literature, discuss the advantages and disadvantages of different techniques for topic model validation, and illustrate their usefulness for guiding model selection on a large clinical text corpus.

Design, setting and data: Using a retrospective cohort design, we curated a text corpus containing 382,666 clinical notes collected between 01/01/2017 and 12/31/2020 from primary care electronic medical records in Toronto, Canada.

Methods: Several topic model quality metrics have been proposed to assess different aspects of model fit. We explored the following metrics: reconstruction error, topic coherence, rank biased overlap, Kendall's weighted tau, partition coefficient, partition entropy and the Xie-Beni statistic. Depending on context, cross-validation and/or bootstrap stability analysis were used to estimate these metrics on our corpus.

Results: Cross-validated reconstruction error favored large topic models (K >= 100 topics) on our corpus. Stability analysis using topic coherence and the Xie-Beni statistic also favored large models (K = 100 topics). Rank biased overlap and Kendall's weighted tau favored small models (K = 5 topics). Few model evaluation metrics suggested mid-sized topic models (25 <= K <= 75) as being optimal. However, human judgement suggested that mid-sized topic models produced expressive low-dimensional summarizations of the corpus.

Conclusions: Topic model quality indices are transparent quantitative tools for guiding model selection and evaluation. Our empirical illustration demonstrated that different topic model quality indices favor models of different complexity, and that they may not select models aligning with human judgment. This suggests that different metrics capture different aspects of model goodness of fit. A combination of topic model quality indices, coupled with human validation, may be useful in appraising unsupervised topic models.
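To make two of the metrics named in the abstract concrete, the following is a minimal sketch of the partition coefficient and partition entropy. The abstract does not give the paper's formulas, so this sketch assumes the standard fuzzy-clustering definitions (Bezdek): for an N x K membership matrix U with rows summing to 1, PC = (1/N) Σ u_ik² and PE = -(1/N) Σ u_ik ln u_ik. Row-normalizing non-negative document-topic weights to obtain memberships is also an assumption, and the function names and toy matrix are hypothetical.

```python
import math


def normalize_rows(weights):
    """Scale each document's non-negative topic weights so they sum to 1.

    Assumption: memberships are obtained by row-normalizing the
    document-topic weight matrix; the paper may define them differently.
    """
    memberships = []
    for row in weights:
        total = sum(row)
        if total <= 0:
            raise ValueError("each document needs at least one positive topic weight")
        memberships.append([w / total for w in row])
    return memberships


def partition_coefficient(memberships):
    """Mean over documents of the sum of squared memberships.

    Ranges from 1/K (every document spread evenly over K topics)
    to 1 (every document assigned to a single topic).
    """
    n = len(memberships)
    return sum(u * u for row in memberships for u in row) / n


def partition_entropy(memberships):
    """Mean over documents of the Shannon entropy of the memberships.

    Ranges from 0 (crisp assignment) to ln(K) (evenly spread).
    Zero memberships contribute nothing, by the convention 0 * ln(0) = 0.
    """
    n = len(memberships)
    return -sum(u * math.log(u) for row in memberships for u in row if u > 0) / n


# Hypothetical document-topic weights (3 documents, 2 topics), for illustration only.
W = [[2.0, 0.0], [1.0, 1.0], [0.0, 3.0]]
U = normalize_rows(W)
pc = partition_coefficient(U)
pe = partition_entropy(U)
print(f"partition coefficient = {pc:.4f}, partition entropy = {pe:.4f}")
```

Under these definitions a higher partition coefficient and a lower partition entropy both indicate crisper document-to-topic assignments, so the two indices would normally be read together when comparing models fitted with different numbers of topics.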
