Article

Domain knowledge-enhanced variable selection for biomedical data analysis

INFORMATION SCIENCES (2022)

Journal

INFORMATION SCIENCES

Volume 606, Pages 469-488

Publisher

ELSEVIER SCIENCE INC

DOI: 10.1016/j.ins.2022.05.076

Keywords

Biomedical data mining; Variable selection; Domain knowledge; Insufficient samples

Category

Computer Science, Information Systems

Funding

National Key Research and Development Program of China [2021ZD0111700]
National Natural Science Foundation of China [62137002, 62176245, 62006065]
Key Research and Development Program of Anhui Province [202104a05020011]
Key Science and Technology Special Project of Anhui Province [202103a07020002]
Artificial Intelligence Social Experiment of Anhui Provincial Health Commission
Fundamental Research Funds for the Central Universities

Summary

Machine learning has been successful in analyzing biomedical data. However, the scarcity of samples in the biomedical field challenges traditional variable selection algorithms. This paper proposes a method that uses domain knowledge to overcome this issue and demonstrates its effectiveness.

Abstract

Machine learning has achieved impressive results in biomedical data analysis. To cope with high-dimensional data, variable selection identifies patterns in the feature space and selects informative, predictive variables. However, because cases are scarce and sampling is expensive in the biomedical area, the lack of samples has become the main obstacle to improving traditional variable selection algorithms on biomedical data. In this paper, we address this problem by leveraging domain knowledge, an approach particularly suited to biomedicine, where domain knowledge is abundant and reliable. Nevertheless, our empirical study shows that injecting domain knowledge by brute force can be counterproductive, especially when some of the knowledge is unfaithful. To incorporate domain knowledge into the variable selection framework in a principled way, this paper starts from the joint likelihood function of the discriminative model and derives an extended form of the prior knowledge term based on the existing variable selection framework. Based on this derivation, a novel method is presented. We prove, and substantiate with both synthetic and real-world biomedical data, that the proposed method effectively uses the information in domain knowledge to assist variable selection while rejecting incorrect knowledge. (C) 2022 Published by Elsevier Inc.
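As a rough illustration of how a prior knowledge term can enter a variable selection objective, the sketch below fits a Lasso whose per-feature L1 penalty is relaxed for variables that domain knowledge endorses. This is not the method proposed in the paper, which the abstract does not specify in detail; it is a generic knowledge-weighted Lasso solved by proximal gradient descent. The function names, the `knowledge` score vector, and the `trust` parameter are all assumptions made for this example. In particular, the sketch has no explicit mechanism for rejecting incorrect knowledge: a wrongly endorsed variable is only held back by its weak fit to the data.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of the (weighted) L1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def knowledge_weighted_lasso(X, y, knowledge, lam=0.3, trust=0.5, n_iter=500):
    """Minimise (1/2n)||y - Xw||^2 + sum_j p_j |w_j| by proximal gradient.

    knowledge: hypothetical scores in [0, 1], 1 = "experts believe this
               variable matters". trust in [0, 1) controls how strongly
               the scores relax the penalty: p_j = lam * (1 - trust * k_j).
    """
    n, d = X.shape
    penalties = lam * (1.0 - trust * knowledge)
    # Step size 1/L, where L = sigma_max(X)^2 / n is the Lipschitz
    # constant of the gradient of the squared-error term.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = soft_threshold(w - step * grad, step * penalties)
    return w


# Synthetic small-sample, high-dimensional example: 40 samples, 100 variables.
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 100))
w_true = np.zeros(100)
w_true[:3] = [2.0, -2.0, 2.0]
y = X @ w_true + 0.1 * rng.standard_normal(40)

knowledge = np.zeros(100)
knowledge[:3] = 1.0   # correct knowledge: the truly relevant variables
knowledge[50] = 1.0   # incorrect knowledge: an irrelevant variable

w = knowledge_weighted_lasso(X, y, knowledge)
selected = np.flatnonzero(np.abs(w) > 1e-8)
print("selected variables:", selected)
```

A per-feature L1 penalty corresponds to placing a Laplace prior with a feature-specific scale on each coefficient, which is one common way to read "prior knowledge term" in a likelihood-based formulation; other encodings are equally possible.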
