
Impact of Missing Value Imputation on Classification for DNA Microarray Gene Expression Data: A Model-Based Study

EURASIP JOURNAL ON BIOINFORMATICS AND SYSTEMS BIOLOGY (2009)

Journal

EURASIP JOURNAL ON BIOINFORMATICS AND SYSTEMS BIOLOGY

Volume -, Issue 1, Pages -

Publisher

SPRINGER INTERNATIONAL PUBLISHING AG

DOI: 10.1155/2009/504069

Category

Mathematical & Computational Biology

Funding

National Science Foundation (NSF), grants CCF-0845407 and CCF-0634794
Partnership for Personalized Medicine

Abstract

Many missing-value (MV) imputation methods have been developed for microarray data, but only a few studies have investigated the relationship between MV imputation and classification accuracy. Furthermore, these studies are problematic in fundamental steps such as MV generation and classifier error estimation. In this work, we carry out a model-based study that addresses some of the issues in previous studies. Six popular imputation algorithms, two feature selection methods, and three classification rules are considered. The results suggest that it is beneficial to apply MV imputation when the noise level is high, variance is small, or gene-cluster correlation is strong, under small to moderate MV rates. In these cases, if data quality metrics are available, then it may be helpful to consider the data point with poor quality as missing and apply one of the most robust imputation algorithms to estimate the true signal based on the available high-quality data points. However, at large MV rates, we conclude that imputation methods are not recommended. Regarding the MV rate, our results indicate the presence of a peaking phenomenon: performance of imputation methods actually improves initially as the MV rate increases, but after an optimum point, performance quickly deteriorates with increasing MV rates. Copyright (C) 2009 Youting Sun et al.
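The experimental loop the abstract describes (generate MVs at a chosen rate, impute, then measure the effect) can be sketched in a few lines. The example below is only an illustration under stated assumptions: the abstract does not name the six imputation algorithms or the MV-generation scheme, so a simple k-nearest-neighbor fill and uniformly random masking stand in for them, and the error shown is imputation RMSE rather than the classification error the paper studies. All function names are hypothetical.

```python
import math
import random


def inject_missing(matrix, rate, seed=0):
    """Copy a genes-by-samples matrix, setting each entry to None with probability `rate`.

    Assumption: entries go missing independently and uniformly at random.
    """
    rng = random.Random(seed)
    return [[None if rng.random() < rate else v for v in row] for row in matrix]


def row_distance(a, b):
    """RMS difference over positions observed in both rows; inf if they share none."""
    diffs = [(x - y) ** 2 for x, y in zip(a, b) if x is not None and y is not None]
    return math.sqrt(sum(diffs) / len(diffs)) if diffs else math.inf


def knn_impute(matrix, k=3):
    """Fill each None with the mean of that column over the k nearest rows observing it."""
    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbors: other rows (genes) where column j is observed.
            # Rows with no overlap get distance inf and therefore sort last.
            candidates = sorted(
                (row_distance(row, other), other[j])
                for m, other in enumerate(matrix)
                if m != i and other[j] is not None
            )
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
            else:
                # No row observes this column: fall back to the row mean, or 0.0.
                observed = [v for v in row if v is not None]
                filled[i][j] = sum(observed) / len(observed) if observed else 0.0
    return filled


def imputation_rmse(truth, masked, filled):
    """RMSE between imputed and true values, computed over the masked entries only."""
    errs = [
        (f - t) ** 2
        for t_row, m_row, f_row in zip(truth, masked, filled)
        for t, m, f in zip(t_row, m_row, f_row)
        if m is None
    ]
    return math.sqrt(sum(errs) / len(errs)) if errs else 0.0


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic data: 30 genes x 12 samples, a per-gene base level plus noise.
    truth = []
    for _ in range(30):
        base = rng.gauss(0.0, 1.0)
        truth.append([base + rng.gauss(0.0, 0.3) for _ in range(12)])
    for rate in (0.05, 0.1, 0.2, 0.4):
        masked = inject_missing(truth, rate, seed=1)
        filled = knn_impute(masked, k=3)
        print(f"MV rate {rate:.2f}: RMSE {imputation_rmse(truth, masked, filled):.3f}")
```

Extending the sketch to the paper's actual question would mean training a classifier on the imputed matrix and estimating its error at each MV rate; that step is omitted here because the abstract does not specify the classification rules or the error estimator.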
