Article

Evaluating feature-selection stability in next-generation proteomics

JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY (2016)

Journal

JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY

Volume 14, Issue 5, Pages -

Publisher

IMPERIAL COLLEGE PRESS

DOI: 10.1142/S0219720016500293

Keywords

Proteomics; networks; biostatistics; translational research

Categories

Biochemical Research Methods; Computer Science, Interdisciplinary Applications; Mathematical & Computational Biology

Funding

Tianjin University, China
Singapore Ministry of Education tier-2 grant [MOE2012-T2-1-061]

Abstract

Identifying reproducible yet relevant features is a major challenge in biological research, and one that is well documented for genomics data. Using a proposed set of three reliability benchmarks, we find that the issue also exists in proteomics for commonly used feature-selection methods, e.g. the t-test and recursive feature elimination. Moreover, due to high test variability, selecting the top proteins based on p-value ranks does not improve reproducibility, even when restricted to high-abundance proteins. Statistical testing based on networks is believed to be more robust, but this does not always hold true: the commonly used hypergeometric enrichment test, which tests for enrichment of protein subnets, performs abysmally due to its dependence on unstable protein pre-selection steps. We demonstrate here for the first time the utility on proteomics of a novel suite of network-based algorithms called rank-based network algorithms (RBNAs). These were originally introduced and tested extensively on genomics data. We show here that they are highly stable and reproducible, and select relevant features when applied to proteomics data. These results also make evident that statistical feature testing on protein expression data should be used with due caution. Careless use of networks does not resolve poor-performance issues, and can even mislead. We recommend augmenting statistical feature-selection methods with concurrent analysis of stability and reproducibility to improve the quality of the selected features prior to experimental validation.
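To make the idea of "concurrent analysis of stability" concrete, the following is a minimal sketch and not the authors' method: the abstract does not define its three reliability benchmarks, so this example assumes one common stand-in, the mean pairwise Jaccard overlap of the top-k features selected on random subsamples. Features are ranked by the absolute Welch t-statistic as a simple proxy for t-test-based selection. All function names and parameters (`welch_t`, `top_k_features`, `selection_stability`, `subsample_size`) are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples (unequal variances)."""
    se2 = variance(a) / len(a) + variance(b) / len(b)
    return (mean(a) - mean(b)) / se2 ** 0.5 if se2 > 0 else 0.0


def top_k_features(group_a, group_b, k):
    """Indices of the k features with the largest |t|.

    group_a and group_b are lists of samples; each sample is a list of
    feature (e.g. protein abundance) values in the same column order.
    """
    n_features = len(group_a[0])
    scores = []
    for j in range(n_features):
        t = welch_t([s[j] for s in group_a], [s[j] for s in group_b])
        scores.append((abs(t), j))
    scores.sort(reverse=True)
    return {j for _, j in scores[:k]}


def jaccard(x, y):
    """Jaccard overlap of two sets (1.0 means identical selections)."""
    union = x | y
    return len(x & y) / len(union) if union else 1.0


def selection_stability(group_a, group_b, k, subsample_size,
                        n_resamples=20, seed=0):
    """Mean pairwise Jaccard overlap of top-k selections over subsamples.

    Assumption: stability is measured by drawing subsample_size samples
    (without replacement) from each group, n_resamples times.
    """
    rng = random.Random(seed)
    selections = []
    for _ in range(n_resamples):
        sub_a = rng.sample(group_a, subsample_size)
        sub_b = rng.sample(group_b, subsample_size)
        selections.append(top_k_features(sub_a, sub_b, k))
    return mean(jaccard(x, y) for x, y in combinations(selections, 2))
```

A score near 1 indicates that the same features are selected regardless of which samples are drawn; a score near 0 indicates that the selection is driven largely by sampling noise, which is the failure mode the abstract describes for p-value-based ranking.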
