Review

Statistical data processing in clinical proteomics

JOURNAL OF CHROMATOGRAPHY B-ANALYTICAL TECHNOLOGIES IN THE BIOMEDICAL AND LIFE SCIENCES (2008)

Journal

JOURNAL OF CHROMATOGRAPHY B-ANALYTICAL TECHNOLOGIES IN THE BIOMEDICAL AND LIFE SCIENCES

Volume 866, Issue 1-2, Pages 77-88

Publisher

ELSEVIER

DOI: 10.1016/j.jchromb.2007.10.042

Keywords

statistical validation; permutation test; classification; biomarker discovery; double cross-validation; feature selection; curse of dimensionality; multivariate data analysis; proteomics

Categories

Biochemical Research Methods; Chemistry, Analytical

Abstract

This review discusses data analysis strategies for the discovery of biomarkers in clinical proteomics. Proteomics studies produce large amounts of data, characterized by few samples on which many variables are measured. A wealth of classification methods exists for extracting information from the data. Feature selection plays an important role in reducing the dimensionality of the data prior to classification and in discovering biomarker leads. Which classification strategy works best remains an open question. Validation is a crucial step in moving biomarker leads towards clinical use. Here we only discuss statistical validation, recognizing that biological and clinical validation is of utmost importance. First, there is the need for validated model selection to develop a generalized classifier that predicts new samples correctly. A cross-validation loop that is wrapped around the model development procedure assesses the performance using unseen data. The significance of the model should be tested; we use permutations of the data for comparison with uninformative data. This procedure also tests the correctness of the performance validation. Preferably, a new set of samples is measured to test the classifier and rule out results specific to a machine, analyst, laboratory or the first set of samples. This is not yet standard practice. We present a modular framework that combines feature selection, classification, biomarker discovery and statistical validation; these data analysis aspects are all discussed in this review. The feature selection, classification and biomarker discovery modules can be incorporated or omitted according to the researcher's preference. The validation modules, however, should not be optional. In each module, the researcher can select from a wide range of methods, since there is not one unique way that leads to the correct model and proper validation. We discuss many possibilities for feature selection, classification and biomarker discovery. For validation we advise a combination of cross-validation and permutation testing, a validation strategy supported in the literature. © 2007 Elsevier B.V. All rights reserved.
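The validation strategy recommended in the abstract, a cross-validation loop wrapped around the whole model development procedure followed by a permutation test, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the review: the toy data generator, the mean-difference feature ranking, the nearest-centroid classifier, the grid of feature counts `k_grid`, and all function names are hypothetical choices. The sketch also assumes two classes labelled 0 and 1 and that both classes appear in every training fold.

```python
import random

def _mean(values):
    values = list(values)
    return sum(values) / len(values)

def select_features(X, y, k):
    """Rank features by absolute difference of class means; keep the top k."""
    def score(j):
        m0 = _mean(row[j] for row, lab in zip(X, y) if lab == 0)
        m1 = _mean(row[j] for row, lab in zip(X, y) if lab == 1)
        return abs(m1 - m0)
    return sorted(range(len(X[0])), key=score, reverse=True)[:k]

def fit_predict(X_tr, y_tr, X_te, k):
    """Select features and fit a nearest-centroid rule on training data only."""
    feats = select_features(X_tr, y_tr, k)
    cents = {c: [_mean(r[j] for r, lab in zip(X_tr, y_tr) if lab == c)
                 for j in feats] for c in (0, 1)}
    def nearest(row):
        return min(cents, key=lambda c: sum((row[j] - m) ** 2
                                            for j, m in zip(feats, cents[c])))
    return [nearest(row) for row in X_te]

def make_folds(n, n_folds, rng):
    """Shuffle sample indices and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]

def split(X, y, test_idx):
    held = set(test_idx)
    train = [i for i in range(len(X)) if i not in held]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test_idx], [y[i] for i in test_idx])

def cv_accuracy(X, y, k, n_folds, rng):
    """Single cross-validation; feature selection is redone in every fold."""
    hits = 0
    for test_idx in make_folds(len(X), n_folds, rng):
        X_tr, y_tr, X_te, y_te = split(X, y, test_idx)
        preds = fit_predict(X_tr, y_tr, X_te, k)
        hits += sum(p == t for p, t in zip(preds, y_te))
    return hits / len(X)

def double_cv_accuracy(X, y, k_grid, rng, n_outer=5, n_inner=4):
    """Outer loop estimates performance; inner loop picks k on training data."""
    hits = 0
    for test_idx in make_folds(len(X), n_outer, rng):
        X_tr, y_tr, X_te, y_te = split(X, y, test_idx)
        # Model selection sees only the outer-training samples.
        best_k = max(k_grid,
                     key=lambda k: cv_accuracy(X_tr, y_tr, k, n_inner, rng))
        preds = fit_predict(X_tr, y_tr, X_te, best_k)
        hits += sum(p == t for p, t in zip(preds, y_te))
    return hits / len(X)

def permutation_test(X, y, k_grid, rng, n_perm=19):
    """Compare the observed accuracy with accuracies on label-permuted data."""
    observed = double_cv_accuracy(X, y, k_grid, rng)
    as_good = 0
    for _ in range(n_perm):
        y_perm = list(y)
        rng.shuffle(y_perm)  # destroys any real link between X and y
        if double_cv_accuracy(X, y_perm, k_grid, rng) >= observed:
            as_good += 1
    return observed, (as_good + 1) / (n_perm + 1)

def make_toy_data(rng, n_per_class=20, n_features=50, n_informative=3, shift=2.5):
    """Synthetic 'few samples, many variables' data; purely illustrative."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for j in range(n_informative):
                row[j] += shift * label
            X.append(row)
            y.append(label)
    return X, y
```

A call such as `permutation_test(X, y, k_grid=(2, 5, 10), rng=random.Random(0))` on data from `make_toy_data` returns the double cross-validated accuracy and an empirical p-value. Because feature selection and the choice of `k` are repeated inside each training fold, the permuted-label runs should score near chance; if they do not, information is leaking from the test samples into model development, which is the sense in which the permutation procedure also checks the performance validation itself. In practice, stratified folds and many more permutations would be preferable to the small defaults used here.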
