Article

Non-targeted UHPLC-MS metabolomic data processing methods: a comparative investigation of normalisation, missing value imputation, transformation and scaling

METABOLOMICS (2016)

Journal

METABOLOMICS

Volume 12, Issue 5, Pages -

Publisher

SPRINGER

DOI: 10.1007/s11306-016-1030-9

Keywords

UHPLC-MS; Metabolomics; Random forest; KNN; PQN normalisation; Glog transformation

Category

Endocrinology & Metabolism

Funding

Wellcome Trust ISSF award at the University of Birmingham
UK Natural Environment Research Council's (NERC) Biomolecular Analysis Facility (NBAF) at the University of Birmingham [R8-H10-61]
MRC
NERC
Thermo Scientific [NE/J017442/1]
MRC [MR/M009157/1] Funding Source: UKRI

Abstract

Introduction: The generic metabolomics data processing workflow consists of a series of processes including peak picking, quality assurance, normalisation, missing value imputation, transformation and scaling. The combination of these processes should present the experimental data in an appropriate structure so as to identify the biological changes in a valid and robust manner.

Objectives: Currently, different researchers apply different data processing methods, and no assessment of the permutations applied to UHPLC-MS datasets has been published. Here we wish to define the most appropriate data processing workflow.

Methods: We assess the influence of normalisation, missing value imputation, transformation and scaling methods on univariate and multivariate analysis of UHPLC-MS datasets acquired for different mammalian samples.

Results: Our studies have shown that once data are filtered, missing values are not correlated with m/z, retention time or response. Following an exhaustive evaluation, we recommend PQN normalisation with no missing value imputation and no transformation or scaling for univariate analysis. For PCA we recommend PQN normalisation with Random Forest missing value imputation, glog transformation and no scaling. For PLS-DA we recommend PQN normalisation, KNN missing value imputation, glog (generalised logarithm) transformation and no scaling. These recommendations are based on searching for the biologically important metabolite features independent of their measured abundance.

Conclusion: The appropriate choice of normalisation, missing value imputation, transformation and scaling methods differs depending on the data analysis method, and the choice of method is essential to maximise the biological derivations from UHPLC-MS datasets.
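To make two of the recommended steps concrete, the following is a minimal Python sketch of PQN (probabilistic quotient) normalisation and a glog transformation. It is an illustration under stated assumptions, not the authors' implementation: the function names `pqn_normalise` and `glog` are hypothetical, the reference profile is taken as the per-feature median across all samples (a QC-sample median is another common choice), missing values are represented as `None` and left unimputed, and the glog form ln(x + sqrt(x^2 + lambda)) with a fixed `lam` is one common variant (in practice lambda is usually optimised from the data rather than fixed).

```python
import math
from statistics import median


def pqn_normalise(samples):
    """Probabilistic quotient normalisation of a samples x features table.

    `samples` is a list of rows (one per sample); missing values are None
    and are skipped, not imputed. Assumes every row has the same length.
    """
    n_features = len(samples[0])

    # Reference profile: per-feature median over the observed values.
    # Assumption: all samples contribute; QC samples could be used instead.
    reference = []
    for j in range(n_features):
        observed = [row[j] for row in samples if row[j] is not None]
        reference.append(median(observed) if observed else None)

    normalised = []
    for row in samples:
        # Quotient of each observed feature against the reference profile.
        quotients = [
            x / ref
            for x, ref in zip(row, reference)
            if x is not None and ref is not None and ref > 0
        ]
        # The median quotient estimates the sample's overall dilution factor.
        factor = median(quotients)
        normalised.append([None if x is None else x / factor for x in row])
    return normalised


def glog(x, lam=1.0):
    """Generalised logarithm, here in the form ln(x + sqrt(x^2 + lam)).

    `lam` is a hypothetical fixed value used only for illustration.
    """
    return math.log(x + math.sqrt(x * x + lam))


# Toy data: the second sample is the first at twice the concentration,
# and the third sample has one missing value.
toy = [
    [10.0, 20.0, 30.0, 40.0],
    [20.0, 40.0, 60.0, 80.0],
    [10.0, 20.0, None, 40.0],
]
normalised = pqn_normalise(toy)
transformed = [
    [None if x is None else glog(x) for x in row] for row in normalised
]
```

On the toy data, PQN rescales the second sample so that it matches the first, while the missing value in the third sample stays missing, in line with the no-imputation recommendation for univariate analysis. Unlike a plain logarithm, this glog form is defined at zero, which is one reason it is used for low-abundance features.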
