Normalization and integration of large-scale metabolomics data using support vector regression

METABOLOMICS (2016)

Journal

METABOLOMICS

Volume 12, Issue 5, Pages -

Publisher

SPRINGER

DOI: 10.1007/s11306-016-1026-5

Keywords

Metabolomics; Data normalization; Data integration; Support vector regression; Quality control

Funding

Interdisciplinary Research Center on Biology and Chemistry (IRCBC)
Chinese Academy of Sciences (CAS)
National Natural Science Foundation of China [21575151, 81573246]
Thousand Youth Talents Program (The Recruitment Program of Global Youth Experts from Chinese government)
Agilent Technologies Thought Leader Award

Abstract

Introduction: Untargeted metabolomics studies for biomarker discovery often involve hundreds to thousands of human samples. Data acquisition for such large sample sets has to be divided into several batches and may span months or even several years. Signal drift of metabolites during data acquisition (intra- and inter-batch) is unavoidable and is a major confounding factor in large-scale metabolomics studies.

Objectives: We aim to develop a data normalization method that reduces unwanted variation and integrates multiple batches in large-scale metabolomics studies prior to statistical analysis.

Methods: We developed a method based on a machine learning algorithm, support vector regression (SVR), for large-scale metabolomics data normalization and integration. An R package named MetNormalizer was developed and provided for data processing using SVR normalization.

Results: After SVR normalization, the proportion of metabolite ion peaks with relative standard deviations (RSDs) below 30% increased to more than 90% of the total peaks, which is much better than other common normalization methods. The reduction of unwanted analytical variation improves the performance of multivariate statistical analyses, both unsupervised and supervised, in terms of classification and prediction accuracy, so that subtle metabolic changes in epidemiological studies can be detected.

Conclusion: SVR normalization effectively removes unwanted intra- and inter-batch variation and is much better than other common normalization methods.
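The RSD criterion reported in the Results can be illustrated with a short Python sketch. It is not the MetNormalizer implementation and does not perform SVR normalization; it only shows how the fraction of peaks with an RSD below 30% might be computed. The function names, the dictionary layout, and the intensity values are hypothetical, and the sketch assumes that RSDs are computed per peak across repeated quality control injections, which the abstract does not state explicitly.

```python
from statistics import mean, stdev


def rsd_percent(intensities):
    """Relative standard deviation (sample SD / mean) as a percentage."""
    m = mean(intensities)
    if m == 0:
        raise ValueError("mean intensity is zero; RSD is undefined")
    return 100.0 * stdev(intensities) / m


def fraction_below(peak_table, threshold=30.0):
    """Fraction of peaks whose RSD is below `threshold` percent."""
    rsds = [rsd_percent(values) for values in peak_table.values()]
    return sum(r < threshold for r in rsds) / len(rsds)


# Hypothetical intensities for three peaks across four QC injections
# (made-up numbers, for illustration only).
qc_peaks = {
    "peak_1": [100.0, 102.0, 98.0, 101.0],
    "peak_2": [50.0, 80.0, 30.0, 65.0],
    "peak_3": [200.0, 210.0, 190.0, 205.0],
}

if __name__ == "__main__":
    for name, values in qc_peaks.items():
        print(f"{name}: RSD = {rsd_percent(values):.1f}%")
    print(f"fraction below 30%: {fraction_below(qc_peaks):.2f}")
```

Running the same calculation on a peak table before and after normalization gives the kind of comparison the Results describe, where a larger fraction of peaks below the threshold indicates less residual analytical variation.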
