4.7 Article

Workflow for Integrated Processing of Multicohort Untargeted 1H NMR Metabolomics Data in Large-Scale Metabolic Epidemiology

Journal

JOURNAL OF PROTEOME RESEARCH
Volume 15, Issue 12, Pages 4188-4194

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jproteome.6b00125

Keywords

metabolomics; NMR; preprocessing; normalization; alignment; quality control; multicohort; epidemiology; large scale

Funding

  1. EU [305422, 654241, 312941]
  2. UK MEDical Bioinformatics partnership programme (UK MED-BIO) - Medical Research Council (MRC)
  3. National Phenome Centre - MRC
  4. National Institute for Health Research (NIHR)
  5. NIHR - Health Protection Research Unit (HPRU) on Health Impacts of Environmental Hazards
  6. MRC-PHE
  7. NIHR Biomedical Research Centre at Imperial College Healthcare NHS Trust and Imperial College London
  8. NIHR-HPRU on Health Effects of Environmental Hazards
  9. MRC [MR/L01341X/1, MR/L01632X/1, MC_PC_12025] Funding Source: UKRI
  10. Medical Research Council [MR/L01341X/1, MR/L01632X/1, MC_PC_12025] Funding Source: researchfish
  11. National Institute for Health Research [NF-SI-0611-10136] Funding Source: researchfish

Large-scale metabolomics studies involving thousands of samples present multiple challenges in data analysis, particularly when an untargeted platform is used. Studies with multiple cohorts and analysis platforms exacerbate existing problems such as peak alignment and normalization. Therefore, there is a need for robust processing pipelines that can ensure reliable data for statistical analysis. The COMBI-BIO project incorporates serum from approximately 8000 individuals, in three cohorts, profiled by six assays in two phases using both 1H NMR and UPLC-MS. Here we present the COMBI-BIO NMR analysis pipeline and demonstrate its fitness for purpose using representative quality control (QC) samples. NMR spectra were first aligned and normalized. After eliminating interfering signals, outliers identified using Hotelling's T2 were removed and a cohort/phase adjustment was applied, resulting in two NMR data sets (CPMG and NOESY). Alignment of the NMR data was shown to increase the correlation-based alignment quality measure from 0.319 to 0.391 for CPMG and from 0.536 to 0.586 for NOESY, showing that the improvement was present across both large and small peaks. End-to-end quality assessment of the pipeline was achieved using Hotelling's T2 distributions. For CPMG spectra, the interquartile range decreased from 1.425 in raw QC data to 0.679 in processed spectra, while the corresponding change for NOESY spectra was from 0.795 to 0.636, indicating an improvement in precision following processing. PCA indicated that gross phase and cohort differences were no longer present. These results illustrate that the pipeline produces robust and reproducible data, successfully addressing the methodological challenges of this large multifaceted study.
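As a rough illustration of the two statistics the abstract relies on, the sketch below computes a per-sample Hotelling's T2 from the leading PCA scores of a spectral matrix and then the interquartile range of the T2 values. It is not the COMBI-BIO implementation: the function names (`hotelling_t2`, `iqr`), the number of components, the synthetic data, and the 95th-percentile cutoff are all assumptions made for illustration, and the abstract does not state which outlier criterion the authors used.

```python
import numpy as np

def hotelling_t2(X, n_components=2):
    """Per-sample Hotelling's T2 from the leading PCA scores.

    X is a (samples x variables) matrix, e.g. binned NMR intensities.
    n_components is an illustrative choice, not a value from the paper.
    """
    Xc = X - X.mean(axis=0)                      # mean-center each variable
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    k = n_components
    scores = U[:, :k] * s[:k]                    # PCA scores for k components
    score_var = s[:k] ** 2 / (X.shape[0] - 1)    # variance of each score column
    return np.sum(scores ** 2 / score_var, axis=1)

def iqr(values):
    """Interquartile range (75th minus 25th percentile)."""
    q1, q3 = np.percentile(values, [25, 75])
    return q3 - q1

# Synthetic stand-in for QC spectra: 50 samples x 200 variables.
rng = np.random.default_rng(0)
spectra = rng.normal(size=(50, 200))
spectra[0] += 8.0                                # hypothetical gross outlier

t2 = hotelling_t2(spectra, n_components=2)

# Hypothetical cutoff: flag samples above the 95th percentile of T2.
cutoff = np.percentile(t2, 95)
outliers = np.flatnonzero(t2 > cutoff)

print("flagged samples:", outliers)
print("IQR of T2:", iqr(t2))
```

A tighter T2 distribution (smaller interquartile range) among repeated QC samples is what the abstract interprets as improved precision after processing; in practice an F-distribution-based limit is a common alternative to the percentile cutoff used here.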
