4.6 Review

Diagnostics and correction of batch effects in large-scale proteomic studies: a tutorial

Journal

MOLECULAR SYSTEMS BIOLOGY
Volume 17, Issue 8

Publisher

WILEY
DOI: 10.15252/msb.202110240

Keywords

batch effects; data analysis; large-scale proteomics; normalization; quantitative proteomics

Funding

  1. European Union [668858]
  2. Swiss State Secretariat for Education, Research and Innovation (SERI) [15.0324-2]
  3. SNF [SNF IZLRZ3_163911]
  4. Personalized Health and Related Technologies (PHRT) strategic focus area of ETH
  5. Swiss National Science Foundation [3100A0-688 107679]
  6. European Research Council [ERC-20140AdG 670821]
  7. NIH [F32GM134599]

This tutorial covers how to assess, normalize, and correct batch effects in proteomic experiments involving hundreds of samples. It reviews methodologies from related fields, offers solutions to proteomics-specific challenges, and compiles a set of techniques for controlling the quality of batch effect adjustment.
Advancements in mass spectrometry-based proteomics have enabled experiments encompassing hundreds of samples. While these large sample sets deliver much-needed statistical power, handling them introduces technical variability known as batch effects. Here, we present a step-by-step protocol for the assessment, normalization, and batch correction of proteomic data. We review established methodologies from related fields and describe solutions specific to proteomic challenges, such as ion intensity drift and missing values in quantitative feature matrices. Finally, we compile a set of techniques that enable control of batch effect adjustment quality. We provide an R package, proBatch, containing functions required for each step of the protocol. We demonstrate the utility of this methodology on five proteomic datasets each encompassing hundreds of samples and consisting of multiple experimental designs. In conclusion, we provide guidelines and tools to make the extraction of true biological signal from large proteomic studies more robust and transparent, ultimately facilitating reliable and reproducible research in clinical proteomics and systems biology.
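As a rough illustration of the normalization and batch correction steps named in the abstract, the following is a minimal Python sketch, not the proBatch API (proBatch is an R package, and none of its functions are reproduced here). The function names `median_normalize` and `center_batches`, the toy data, and the choice of median-based shifts are all assumptions made for illustration. Missing values are kept as `None` rather than imputed, and ion intensity drift within a batch is not addressed.

```python
from statistics import median

def median_normalize(samples):
    """Shift each sample so that its median equals the median of all sample medians.

    `samples` maps a sample name to a list of log-scale intensities, with
    None marking a missing value. Assumes every sample has at least one
    observed value.
    """
    meds = {s: median(v for v in vals if v is not None)
            for s, vals in samples.items()}
    target = median(meds.values())
    return {s: [None if v is None else v - meds[s] + target for v in vals]
            for s, vals in samples.items()}

def center_batches(samples, batch_of):
    """Per feature, shift each batch so its median matches the feature's overall median.

    `batch_of` maps a sample name to a batch label (hypothetical labels here).
    Missing values stay missing; features with no observations are skipped.
    """
    n_features = len(next(iter(samples.values())))
    out = {s: list(vals) for s, vals in samples.items()}
    for j in range(n_features):
        observed = [(s, vals[j]) for s, vals in samples.items()
                    if vals[j] is not None]
        if not observed:
            continue
        overall = median(v for _, v in observed)
        for b in set(batch_of.values()):
            in_batch = [v for s, v in observed if batch_of[s] == b]
            if not in_batch:
                continue
            shift = overall - median(in_batch)
            for s, v in observed:
                if batch_of[s] == b:
                    out[s][j] = v + shift
    return out

# Toy data: four samples, three features, two batches (all values invented).
samples = {
    "s1": [10.0, 12.0, 14.0],
    "s2": [10.5, None, 14.5],
    "s3": [12.0, 14.0, 16.0],
    "s4": [12.5, 14.5, None],
}
batch_of = {"s1": "A", "s2": "A", "s3": "B", "s4": "B"}

corrected = center_batches(median_normalize(samples), batch_of)
```

Normalization is applied before batch centering here so that sample-level shifts are removed first; whether that ordering and these particular estimators suit a given dataset is a judgment the tutorial's diagnostics are meant to inform.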
