Article Data Paper

Dataset containing physiological amounts of spike-in proteins into murine C2C12 background as a ground truth quantitative LC-MS/MS reference

Journal

DATA IN BRIEF
Volume 43, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.dib.2022.108435

Keywords

Proteomics; Mass spectrometry; Protein spike-in dataset; Complex proteomics standard; Quantitative ground truth dataset; C2C12 cell line

Funding

  1. FoRUM Program of Ruhr University Bochum [F871-2016]
  2. Bundesministerium für Bildung und Forschung (BMBF) [FKZ 031 A 534A]
  3. PURE (Protein Unit for Research in Europe), a project of North Rhine-Westphalia, Germany
  4. European Union H2020 project NISCI [681094]
  5. Internal Security Fund (ISF)
  6. European Union

In this article, a data dependent acquisition (DDA) dataset is presented, which serves as a reference and ground truth quantitative dataset. The dataset can be used as a benchmark reference for any workflows working on DDA data and has the potential to be valuable in comparing samples measured with DDA and data independent acquisition (DIA). The dataset consists of 15 LC-MS/MS measurements with five distinct spike-in states, each with three replicates.
In this article, we present a data dependent acquisition (DDA) dataset which was generated as a reference and ground truth quantitative dataset. While initially used to compare samples measured with DDA and data independent acquisition (DIA) (Barkovits et al., 2020), the presented dataset holds potential value as a benchmark reference for any workflows working on DDA data. The entire dataset consists of 15 LC-MS/MS measurements composed of five distinct spike-in states, each with three replicates. To generate the dataset, a C2C12 (immortalized mouse myoblast) cell lysate was used as a complex background for five different states, which were simulated by spiking in 13 defined proteins at different concentrations. For this purpose, the cell lysate was used in a constant amount of 20 µg for all samples, and different amounts of the 13 selected proteins ranging from 0.1 to 10 pmol were added, reflecting physiological amounts of proteins. Afterwards, all samples were tryptically digested using the same method. From each sample, 200 ng of tryptic peptides were measured in triplicate on a Q Exactive HF (Thermo Fisher Scientific). The mass range for MS1 was set to 350-1400 m/z with a resolution of 60,000 at 200 m/z. HCD fragmentation of the Top10 most abundant precursor ions was performed at 27% NCE. The fragment analysis (MS2) was performed with a resolution of 30,000 at 200 m/z. In addition to the raw files, the dataset contains centroided mzML files and spectrum identification results for peptide identifications performed by Mascot (Perkins et al., 1999), MS-GF+ (Kim et al., 2010) and X!Tandem (Craig and Beavis, 2004) for each separate MS analysis. The corresponding FASTA file containing the protein sequences, a combination of all identification runs performed by PIA (Uszkoreit et al., 2019, 2015), and a peptide and protein quantification performed by OpenMS (Pfeuffer et al., 2017) are also included.
All data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository (Perez-Riverol et al., 2018) with the dataset identifier PXD012986. © 2022 The Author(s). Published by Elsevier Inc.
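
As a rough illustration of how such a ground truth design can be used for benchmarking, the sketch below enumerates the 15 runs (five states, three replicates each) and derives the expected log2 ratios between states for one spiked protein. The state labels, the run naming scheme and the per-state amounts are hypothetical placeholders chosen inside the stated 0.1 to 10 pmol range; they are not the amounts actually used in the dataset, which should be taken from the deposited metadata.

```python
from itertools import combinations, product
from math import log2

N_REPLICATES = 3  # three replicate measurements per state, as stated in the abstract

# HYPOTHETICAL spike-in amounts (pmol) for ONE protein across the five states.
# Placeholder values within the stated 0.1-10 pmol range, NOT the real design.
example_amounts_pmol = {"S1": 0.1, "S2": 0.5, "S3": 1.0, "S4": 5.0, "S5": 10.0}


def run_labels(states, n_replicates):
    """Return one label per LC-MS/MS run (hypothetical naming: <state>_R<replicate>)."""
    return [f"{s}_R{r}" for s, r in product(states, range(1, n_replicates + 1))]


def expected_log2_ratios(amounts):
    """Expected log2(amount_b / amount_a) for every unordered pair of states."""
    return {
        (a, b): log2(amounts[b] / amounts[a])
        for a, b in combinations(amounts, 2)
    }


runs = run_labels(example_amounts_pmol, N_REPLICATES)
ratios = expected_log2_ratios(example_amounts_pmol)

print(f"{len(runs)} runs, {len(ratios)} pairwise state comparisons")
for (a, b), value in ratios.items():
    print(f"{b} vs {a}: expected log2 ratio = {value:.2f}")
```

A quantification workflow could then be scored by comparing its measured log2 ratios for the spiked proteins against these expected values, while the C2C12 background proteins, whose amount is constant, should yield ratios close to zero.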
