Article

ppx: Programmatic Access to Proteomics Data Repositories

Journal

Journal of Proteome Research
Volume 20, Issue 9, Pages 4621-4624

Publisher

American Chemical Society
DOI: 10.1021/acs.jproteome.1c00454

Keywords

proteomics; mass spectrometry; bioinformatics; reproducibility; repository; data sharing; data dissemination; data access; Python; FAIR

Funding

  1. National Institutes of Health [T32HG000035, P41GM103533, R01GM121818]
  2. Research Foundation-Flanders [FWO 12W0418N]

The volume of proteomics and mass spectrometry data available in public repositories continues to grow at a rapid pace as more researchers embrace open science practices. Open access to the data behind scientific discoveries has become critical to validate published findings and develop new computational tools. Here, we present ppx, a Python package that provides easy, programmatic access to the data stored in ProteomeXchange repositories, such as PRIDE and MassIVE. The ppx package can be used as either a command line tool or a Python package to retrieve the files and metadata associated with a project when provided its identifier. To demonstrate how ppx enhances reproducible research, we used ppx within a Snakemake workflow to reanalyze a published data set with the open modification search tool ANN-SoLo and compared our reanalysis to the original results. We show that ppx readily integrates into workflows, and our reanalysis produced results consistent with the original analysis. We envision that ppx will be a valuable tool for creating reproducible analyses, providing tool developers easy access to data for development, testing, and benchmarking, and enabling the use of mass spectrometry data in data-intensive analyses. The ppx package is freely available and open source under the MIT license at https://github.com/wfondrie/ppx.
