Article

The ProteomeXchange consortium at 10 years: 2023 update

Related references

Note: Only part of the references are listed.
Article Biochemical Research Methods

Integrated View of Baseline Protein Expression in Human Tissues

Ananth Prakash et al.

Summary: The availability of proteomics datasets, especially in the PRIDE database, has significantly increased in recent years, providing an opportunity for combined analyses of datasets to obtain organism-wide protein abundance data. In this study, we reanalyzed 24 public proteomics datasets to assess baseline protein abundance in 31 organs of healthy individuals. We compared protein abundances between organs, studied protein distribution, and performed gene ontology and pathway-enrichment analyses. The results are integrated into the Expression Atlas resource to enhance accessibility for life scientists.

JOURNAL OF PROTEOME RESEARCH (2023)

Article Biochemical Research Methods

Is DIA proteomics data FAIR? Current data sharing practices, available bioinformatics infrastructure and recommendations for the future

Andrew R. Jones et al.

Summary: DIA proteomics techniques have made significant progress in recent years, but there is still room for improvement in terms of FAIR data principles. To enhance the current situation for DIA data, recommendations include developing an open data standard for spectral libraries, mandating the availability of spectral libraries in ProteomeXchange resources, improving support for DIA data in data standards, and enhancing support for DIA datasets in ProteomeXchange resources.

PROTEOMICS (2023)

Article Biochemistry & Molecular Biology

The UCSC Genome Browser database: 2022 update

Brian T. Lee et al.

Summary: The UCSC Genome Browser is a graphical viewer for exploring genome annotations, providing integrated tools for visualizing, comparing, analyzing, and sharing genomic datasets. Updates this year include new public hub assemblies for new organisms, updated clinical tracks, a new Track Sets feature, enhanced variant displays, and a tool for placing new SARS-CoV-2 genomes in a global phylogenetic tree. Other improvements focus on usability features such as informative mouseover displays and new fonts.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemistry & Molecular Biology

Expression Atlas update: gene and protein expression in multiple species

Pablo Moreno et al.

Summary: EMBL-EBI Expression Atlas is a knowledge base that integrates data from over 4500 expression studies across different species, enabling researchers to explore the expression patterns of genes or proteins under various biological conditions. The data are curated by experts, re-analyzed, and visualized in an easily accessible form, aiming to reproduce the original conclusions of the experiments.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemistry & Molecular Biology

ProteomicsDB: toward a FAIR open-source resource for life-science research

Ludwig Lautenbacher et al.

Summary: ProteomicsDB is a multi-omics and multi-organism resource for life science research, with ongoing efforts to improve the findability, accessibility, interoperability, and reusability of its data. A new API and UI have been released, along with content expansions into different areas of human biology and a newly supported organism.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemistry & Molecular Biology

DNA Data Bank of Japan (DDBJ) update report 2021

Toshihisa Okido et al.

Summary: The Bioinformation and DDBJ (DNA Data Bank of Japan) Center operates archival databases for nucleotide sequences, study information, and other genomic data, and provides related services for life science researchers.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemistry & Molecular Biology

The PRIDE database resources in 2022: a hub for mass spectrometry-based proteomics evidences

Yasset Perez-Riverol et al.

Summary: PRIDE is the largest data repository of mass spectrometry-based proteomics data in the world, with around 500 datasets submitted per month. In addition to continuous improvements in data pipelines and infrastructure, PRIDE has developed the Spectra Archive and MAGE-TAB file format to enhance sample metadata annotation and access to mass spectra.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemistry & Molecular Biology

The European Genome-phenome Archive in 2021

Mallory Ann Freeberg et al.

Summary: The European Genome-phenome Archive (EGA) is a resource for secure archiving of genetic, phenotypic, and clinical data, promoting data reuse, reproducibility, and accelerating biomedical research. EGA operates a distributed data access model, providing strong data protection control.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemistry & Molecular Biology

iProX in 2021: connecting proteomics data sharing with big data

Tao Chen et al.

Summary: The iProX integrated proteome resource has been greatly improved with an up-to-date big data platform to support large-scale data storage, efficient querying, and reanalysis, meeting the demands of the rapidly growing field of proteomics.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemistry & Molecular Biology

Ensembl 2022

Fiona Cunningham et al.

Summary: Ensembl is unique in its flexible infrastructure for access to genomic data and annotation. Recent work has focused on expediting annotation of new assemblies via the Ensembl Rapid Release platform, resulting in the greatest annual number of newly annotated genomes released. The Ensembl team also developed a new method for comparative analyses and annotated non-vertebrate eukaryotes for the first time.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemical Research Methods

Proteomics Standards Initiative's ProForma 2.0: Unifying the Encoding of Proteoforms and Peptidoforms

Richard D. LeDuc et al.

Summary: It is important for the proteomics community to have a standardized manner to represent all possible variations of a protein or peptide primary sequence. The ProForma 2.0 notation aims to unify the representation of proteoforms and peptidoforms, supporting different proteomics approaches and allowing the encoding of highly modified proteins and peptides using a human- and machine-readable string.

JOURNAL OF PROTEOME RESEARCH (2022)

Article Biochemical Research Methods

Method for Independent Estimation of the False Localization Rate for Phosphoproteomics

Kerry A. Ramsbottom et al.

Summary: Phosphoproteomic methods are used to identify and quantify phosphorylation sites on proteins. This study introduces the concept of scoring modifications on a decoy amino acid to estimate the global false localization rate (FLR) independently.

JOURNAL OF PROTEOME RESEARCH (2022)

Article Multidisciplinary Sciences

The PeptideAtlas of a widely cultivated fish Labeo rohita: A resource for the Aquaculture Community

Mehar Un Nissa et al.

Summary: Rohu is an important fish species in aquaculture, and integrative omics research provides a platform for understanding its biology. Using mass spectrometry-based proteomics, an open-source PeptideAtlas for Rohu has been developed, which has significant implications for aquaculture research and for addressing food security challenges.

SCIENTIFIC DATA (2022)

Article Multidisciplinary Sciences

Unifying the identification of biomedical entities with the Bioregistry

Charles Tapley Hoyt et al.

Summary: The standardized identification of biomedical entities is important for interoperability and data integration in the life sciences. The Bioregistry is an integrative and open metaregistry that expands upon existing registries to address the evolving needs of researchers. By leveraging public infrastructure and automation, and employing an open code and open data governance model, the Bioregistry promotes interoperability and reuse of data and scientific literature.

SCIENTIFIC DATA (2022)

Letter Biotechnology & Applied Microbiology

Standardized annotation of translated open reading frames

Jonathan M. Mudge et al.

NATURE BIOTECHNOLOGY (2022)

Article Biochemical Research Methods

Integrated view and comparative analysis of baseline protein expression in mouse and rat tissues

Shengbo Wang et al.

Summary: By reanalyzing public proteomics datasets, the baseline protein abundance in mouse and rat tissues was assessed and compared across different organs and species. The findings were integrated into the Expression Atlas resource for dissemination.

PLOS COMPUTATIONAL BIOLOGY (2022)

Article Multidisciplinary Sciences

Implementing the reuse of public DIA proteomics datasets: from the PRIDE database to Expression Atlas

Mathias Walzer et al.

Summary: This study introduces a reanalysis pipeline for public SWATH-MS datasets, comprising automated workflows and statistical analysis. The robustness of the pipeline was validated by reanalysing 10 public DIA datasets, and the final results were integrated into the Expression Atlas resource.

SCIENTIFIC DATA (2022)

Article Biochemical Research Methods

Identifiers.org: Compact Identifier services in the cloud

Manuel Bernal-Llinares et al.

Summary: Identifiers.org is a key tool for the annotation and cross-referencing of life science data, offering services to construct and resolve globally unique Compact Identifiers (CIDs). The services are continuously improved to support growing demand, and new infrastructure has been deployed in a commercial cloud environment to provide high availability and low latency.

BIOINFORMATICS (2021)

Article Biochemistry & Molecular Biology

UniProt: the universal protein knowledgebase in 2021

Alex Bateman et al.

Summary: The UniProt Knowledgebase aims to provide users with a comprehensive, high-quality set of protein sequences annotated with functional information. Updates over the past two years have increased the number of sequences to approximately 190 million, with new methods to assess proteome completeness and quality. UniProtKB has responded to the COVID-19 pandemic by expertly curating relevant entries and making them rapidly available through a dedicated portal.

NUCLEIC ACIDS RESEARCH (2021)

Article Biochemical Research Methods

Data Management of Sensitive Human Proteomics Data: Current Practices, Recommendations, and Perspectives for the Future

Nuno Bandeira et al.

Summary: The increase in clinical proteomics studies has raised concerns about managing and disseminating potentially sensitive human proteomics data. Balancing data privacy with efficient use and reuse of research efforts through sharing clinical proteomics data will require development efforts at different levels including bioinformatics infrastructure, policymaking, and mechanisms of oversight.

MOLECULAR & CELLULAR PROTEOMICS (2021)

Editorial Material Multidisciplinary Sciences

The growing need for controlled data access models in clinical proteomics and metabolomics

Thomas M. Keane et al.

Summary: This commentary discusses the current best practices and future perspectives for responsible handling of clinical proteomics and metabolomics data, emphasizing the lack of bioinformatics resources available to manage access to sensitive human datasets in clinical studies.

NATURE COMMUNICATIONS (2021)

Review Multidisciplinary Sciences

A proteomics sample metadata representation for multiomics integration and big data analysis

Chengxin Dai et al.

Summary: The authors proposed a format and software pipeline for presenting and validating metadata of proteomics datasets, integrating them into ProteomeXchange repositories. They implemented MAGE-TAB-Proteomics in a crowdsourcing project to manually curate over 200 public datasets, aiming to improve reproducibility and facilitate reanalysis and integration of public proteomics datasets.

NATURE COMMUNICATIONS (2021)

Article Biochemical Research Methods

A wide-ranging Pseudomonas aeruginosa PeptideAtlas build: A useful proteomic resource for a versatile pathogen

J. A. Reales-Calderon et al.

Summary: Pseudomonas aeruginosa is a significant opportunistic human pathogen, and the study of its proteome is crucial for uncovering virulence factors and antibiotic resistance mechanisms. The construction of proteomic resources like PeptideAtlas enables targeted proteomics studies, system-wide observations, and cross-species observational studies.

JOURNAL OF PROTEOMICS (2021)

Article Multidisciplinary Sciences

An integrated landscape of protein expression in human cancer

Andrew F. Jarnuczak et al.

Summary: Utilizing 11 proteomics datasets from the PRIDE database, a reference expression map was constructed for 191 cancer cell lines and 246 clinical tumor samples, revealing unique peptides in tumor samples and highlighting the correlation between baseline expression in cell lines and tumors. Integration of proteomics and transcriptomics data showed a median correlation of 0.58 across cell lines, indicating that mRNA levels are often a poor predictor of changes in protein abundance. This study represents the first meta-analysis focusing on cancer-related public proteomics datasets, emphasizing the shortcomings and limitations of such studies.

SCIENTIFIC DATA (2021)

Article Biochemical Research Methods

Universal Spectrum Identifier for mass spectra

Eric W. Deutsch et al.

Summary: The Universal Spectrum Identifier (USI) provides a standardized mechanism for encoding virtual paths to mass spectra in public repositories, enabling greater transparency and traceability of spectral evidence. Over 1 billion USI identifications from more than 3 billion spectra are already available through ProteomeXchange repositories, supporting the findings of mass spectrometry proteomics studies.

NATURE METHODS (2021)

Article Biochemistry & Molecular Biology

Artificial intelligence for proteomics and biomarker discovery

Matthias Mann et al.

Summary: The rapid growth of biomedical data generation and computational capabilities has led to advancements in utilizing machine learning and deep learning in proteomics for predictive modeling and biomarker discovery. These technologies are essential for improving analytical workflows and integrating multi-omics data, while also raising concerns about model transparency, explainability, and data privacy when deploying MS-based biomarkers in clinical settings.

CELL SYSTEMS (2021)

Article Biochemistry & Molecular Biology

The Arabidopsis PeptideAtlas: Harnessing worldwide proteomics data to create a comprehensive community proteomics resource

Klaas J. van Wijk et al.

Summary: The Arabidopsis PeptideAtlas is a resource that addresses key questions about the Arabidopsis thaliana proteome by analyzing published mass spectrometry data and providing reliable information about specific proteins. It identifies additional proteins and isoforms not currently in Araport11, evaluates physicochemical protein properties, and integrates with community resources for global access. The resource allows for targeted identification of unobserved proteins and serves as a platform for future incorporation of millions more MS/MS data.

PLANT CELL (2021)

Article Biochemistry & Molecular Biology

OpenProt 2021: deeper functional annotation of the coding potential of eukaryotic genomes

Marie A. Brunet et al.

Summary: OpenProt is the first proteogenomic resource that supports a polycistronic annotation model for eukaryotic genomes, providing deeper annotation of open reading frames (ORFs) with supporting evidence from experimental data. The platform re-analyzes ribosome profiling and mass spectrometry datasets to report non-AUG initiation starts and control the unicity of detected peptides. In addition, detectability statistics and protein relationships are now reported for each protein, and a data analysis platform is offered for users to submit their datasets for analysis and access the results.

NUCLEIC ACIDS RESEARCH (2021)

Review Spectroscopy

The Skyline ecosystem: Informatics for quantitative mass spectrometry proteomics

Lindsay K. Pino et al.

MASS SPECTROMETRY REVIEWS (2020)

Article Biochemistry & Molecular Biology

MatrisomeDB: the ECM-protein knowledge database

Xinhao Shao et al.

NUCLEIC ACIDS RESEARCH (2020)

Article Biotechnology & Applied Microbiology

The functional landscape of the human phosphoproteome

David Ochoa et al.

NATURE BIOTECHNOLOGY (2020)

Article Biochemical Research Methods

Scop3P: A Comprehensive Resource of Human Phosphosites within Their Full Context

Pathmanaban Ramasamy et al.

JOURNAL OF PROTEOME RESEARCH (2020)

Article Biochemical Research Methods

MassIVE.quant: a community resource of quantitative mass spectrometry-based proteomics datasets

Meena Choi et al.

NATURE METHODS (2020)

Article Multidisciplinary Sciences

DIALib-QC an assessment tool for spectral libraries in data-independent acquisition proteomics

Mukul K. Midha et al.

NATURE COMMUNICATIONS (2020)

Article Multidisciplinary Sciences

A high-stringency blueprint of the human proteome

Subash Adhikari et al.

NATURE COMMUNICATIONS (2020)

Article Biochemistry & Molecular Biology

The jPOST environment: an integrated proteomics data repository and database

Yuki Moriya et al.

NUCLEIC ACIDS RESEARCH (2019)

Article Biotechnology & Applied Microbiology

Co-regulation map of the human proteome enables identification of protein functions

Georg Kustatscher et al.

NATURE BIOTECHNOLOGY (2019)

Article Multidisciplinary Sciences

Quantifying the impact of public omics data

Yasset Perez-Riverol et al.

NATURE COMMUNICATIONS (2019)

Article Biochemistry & Molecular Biology

LNCipedia 5: towards a reference set of human long non-coding RNAs

Pieter-Jan Volders et al.

NUCLEIC ACIDS RESEARCH (2019)

Article Biochemistry & Molecular Biology

The SysteMHC Atlas project

Wenguang Shao et al.

NUCLEIC ACIDS RESEARCH (2018)

Article Biochemical Research Methods

Panorama Public: A Public Repository for Quantitative Data Sets Processed in Skyline

Vagisha Sharma et al.

MOLECULAR & CELLULAR PROTEOMICS (2018)

Article Biochemistry & Molecular Biology

Assembling the Community-Scale Discoverable Human Proteome

Mingxun Wang et al.

CELL SYSTEMS (2018)

Article Biochemistry & Molecular Biology

jPOSTrepo: an international standard data repository for proteomes

Shujiro Okuda et al.

NUCLEIC ACIDS RESEARCH (2017)

Article Biochemistry & Molecular Biology

The ProteomeXchange consortium in 2017: supporting the cultural change in proteomics public data deposition

Eric W. Deutsch et al.

NUCLEIC ACIDS RESEARCH (2017)

Review Biochemical Research Methods

Proteomics Standards Initiative: Fifteen Years of Progress and Future Work

Eric W. Deutsch et al.

JOURNAL OF PROTEOME RESEARCH (2017)

Article Biochemical Research Methods

The mzIdentML Data Standard Version 1.2, Supporting Advances in Proteome Informatics

Juan Antonio Vizcaino et al.

MOLECULAR & CELLULAR PROTEOMICS (2017)

Article Biochemistry & Molecular Biology

sORFs.org: a repository of small ORFs identified by ribosome profiling

Volodimir Olexiouk et al.

NUCLEIC ACIDS RESEARCH (2016)

Article Multidisciplinary Sciences

Comment: The FAIR Guiding Principles for scientific data management and stewardship

Mark D. Wilkinson et al.

SCIENTIFIC DATA (2016)

Review Computer Science, Information Systems

Development of data representation standards by the human proteome organization proteomics standards initiative

Eric W. Deutsch et al.

JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION (2015)

Letter Biotechnology & Applied Microbiology

ProteomeXchange provides globally coordinated proteomics data submission and dissemination

Juan A. Vizcaino et al.

NATURE BIOTECHNOLOGY (2014)

Article Biochemistry & Molecular Biology

NCBI's Database of Genotypes and Phenotypes: dbGaP

Kimberly A. Tryka et al.

NUCLEIC ACIDS RESEARCH (2014)

Article Biochemical Research Methods

Fast Multi-blind Modification Search through Tandem Mass Spectrometry

Seungjin Na et al.

MOLECULAR & CELLULAR PROTEOMICS (2012)

Article Biochemical Research Methods

PASSEL: The PeptideAtlas SRM experiment library

Terry Farrah et al.

PROTEOMICS (2012)

Article Biochemical Research Methods

mzML-a Community Standard for Mass Spectrometry Data

Lennart Martens et al.

MOLECULAR & CELLULAR PROTEOMICS (2011)

Review Biochemistry & Molecular Biology

PeptideAtlas: a resource for target selection for emerging targeted proteomics workflows

Eric W. Deutsch et al.

EMBO REPORTS (2008)

Article Biochemical Research Methods

Open source system for analyzing, validating, and storing protein identification data

R Craig et al.

JOURNAL OF PROTEOME RESEARCH (2004)