Article

Simplified and Unified Access to Cancer Proteogenomic Data

Journal

JOURNAL OF PROTEOME RESEARCH
Volume 20, Issue 4, Pages 1902-1910

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jproteome.0c00919

Keywords

data dissemination; data access; cancer; proteomics; genomics; proteogenomics; mass spectrometry; CPTAC; Python; R; reproducibility

Funding

  1. National Cancer Institute (NCI) CPTAC award [U24 CA210972]
  2. Simmons Center for Cancer Research


Comprehensive cancer data sets generated by CPTAC offer potential for advancing cancer research. The authors have created an API that distributes the processed data tables, facilitating data reuse and integration for analysis.
Comprehensive cancer data sets recently generated by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) offer great potential for advancing our understanding of how to combat cancer. These data sets include DNA, RNA, protein, and clinical characterization for tumor and normal samples from large cohorts of many different cancer types. The raw data are publicly available at various Cancer Research Data Commons. However, widespread reuse of these data sets is also facilitated by easy access to the processed quantitative data tables. We have created a data application programming interface (API) to distribute these processed tables, implemented as a Python package called cptac. We implement it such that users who prefer to work in R can easily use our package for data access and then transfer the data into R for analysis. Our package distributes the finalized processed CPTAC data sets in a consistent, up-to-date format. This consistency makes it easy to integrate the data with common graphing, statistical, and machine-learning packages for advanced analysis. Additionally, consistent formatting across all cancer types promotes the investigation of pan-cancer trends. The data API structure of directly streaming data within a programming environment enhances reproducibility. Finally, with the accompanying tutorials, this package provides a novel resource for cancer research education. View the software documentation at https://paynelab.github.io/cptac/. View the GitHub repository at https://github.com/PayneLab/cptac.
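
The abstract's point about consistent formatting can be illustrated with a minimal sketch. It does not call the cptac package and does not reflect that package's actual API. The tables, sample IDs, gene names, and values below are hypothetical toy data, and the function names (`mean_abundance`, `pan_cancer_means`, `table_to_csv`) are invented for illustration. The sketch only assumes that each cancer type's table shares one layout (samples as rows, genes as columns), so the same code can run across cancer types and the result can be written to CSV for loading in R.

```python
import csv
import io
from statistics import mean

# Hypothetical toy tables, NOT real CPTAC data. Each cancer type maps
# sample ID -> {gene: protein abundance}, with the same layout everywhere.
toy_tables = {
    "cancer_type_A": {
        "S1": {"GENE1": 0.5, "GENE2": -1.0},
        "S2": {"GENE1": 1.5, "GENE2": -0.5},
    },
    "cancer_type_B": {
        "S3": {"GENE1": -0.25, "GENE2": 0.75},
        "S4": {"GENE1": 0.25, "GENE2": 0.25},
    },
}


def mean_abundance(table, gene):
    """Average one gene's values over all samples, skipping missing values."""
    values = [row[gene] for row in table.values() if row.get(gene) is not None]
    return mean(values)


def pan_cancer_means(tables, gene):
    """Apply the same summary to every cancer type; possible only because
    all tables share one layout."""
    return {cancer: mean_abundance(table, gene) for cancer, table in tables.items()}


def table_to_csv(table):
    """Serialize one table to CSV text (samples as rows, genes as columns)."""
    genes = sorted({gene for row in table.values() for gene in row})
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["sample_id"] + genes)
    for sample_id, row in table.items():
        writer.writerow([sample_id] + [row.get(gene, "") for gene in genes])
    return buffer.getvalue()


gene1_means = pan_cancer_means(toy_tables, "GENE1")
csv_text = table_to_csv(toy_tables["cancer_type_A"])
```

Writing `csv_text` to a file would give a table that R's `read.csv` can load, which is one plausible way to hand data from Python to R; the package's own recommended transfer route is described in its documentation.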
