4.6 Article

Gene expression profiling of 1200 pancreatic ductal adenocarcinoma reveals novel subtypes

Journal

BMC CANCER
Volume 18

Publisher

BMC
DOI: 10.1186/s12885-018-4546-8

Keywords

Pancreatic ductal adenocarcinoma; Heterogeneity; Biclustering; Subtype; Deep learning; Biomarkers

Funding

  1. Hong Kong Research Grants Council [CityU 11214814, C1007-15G]
  2. City University of Hong Kong [7004862]

Background: Pancreatic ductal adenocarcinoma (PDAC) is the fourth leading cause of cancer-related death in the world, with a five-year survival rate of less than 5%. Not all PDAC are the same: heterogeneity between tumors poses a great challenge to personalized treatment of PDAC.

Methods: To dissect the molecular heterogeneity of PDAC, we performed a retrospective meta-analysis of whole-transcriptome data from more than 1200 PDAC patients. Subtypes were identified using a non-negative matrix factorization (NMF) biclustering method. We used gene set enrichment analysis (GSEA) and survival analysis for the molecular and clinical characterization of the identified subtypes, respectively.

Results: Six molecularly and clinically distinct subtypes of PDAC, L1-L6, were identified and grouped into tumor-specific (L1, L2 and L6) and stroma-specific (L3, L4 and L5) subtypes. Among the tumor-specific subtypes, L1 (~22%) is enriched for carbohydrate metabolism-related gene sets and has intermediate survival. L2 (~22%) has the worst clinical outcomes and is enriched for cell proliferation-related gene sets. About 23% of patients can be classified into L6, which has intermediate survival and is enriched for lipid and protein metabolism-related gene sets. The stroma-specific subtypes may contain high non-epithelial content such as collagen, immune and islet cells, respectively. For instance, L3 (~12%) has poor survival and is enriched for collagen-associated gene sets. L4 (~14%) is enriched for various immune-related gene sets and has relatively good survival. L5 (~7%) has good clinical outcomes and is enriched for neurotransmitter and insulin secretion-related gene sets. We also identified 160 subtype-specific markers and built a deep learning-based classifier for PDAC. Applying our classification system to validation datasets, we observed very similar molecular and clinical characteristics between subtypes.

Conclusions: Our study investigates the largest cohort of PDAC gene expression profiles so far, which greatly increased the statistical power and provided more robust results. We identified six molecularly and clinically distinct subtypes that describe a more complete picture of PDAC heterogeneity. The 160 subtype-specific markers and the deep learning-based classification system may be used to better stratify PDAC patients for personalized treatments.
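
The clinical characterization of subtypes described in the abstract rests on survival analysis. The abstract does not name the estimator used, so the following is only a minimal sketch assuming a Kaplan-Meier estimate per subtype. The function names (`kaplan_meier`, `median_survival`) and the follow-up values are hypothetical and are not taken from the study.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time per patient (e.g. months)
    events -- 1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # every patient exits the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached within follow-up


# Hypothetical toy data (NOT from the paper): (times in months, event flags)
toy = {
    "L2": ([3, 5, 5, 8, 12], [1, 1, 1, 1, 0]),
    "L5": ([6, 10, 15, 20, 24], [1, 0, 1, 0, 0]),
}

for subtype, (times, events) in toy.items():
    curve = kaplan_meier(times, events)
    print(subtype, [(t, round(s, 3)) for t, s in curve], median_survival(curve))
```

In practice, curves like these would be compared across subtypes with a formal test (for example a log-rank test) rather than by inspecting medians alone.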
