Article

Violations of proportional hazard assumption in Cox regression model of transcriptomic data in TCGA pan-cancer cohorts

Journal

Publisher

ELSEVIER
DOI: 10.1016/j.csbj.2022.01.004

Keywords

Proportional hazard assumption; Cox regression; Transcriptome; TCGA; Pan-cancer

Funding

  1. National Natural Science Foundation of China [81773236, 81800429, 81972852]
  2. Key Research & Development Project of Hubei Province [2020BCA069]
  3. Natural Science Foundation of Hubei Province [2020CFB612]
  4. Health Commission of Hubei Province Medical Leading Talent Project
  5. Young and Middle-Aged Medical Backbone Talents of Wuhan [WHQG201902]
  6. Application Foundation Frontier Project of Wuhan [2020020601012221]
  7. Zhongnan Hospital of Wuhan University Science, Technology and Innovation Seed Fund [znpy2019001, znpy2019048, ZNJC201922]
  8. Medical Sci-Tech Innovation Platform of Zhongnan Hospital, Wuhan University [PTXM2019026]
  9. Chinese Society of Clinical Oncology TopAlliance Tumor Immune Research Fund [Y-JS2019-036]


This study comprehensively investigates the proportional hazard assumption in transcriptomic data and finds that non-proportional hazards cannot be ignored. Introducing time interaction terms into the Cox proportional hazard regression model improves model performance and makes non-proportional hazards in transcriptomic data easier to interpret.
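As a sketch of what a time interaction term means in this setting, assuming the standard extended Cox formulation for a single gene with expression level $x$ (the symbols $\beta$, $\gamma$ and $g$ are notation chosen for this illustration and are not taken from the paper):

```latex
% Extended Cox model with a time interaction term (illustrative notation)
h(t \mid x) = h_0(t)\,\exp\bigl(\beta x + \gamma\, x\, g(t)\bigr),
\qquad g(t) \in \{\log t,\ \sqrt{t}\}

% Hazard ratio per unit increase in x, which now depends on time
\mathrm{HR}(t) = \exp\bigl(\beta + \gamma\, g(t)\bigr)
```

When $\gamma = 0$ the hazard ratio reduces to the constant $\exp(\beta)$ and the proportional hazard assumption holds; a nonzero $\gamma$ lets the effect of the gene strengthen or weaken over follow-up time.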
Background: The Cox proportional hazard regression (CPH) model relies on the proportional hazard (PH) assumption: the hazard ratio associated with each variable is constant over time. CPH has been widely used to identify prognostic markers in the transcriptome. However, a comprehensive investigation of the PH assumption in transcriptomic data has been lacking.

Results: Whole-transcriptome data from 9,056 patients in 32 cohorts of The Cancer Genome Atlas (TCGA) and from 3 lung cancer cohorts in the Gene Expression Omnibus were collected, and a separate CPH model was fitted to overall survival for each gene. On average, 8.5% of gene CPH models violated the PH assumption in the TCGA pan-cancer cohorts. In the gene interaction networks, both hub and non-hub genes were likely to have non-proportional hazards in CPH models. Violations of the PH assumption for the same gene models were not consistent across 5 non-small cell lung cancer datasets (all kappa coefficients < 0.2), indicating that the non-proportionality of gene CPH models depended on the dataset. Furthermore, introducing log(t) or sqrt(t) time functions into CPH improved the fit of gene models to overall survival in most tumors. The time-dependent CPH changed the significance of the log hazard ratio for 31.9% of gene variables.

Conclusions: Our analysis showed that non-proportional hazards should not be ignored in transcriptomic data. Introducing a time interaction term improved the performance and interpretability of CPH models for transcriptomic data with non-proportional hazards.

(C) 2022 The Author(s). Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology.
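The agreement check behind the kappa coefficients can be illustrated with a minimal sketch. It assumes Cohen's kappa computed pairwise between datasets on binary "violates PH" flags per gene; the abstract does not state which kappa variant was used, and the function name and the example flags below are hypothetical.

```python
import math


def cohen_kappa(flags_a, flags_b):
    """Cohen's kappa for two equal-length sequences of binary flags.

    Each flag marks whether a gene's Cox model violated the PH
    assumption (1) or not (0) in one dataset. This is an illustrative
    helper, not the paper's implementation.
    """
    if len(flags_a) != len(flags_b) or len(flags_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(flags_a)
    # Observed agreement: share of genes with the same flag in both datasets.
    observed = sum(a == b for a, b in zip(flags_a, flags_b)) / n
    # Agreement expected by chance, from each dataset's violation rate.
    p_a = sum(flags_a) / n
    p_b = sum(flags_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if math.isclose(expected, 1.0):
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (observed - expected) / (1 - expected)


# Hypothetical flags for the same 10 genes in two datasets.
dataset_1 = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
dataset_2 = [0, 0, 1, 0, 1, 0, 0, 0, 0, 0]
kappa = cohen_kappa(dataset_1, dataset_2)
```

Values near 0 mean that two datasets agree on which genes violate the PH assumption no more often than chance would predict, which is the pattern the abstract reports for the lung cancer datasets.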
