Journal
COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL
Volume 20, Pages 496-507
Publisher
ELSEVIER
DOI: 10.1016/j.csbj.2022.01.004
Keywords
Proportional hazard assumption; Cox regression; Transcriptome; TCGA; Pan-cancer
Funding
- National Natural Science Foundation of China [81773236, 81800429, 81972852]
- Key Research & Development Project of Hubei Province [2020BCA069]
- Natural Science Foundation of Hubei Province [2020CFB612]
- Health Commission of Hubei Province Medical Leading Talent Project
- Young and Middle-Aged Medical Backbone Talents of Wuhan [WHQG201902]
- Application Foundation Frontier Project of Wuhan [2020020601012221]
- Zhongnan Hospital of Wuhan University Science, Technology and Innovation Seed Fund [znpy2019001, znpy2019048, ZNJC201922]
- Medical Sci-Tech Innovation Platform of Zhongnan Hospital, Wuhan University [PTXM2019026]
- Chinese Society of Clinical Oncology TopAlliance Tumor Immune Research Fund [Y-JS2019-036]
This study comprehensively investigates the proportional hazard assumption in transcriptomic data and finds that non-proportional hazards cannot be ignored. Introducing time interaction terms into the Cox proportional hazard regression model improves the performance and interpretability of models for transcriptomic data with non-proportional hazards.
Background: The Cox proportional hazard regression (CPH) model relies on the proportional hazard (PH) assumption: the hazard ratio associated with each variable is constant over time. CPH has been widely used to identify prognostic markers in the transcriptome. However, a comprehensive investigation of the PH assumption in transcriptomic data has been lacking.
Results: Whole-transcriptome data from 9,056 patients in 32 cohorts of The Cancer Genome Atlas (TCGA) and from 3 lung cancer cohorts in the Gene Expression Omnibus were collected, and a separate CPH model was fitted to overall survival for each gene. On average, 8.5% of gene CPH models violated the PH assumption across the TCGA pan-cancer cohorts. In gene interaction networks, both hub and non-hub genes were likely to have non-proportional hazards in CPH models. Violations of the PH assumption for the same gene models were not consistent across 5 non-small cell lung cancer datasets (all kappa coefficients < 0.2), indicating that the non-proportionality of gene CPH models depended on the dataset. Furthermore, introducing log(t) or sqrt(t) time functions into CPH improved the fit of gene models to overall survival in most tumors. The time-dependent CPH changed the significance of the log hazard ratio for 31.9% of gene variables.
Conclusions: Our analysis indicates that non-proportional hazards should not be ignored in transcriptomic data. Introducing a time interaction term improved the performance and interpretability of CPH models for transcriptomic data with non-proportional hazards.
(C) 2022 The Author(s). Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology.
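To make the time interaction idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual extended Cox form h(t | x) = h0(t) * exp(beta * x + gamma * x * g(t)), where g is log(t) or sqrt(t) as named in the abstract. The symbols beta and gamma, the function names, the coefficient values, and the time units are all hypothetical and chosen only for illustration. Under this form the log hazard ratio per unit of gene expression is beta + gamma * g(t), so it is constant in time only when gamma = 0, which is the PH case.

```python
import math
from typing import Callable


def log_hazard_ratio(beta: float, gamma: float, t: float,
                     g: Callable[[float], float] = math.log) -> float:
    """Log hazard ratio per unit of expression at time t.

    Assumed model (illustrative, not taken from the paper's code):
        h(t | x) = h0(t) * exp(beta * x + gamma * x * g(t))
    so the log hazard ratio for a one-unit increase in x is
        beta + gamma * g(t).
    """
    if t <= 0:
        # Survival times are assumed strictly positive so that log(t) is defined.
        raise ValueError("t must be positive")
    return beta + gamma * g(t)


def hazard_ratio(beta: float, gamma: float, t: float,
                 g: Callable[[float], float] = math.log) -> float:
    """Hazard ratio at time t, i.e. exp of the log hazard ratio."""
    return math.exp(log_hazard_ratio(beta, gamma, t, g))


if __name__ == "__main__":
    # Hypothetical coefficients for a single gene; not estimates from the study.
    beta, gamma = 0.40, -0.25
    for t in (0.5, 1.0, 2.0, 5.0):  # time units are arbitrary here
        hr_log = hazard_ratio(beta, gamma, t, math.log)
        hr_sqrt = hazard_ratio(beta, gamma, t, math.sqrt)
        print(f"t={t:>4}: HR with log(t)={hr_log:.3f}, HR with sqrt(t)={hr_sqrt:.3f}")
```

With g = log, the coefficient beta is the log hazard ratio at t = 1, and a nonzero gamma means that a gene's apparent effect can weaken, strengthen, or change sign over follow-up. This is one way to read the abstract's finding that the time-dependent CPH changed the significance of the log hazard ratio for many genes.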