4.7 Article Proceedings Paper

A Novel Cox Proportional Hazards Model for High-Dimensional Genomic Data in Cancer Prognosis

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2019.2961667

Keywords

Biological system modeling; Predictive models; Hazards; Mathematical model; Adaptation models; Computational modeling; Genomics; Cox model; regularization; variable selection; gene expression

Funding

  1. MOE (Ministry of Education in China) Project of Humanities and Social Sciences [18YJCZH054]
  2. National Natural Science Foundation of Guangdong [2018A030307033]
  3. Special Innovation Projects of Universities in Guangdong Province [2018KTSCX205]
  4. High-level College's Talent Project of Guangdong [2013178]
  5. Macau Science and Technology Development Funds [0002/2019/APD]

This paper proposes a novel Cox model strategy that combines self-paced learning (SPL) with the SCAD-Net penalty. In simulations and a large-scale experimental analysis, it outperforms traditional models in both prediction and gene selection.
The Cox proportional hazards model is a popular method for studying the relationship between features and survival time. Because genomic data are high-dimensional, existing Cox models trained on one dataset often generalize poorly to other independent datasets. In this paper, we propose a novel strategy for the Cox model. The strategy comprises a new learning technique, self-paced learning (SPL), and a new gene selection method, the SCAD-Net penalty. SPL helps build a more accurate predictor through its built-in mechanism of learning from easy samples first and then adaptively incorporating hard samples. The SCAD-Net penalty addresses the SCAD method's lack of an inherent mechanism for fusing prior graphical information. We combine SPL and the SCAD-Net penalty with the Cox model (SSNC). Simulations show that SSNC outperforms the benchmarks in prediction and gene selection. A large-scale experiment across several cancer datasets shows that SSNC not only achieves higher prediction accuracy but also identifies markers that show satisfactory stability on a separate validation dataset. Demo code for the proposed method is provided in the supplemental file.
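
The three ingredients named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' supplemental demo code. It assumes the standard SCAD penalty with the conventional default a = 3.7, a Cox partial likelihood without tie handling, and the simplest "hard" SPL weighting, in which a sample is kept only when its loss falls below an age threshold. The function names, the per-sample decomposition of the Cox loss, and the threshold values are all hypothetical, and the network ("Net") term that fuses prior graph information is omitted.

```python
import math


def scad_penalty(beta, lam, a=3.7):
    """Standard SCAD penalty for one coefficient (a = 3.7 is the usual default)."""
    b = abs(beta)
    if b <= lam:
        # Small coefficients: behaves like the lasso (linear in |beta|).
        return lam * b
    if b <= a * lam:
        # Intermediate coefficients: quadratic transition region.
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


def cox_sample_losses(times, events, risk_scores):
    """Per-sample negative log partial likelihood terms (assumed decomposition).

    Censored samples (event == 0) contribute no term of their own, so their
    loss is 0 here. Ties in survival time are not handled specially.
    """
    n = len(times)
    losses = []
    for i in range(n):
        if not events[i]:
            losses.append(0.0)
            continue
        # Risk set: every sample still under observation at time times[i].
        denominator = sum(
            math.exp(risk_scores[j]) for j in range(n) if times[j] >= times[i]
        )
        losses.append(math.log(denominator) - risk_scores[i])
    return losses


def spl_hard_weights(losses, age):
    """Hard self-paced weights: 1 for 'easy' samples (loss < age), else 0."""
    return [1 if loss < age else 0 for loss in losses]


# Toy usage: three patients, all with the same risk score.
losses = cox_sample_losses(times=[1, 2, 3], events=[1, 0, 1], risk_scores=[0.0, 0.0, 0.0])
early_weights = spl_hard_weights(losses, age=0.5)  # → [0, 1, 1]
late_weights = spl_hard_weights(losses, age=2.0)   # → [1, 1, 1]
```

Raising the age threshold over successive training rounds admits the harder samples gradually, which is the "easy first, hard later" behavior the abstract describes. A full implementation would alternate between updating these weights and refitting the penalized Cox coefficients.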
