Journal
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
Volume 18, Issue 5, Pages 1821-1830
Publisher
IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2019.2961667
Keywords
Biological system modeling; Predictive models; Hazards; Mathematical model; Adaptation models; Computational modeling; Genomics; Cox model; regularization; variable selection; gene expression
Funding
- MOE (Ministry of Education in China) Project of Humanities and Social Sciences [18YJCZH054]
- National Natural Science Foundation of Guangdong [2018A030307033]
- Special Innovation Projects of Universities in Guangdong Province [2018KTSCX205]
- High-level College's Talent Project of Guangdong [2013178]
- Macau Science and Technology Development Funds [0002/2019/APD]
This paper proposes a novel Cox model strategy that combines self-paced learning (SPL) with the SCAD-Net penalty. In simulations and a large-scale experimental analysis, it outperforms traditional models in both prediction and gene selection.
The Cox proportional hazards model is a popular method for studying the relationship between features and survival time. Because genomic data are high-dimensional, existing Cox models trained on one specific dataset often generalize poorly to other independent datasets. In this paper, we propose a novel strategy for the Cox model. The strategy comprises a new learning technique, self-paced learning (SPL), and a new gene selection method, the SCAD-Net penalty. SPL helps build a more accurate predictor through its built-in mechanism of learning from easy samples first and then adaptively incorporating hard samples. The SCAD-Net penalty addresses a limitation of the SCAD method, which has no inherent mechanism for fusing prior graphical information. We combine SPL with the SCAD-Net penalty in the Cox model (SSNC). Simulations show that SSNC outperforms the benchmark in both prediction and gene selection. A large-scale experiment across several cancer datasets shows that SSNC not only achieves higher prediction accuracy but also identifies markers that show satisfactory stability on a separate validation dataset. Demo code for the proposed method is provided in the supplemental file.
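The abstract does not give formulas, so the following is only a minimal sketch of the two standard ingredients it names, not the authors' SSNC implementation. It assumes the textbook SCAD penalty (with the conventional default a = 3.7) and the simplest hard-threshold form of self-paced weighting; the graph-based "Net" term of SCAD-Net is not reproduced because the abstract does not specify it. The function names, the `age` threshold, and the `growth` factor are hypothetical, and the per-sample losses are held fixed here, whereas a real SPL loop would refit the model and recompute them each round.

```python
def scad_penalty(beta, lam, a=3.7):
    """Textbook SCAD penalty for a single coefficient (assumes a > 2, lam > 0)."""
    b = abs(beta)
    if b <= lam:
        # Small coefficients: behaves like the lasso (L1) penalty.
        return lam * b
    if b <= a * lam:
        # Middle range: quadratic transition that tapers the penalty growth.
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


def spl_weights(losses, age):
    """Hard self-paced weights: keep a sample only if its loss is below `age`."""
    return [1 if loss < age else 0 for loss in losses]


def spl_schedule(losses, age, growth, rounds):
    """Illustrative schedule: raise the threshold each round so that harder
    samples are admitted later. Losses are held fixed for simplicity; an
    actual SPL procedure would refit the model and update them every round."""
    history = []
    for _ in range(rounds):
        history.append(spl_weights(losses, age))
        age *= growth
    return history


if __name__ == "__main__":
    # Hypothetical per-sample losses: one easy, one medium, one hard sample.
    sample_losses = [0.1, 0.5, 2.0]
    for step, weights in enumerate(spl_schedule(sample_losses, 0.3, 2.0, 4)):
        print(step, weights)
    print(scad_penalty(0.5, 1.0), scad_penalty(10.0, 1.0))
```

In this toy schedule the easy sample is selected from the first round and the hardest one only once the threshold has grown past its loss, which mirrors the "easy samples first" behavior the abstract describes.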