Journal
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
Volume 18, Issue 5, Pages 1821-1830
Publisher
IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2019.2961667
Keywords
Biological system modeling; Predictive models; Hazards; Mathematical model; Adaptation models; Computational modeling; Genomics; Cox model; regularization; variable selection; gene expression
Funding
- MOE (Ministry of Education in China) Project of Humanities and Social Sciences [18YJCZH054]
- National Natural Science Foundation of Guangdong [2018A030307033]
- Special Innovation Projects of Universities in Guangdong Province [2018KTSCX205]
- High-level College's Talent Project of Guangdong [2013178]
- Macau Science and Technology Development Funds [0002/2019/APD]
This paper proposes a Cox model strategy that combines self-paced learning (SPL) with the SCAD-Net penalty. In simulations and a large-scale experimental analysis, it outperforms traditional models in both prediction and gene selection.
The Cox proportional hazards model is a popular method for studying the connection between features and survival time. Because genomic data are high-dimensional, existing Cox models trained on any specific dataset often generalize poorly to other independent datasets. In this paper, we propose a novel strategy for the Cox model. The strategy comprises a new learning technique, self-paced learning (SPL), and a new gene selection method, the SCAD-Net penalty. SPL is adopted to help build a more accurate predictor through its built-in mechanism of learning from easy samples first and then adaptively learning from hard samples. The SCAD-Net penalty addresses a limitation of the SCAD method, which has no inherent mechanism for fusing prior graphical information. We combine SPL and the SCAD-Net penalty with the Cox model (SSNC). The simulation shows that SSNC outperforms the benchmark in terms of prediction and gene selection. The analysis of a large-scale experiment across several cancer datasets shows that the SSNC method not only yields higher prediction accuracy but also identifies markers that show satisfactory stability on another validation dataset. The demo code for the proposed method is provided in the supplemental file.
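The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' demo code. It shows the standard SCAD penalty for a single coefficient (with the commonly used default a = 3.7) and the standard hard-threshold form of self-paced sample weighting. The function names, the `age` threshold parameter, and the use of hard rather than soft weights are assumptions made for illustration. The network (graph) term that turns SCAD into SCAD-Net is not specified in the abstract and is omitted.

```python
def scad_penalty(beta, lam, a=3.7):
    """Standard SCAD penalty for one coefficient (assumes lam > 0, a > 2)."""
    b = abs(beta)
    if b <= lam:
        # Small coefficients: behaves like the lasso (linear in |beta|).
        return lam * b
    if b <= a * lam:
        # Middle range: quadratic transition that flattens the penalty.
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


def spl_hard_weights(losses, age):
    """Hard self-paced weights: 1 if a sample's loss is below `age`, else 0.

    `age` is a hypothetical name for the pace threshold; raising it over
    training rounds admits progressively harder samples.
    """
    return [1 if loss < age else 0 for loss in losses]


def spl_schedule(losses, ages):
    """Weights selected at each threshold in `ages` for fixed losses."""
    return [spl_hard_weights(losses, age) for age in ages]
```

In an actual self-paced loop the per-sample losses would be recomputed after each model refit, with the threshold increased between rounds. Here the losses are held fixed only to show how the selected set of samples grows.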