Article

Survival analysis with semi-supervised predictive clustering trees

Journal

COMPUTERS IN BIOLOGY AND MEDICINE
Volume 141

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.compbiomed.2021.105001

Keywords

Survival analysis; Predictive clustering trees; Semi-supervised learning; Random forests; Random survival forests

Funding

  1. Slovenian Research Agency (ARRS) [P2-0103, N2-0128, J2-2505, J7-9400]
  2. TAILOR
  3. EU [3M180314, G080118 N]
  4. Internal Funds KU Leuven [952215]
  5. Research Fund Flanders [825619]
  6. Flemish Government (AI Research Program)

Abstract

This article treats the prediction of time-to-event as a multi-target regression task and models censored observations as partially labeled examples. By applying semi-supervised learning, a method with better predictive performance and smaller models is proposed. Additionally, the informative feature selection mechanism of the method is illustrated in the context of predicting survival for amyotrophic lateral sclerosis patients.
Many clinical studies follow patients over time and record the time until the occurrence of an event of interest (e.g., recovery or death). When patients drop out of the study, or when their event has not happened by the time the study ends, the collected dataset is said to contain censored observations. Given the rise of personalized medicine, clinicians are often interested in accurate risk prediction models that predict, for unseen patients, a survival profile, including the expected time until the event. Survival analysis methods are used to detect associations or compare subpopulations of patients in this context. In this article, we propose to cast the time-to-event prediction task as a multi-target regression task, with censored observations modeled as partially labeled examples. We then apply semi-supervised learning to the resulting data representation. More specifically, we use semi-supervised predictive clustering trees and ensembles thereof. Empirical results on eleven real-life datasets demonstrate superior or equivalent predictive performance of the proposed approach compared to three competitor methods. Moreover, the resulting models are smaller than those of random survival forests, another tree ensemble method. Finally, we illustrate the informative feature selection mechanism of our method by interpreting the splits induced by a single tree model when predicting survival for amyotrophic lateral sclerosis patients.
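To make the idea of "censored observations as partially labeled examples" concrete, the sketch below shows one possible multi-target encoding. The abstract does not spell out the exact representation, so everything here is an assumption for illustration: the function name `encode_survival_targets`, the choice of one target per time point on a user-supplied grid, and the 1.0 / 0.0 / missing coding are hypothetical and not taken from the paper.

```python
import math


def encode_survival_targets(times, events, grid):
    """Turn (time, event) pairs into one target per grid time point.

    Assumed coding (illustrative, not the paper's specification):
      1.0  subject is known to be event-free at that time point
      0.0  the event is known to have occurred by that time point
      NaN  status unknown, because the subject was censored earlier
    """
    rows = []
    for time, event in zip(times, events):
        row = []
        for t in grid:
            if t < time or (t == time and not event):
                row.append(1.0)       # still under observation, no event yet
            elif event:
                row.append(0.0)       # observed event at or before t
            else:
                row.append(math.nan)  # censored before t: label is missing
        rows.append(row)
    return rows


# Three subjects: event at t=5, censored at t=3, event at t=8.
targets = encode_survival_targets(
    times=[5, 3, 8],
    events=[True, False, True],
    grid=[2, 4, 6],
)
```

Under this coding the censored subject keeps its known label at the first grid point and has missing labels afterwards, which is the kind of partially labeled example a semi-supervised multi-target learner can consume without discarding the row.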
