4.7 Article

A Bayesian Perspective on Early Stage Event Prediction in Longitudinal Data

Journal

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2016.2608347

Keywords

Bayesian network; Naive Bayes; longitudinal data; survival analysis; early stage prediction; regression; event data

Funding

  1. US National Science Foundation [IIS-1231742, IIS-1527827, IIS-1646881]
  2. Div Of Information & Intelligent Systems
  3. Direct For Computer & Info Scie & Enginr [1231742, 1707498] Funding Source: National Science Foundation

Predicting event occurrence at the early stage of a longitudinal study is an important and challenging problem with high practical value in many real-world applications. Unlike standard classification and regression problems, where a domain expert can label the data in a reasonably short period of time, training data in such longitudinal studies can be obtained only by waiting for a sufficient number of events to occur. Survival analysis aims at directly predicting the time to an event of interest using data collected in the past over a certain duration. However, it cannot answer the open question of how to forecast whether a subject will experience an event by the end of a longitudinal study using the event occurrence information of other subjects at the early stage of the study. The goal of this work is to predict event occurrence at a future time point using only the information about a limited number of events that occurred at the initial stages of a longitudinal study. This problem poses two major challenges: (1) the absence of complete information about event occurrence (censoring) and (2) the availability of only a partial set of events that occurred during the initial phase of the study. We propose a novel Early Stage Prediction (ESP) framework for building event prediction models that are trained at the early stages of longitudinal studies. First, we address the first challenge by introducing a new method for handling censored data using the Kaplan-Meier estimator. We then extend the Naive Bayes, Tree-Augmented Naive Bayes (TAN), and Bayesian Network methods based on the proposed framework, and develop three algorithms, namely ESP-NB, ESP-TAN, and ESP-BN, to effectively predict event occurrence using training data obtained at an early stage of the study.
More specifically, our approach integrates Bayesian methods with an Accelerated Failure Time (AFT) model by adapting the prior probability of event occurrence for future time points. The proposed framework is evaluated using a wide range of synthetic and real-world benchmark datasets. Our extensive set of experiments shows that the proposed ESP framework is, on average, 20 percent more accurate than existing schemes when using only limited event information in the training data.
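For readers unfamiliar with the Kaplan-Meier estimator mentioned in the abstract, the following is a minimal sketch of the standard product-limit estimator in Python. It is not the paper's censoring-handling method or any part of the ESP algorithms; the function name `kaplan_meier` and the toy data are assumptions made only for illustration.

```python
from collections import Counter


def kaplan_meier(times, observed):
    """Standard Kaplan-Meier (product-limit) survival estimate.

    times:    follow-up time of each subject
    observed: True if the event was seen at that time, False if censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    (Illustrative helper, not taken from the paper.)
    """
    events = Counter(t for t, e in zip(times, observed) if e)
    exits = Counter(times)   # subjects leaving the risk set at each time
    at_risk = len(times)     # everyone is at risk at the start
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set after t.
        at_risk -= exits[t]
    return curve


# Toy data (made up): five subjects, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}, P(event by t)={1 - s:.2f}")
```

The quantity `1 - S(t)` is the estimated probability that an event has occurred by time `t`. Censored subjects contribute to the risk set up to their censoring time without being counted as events.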
