Article

Selective Imputation of Covariates in High Dimensional Censored Data

Journal

JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS
Volume 31, Issue 4, Pages 1397-1405

Publisher

TAYLOR & FRANCIS INC
DOI: 10.1080/10618600.2022.2035233

Keywords

Censored covariates; Nonparametric model; Random forest; Wireless networks

Funding

  1. Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation

Efficient modeling of censored data is crucial for various applications. This article presents a selective multiple imputation approach for predictive modeling in the presence of high-dimensional censored data. The proposed method allows for iterative selection of covariates to impute, resulting in a fast and accurate predictive model. Compared to previous methods, this fully nonparametric approach is more flexible and achieves comparable accuracy with faster execution.
Efficient modeling of censored data, that is, data which are restricted by some detection limit or truncation, is important for many applications. Ignoring the censoring can be problematic, as valuable information may be missing, and restoring the censored values may significantly improve the quality of models. Censored data arise in many scenarios: survival data, interval-censored data, or data with a lower limit of detection. Strategies for handling censored data are plentiful; however, little effort has been made to handle high-dimensional censored data. In this article, we present a selective multiple imputation approach for predictive modeling when a large number of covariates are subject to censoring. Our method allows for iterative, subject-wise selection of covariates to impute in order to achieve a fast and accurate predictive model. The algorithm furthermore selects values for imputation which are likely to provide important information if imputed. In contrast to previously proposed methods, our approach is fully nonparametric and therefore very flexible. We demonstrate that, in comparison to previous work, our model achieves faster execution and often comparable accuracy, both in a simulated example and in predicting signal strength in radio network data. Supplementary materials for this article are available online.
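
The idea of subject-wise, selective multiple imputation under a lower detection limit can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract does not give its details, so the uniform imputation draw, the importance weights, the threshold rule, the linear predictor, and all names and numbers below are assumptions chosen only for illustration (the article itself describes a fully nonparametric method).

```python
import random
import statistics


def censor_below(values, limit):
    """Left-censor a row: anything under the detection limit is recorded as the limit."""
    observed = [max(v, limit) for v in values]
    is_censored = [v < limit for v in values]
    return observed, is_censored


def impute_draws(limit, floor, m, rng):
    """Draw m candidate values for one censored entry.

    Assumption: uniform on [floor, limit]. This is a placeholder for whatever
    imputation model is actually used; it is not taken from the article.
    """
    return [rng.uniform(floor, limit) for _ in range(m)]


def selective_impute(row, censored, weights, limit, floor, m, threshold, rng):
    """Return m completed copies of one subject's row.

    Only censored covariates whose (hypothetical) importance weight reaches the
    threshold are imputed; the rest keep the detection limit as a stand-in.
    """
    completed = [list(row) for _ in range(m)]
    for j, (flag, w) in enumerate(zip(censored, weights)):
        if flag and w >= threshold:
            for copy, draw in zip(completed, impute_draws(limit, floor, m, rng)):
                copy[j] = draw
    return completed


def predict(row, weights):
    """Toy linear predictor standing in for a fitted model."""
    return sum(w * x for w, x in zip(weights, row))


rng = random.Random(0)
true_row = [-95.0, -120.0, -130.0]   # made-up signal strengths for one subject
limit, floor = -110.0, -140.0        # made-up detection limit and lower bound
row, censored = censor_below(true_row, limit)
weights = [0.5, 0.4, 0.01]           # hypothetical covariate importances

copies = selective_impute(row, censored, weights, limit, floor,
                          m=5, threshold=0.1, rng=rng)
# Pool the predictions over the m completed copies.
pooled = statistics.mean(predict(c, weights) for c in copies)
```

In this sketch the second covariate is censored and important, so it is imputed five times, while the third is censored but has negligible weight and is left at the detection limit. Skipping such entries is where a selective scheme would save computation when many covariates are censored.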
