Article

A Combined Interpolation and Weighted K-Nearest Neighbours Approach for the Imputation of Longitudinal ICU Laboratory Data

Journal

Journal of Healthcare Informatics Research
Volume 4, Issue 2, Pages 174-188

Publisher

Springer Nature
DOI: 10.1007/s41666-020-00069-1

Keywords

Imputation; Interpolation; KNN; Clinical datasets; DACMI

Funding

  1. University of Padova [C94I19001730001]
  2. Italian Ministry of Education, University and Research (MIUR) under the initiative Departments of Excellence

The presence of missing data is a common problem that affects almost all clinical datasets. Since most available data mining and machine learning algorithms require complete datasets, accurately imputing (i.e. filling in) the missing data is an essential step. This paper presents a methodology for the missing data imputation of longitudinal clinical data based on the integration of linear interpolation and a weighted K-Nearest Neighbours (KNN) algorithm. The Maximal Information Coefficient (MIC) values among features are employed as weights for the distance computation in the KNN algorithm in order to integrate intra- and inter-patient information. An interpolation-based imputation approach was also employed and tested both independently and in combination with the KNN algorithm. The final imputation is carried out by applying the best performing method for each feature. The methodology was validated on a dataset of clinical laboratory test results of 13 commonly measured analytes of patients in an intensive care unit (ICU) setting. The performance results are compared with those of 3D-MICE, a state-of-the-art imputation method for cross-sectional and longitudinal patient data. This work was presented in the context of the 2019 ICHI Data Analytics Challenge on Missing data Imputation (DACMI).
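
The two building blocks named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the inverse-distance averaging of neighbours, and the choice of k are assumptions made for illustration, and the `weights` argument merely stands in for the MIC values, which are not computed here.

```python
import numpy as np


def interpolate_series(times, values):
    """Fill NaNs in one patient's series for one analyte by linear interpolation.

    Values outside the observed time range are held at the nearest observed
    value (the behaviour of np.interp); this edge handling is an assumption.
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    observed = ~np.isnan(values)
    if not observed.any():
        return values.copy()  # nothing to interpolate from
    return np.interp(times, times[observed], values[observed])


def weighted_knn_impute(X, target, weights, k=3):
    """Impute NaNs in column `target` of X with a feature-weighted KNN.

    `weights` holds one non-negative weight per feature (a stand-in for the
    MIC between each feature and the target). Rows are individual
    measurements; donors are rows where the target is observed.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float).copy()
    w[target] = 0.0  # the target itself never contributes to the distance
    out = X[:, target].copy()
    donors = np.where(~np.isnan(X[:, target]))[0]

    for i in np.where(np.isnan(X[:, target]))[0]:
        diff = X[donors] - X[i]              # NaN where either value is missing
        valid = ~np.isnan(diff) & (w > 0)    # features usable for this pair
        sq = np.where(valid, diff ** 2, 0.0) * w
        wsum = (valid * w).sum(axis=1)
        usable = wsum > 0                    # donors sharing at least one feature
        if not usable.any():
            continue                         # leave the value missing
        # Weighted Euclidean distance, normalised by the weight actually used
        dist = np.sqrt(sq[usable].sum(axis=1) / wsum[usable])
        cand = donors[usable]
        nearest = np.argsort(dist)[:k]
        inv = 1.0 / (dist[nearest] + 1e-9)   # closer neighbours count more
        out[i] = np.sum(inv * X[cand[nearest], target]) / inv.sum()
    return out


# Toy data: one short series and a tiny two-feature table (hypothetical values)
series = interpolate_series([0, 1, 2, 3], [1.0, np.nan, 3.0, np.nan])
table = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [2.0, np.nan]])
filled = weighted_knn_impute(table, target=1, weights=[1.0, 0.0], k=1)
```

Under the per-feature selection described above, one would mask some known values, impute them with each candidate method, and keep whichever method gives the lowest error for that analyte.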