Journal
COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2017, PT I
Volume 10404, Issue -, Pages 284-297
Publisher
SPRINGER INTERNATIONAL PUBLISHING AG
DOI: 10.1007/978-3-319-62392-4_21
Keywords
Missing data; Dimensionality reduction; Diffusion maps; Laplacian pyramids
A challenging problem in machine learning is handling missing data; filling in the missing values is known as imputation. Simple imputation techniques replace each missing value with the mean or the median of the observed values. A more sophisticated approach is to use regression to predict the missing data from the complete input columns. When the dimension of the input data is high, dimensionality reduction methods may be applied to describe the complete input compactly. A regression from the low-dimensional space to the incomplete data column can then be constructed for imputation. In this work, we propose a two-step algorithm for data completion. The first step uses a non-linear manifold learning technique, diffusion maps, to reduce the dimension of the data. This method faithfully embeds complex data while preserving its geometric structure. The second step applies the Laplacian pyramids multi-scale method for regression. Laplacian pyramids construct kernels of decreasing scales to capture finer modes of the data. Experimental results demonstrate the efficiency of our approach on a publicly available dataset.
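The two-step idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the Gaussian kernel form, the median-distance heuristic for `eps`, the default starting scale `sigma0`, the halving of the scale at each level, and the fixed number of levels are all assumptions made for illustration. The embedding is computed on every row of the complete columns, so no out-of-sample extension of the diffusion map is needed.

```python
import numpy as np

def pairwise_sq_dists(A, B):
    # Squared Euclidean distances between the rows of A and the rows of B.
    d = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.maximum(d, 0.0)

def diffusion_map(X, eps=None, n_components=2):
    # Step 1 (sketch): embed the complete columns with diffusion maps.
    D = pairwise_sq_dists(X, X)
    if eps is None:
        eps = np.median(D)  # assumed heuristic, not taken from the paper
    K = np.exp(-D / eps)
    d = K.sum(axis=1)
    # Symmetric normalisation; same eigenvalues as the Markov matrix D^-1 K.
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(d)[:, None]  # right eigenvectors of D^-1 K
    # Skip the trivial constant eigenvector (eigenvalue 1).
    return psi[:, 1:n_components + 1] * vals[1:n_components + 1]

def laplacian_pyramid_predict(Z_train, y_train, Z_new, sigma0=None, n_levels=8):
    # Step 2 (sketch): multi-scale kernel regression on the residuals.
    D_tt = pairwise_sq_dists(Z_train, Z_train)
    D_nt = pairwise_sq_dists(Z_new, Z_train)
    if sigma0 is None:
        sigma0 = D_tt.max()  # assumed coarse starting scale
    tiny = np.finfo(float).tiny
    residual = np.asarray(y_train, dtype=float).copy()
    pred = np.zeros(len(Z_new))
    for level in range(n_levels):
        sigma = sigma0 / 2**level  # kernels of decreasing scale
        W_tt = np.exp(-D_tt / sigma)
        W_nt = np.exp(-D_nt / sigma)
        W_tt /= np.maximum(W_tt.sum(1, keepdims=True), tiny)
        W_nt /= np.maximum(W_nt.sum(1, keepdims=True), tiny)
        pred += W_nt @ residual            # add this level's smooth part
        residual = residual - W_tt @ residual  # pass on what is left
    return pred

def impute_column(X_complete, y, n_components=2, n_levels=8):
    # y holds NaN where the value is missing; X_complete has no gaps.
    Z = diffusion_map(X_complete, n_components=n_components)
    missing = np.isnan(y)
    y_filled = np.asarray(y, dtype=float).copy()
    y_filled[missing] = laplacian_pyramid_predict(
        Z[~missing], y_filled[~missing], Z[missing], n_levels=n_levels)
    return y_filled
```

Calling `impute_column(X_complete, y)` returns a copy of `y` in which only the NaN entries are replaced. In practice the kernel scales and the stopping level would need tuning, for example by monitoring the residual on held-out observed entries.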