Article

Manifold-based denoising, outlier detection, and dimension reduction algorithm for high-dimensional data

Journal

Publisher

SPRINGER HEIDELBERG
DOI: 10.1007/s13042-023-01873-y

Keywords

Noise reduction; Outlier detection; Manifold learning

Manifold learning plays an increasingly important role in machine learning, but unavoidable noise and outliers destroy the manifold structure of the data and degrade its dimensionality reduction. This paper therefore proposes a manifold-learning-based denoising algorithm for high-dimensional data. The algorithm first projects noisy sample vectors onto the local manifold to reduce noise. It then performs a statistical analysis of the noise to obtain a data boundary. Sample vectors outside the data boundary are marked as outliers and eliminated. Finally, dimension reduction is performed on the data that remain after noise reduction and outlier detection. Experimental results show that the algorithm can, to some extent, eliminate the interference of noise and outliers with manifold learning in high-dimensional datasets.
Manifold learning, which has emerged in recent years, plays an increasingly important role in machine learning. However, unavoidable noise and outliers destroy the manifold structure of the data, which degrades the dimensionality reduction that manifold learning achieves. This paper therefore proposes a manifold-learning-based denoising algorithm for high-dimensional data. The algorithm first projects noisy sample vectors onto the local manifold, thereby reducing noise. It then performs a statistical analysis of the noise to obtain a data boundary. Because all the data come from the same background and follow the same distribution, sample vectors that fall outside the data boundary are marked as outliers and eliminated. Finally, dimension reduction is performed on the data that remain after noise reduction and outlier detection. Experimental results show that the algorithm can, to some extent, eliminate the interference of noise and outliers with manifold learning in high-dimensional datasets.
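The abstract names three stages but gives no implementation details, so the following is only a minimal sketch of how such a pipeline could look, not the authors' method. Every concrete choice is an assumption made for illustration: the local manifold is approximated by a local PCA fit to each sample's k nearest neighbours (excluding the sample itself), the data boundary is a mean-plus-3-standard-deviations threshold on the projection residual, and plain global PCA stands in for the final dimension reduction. All function and parameter names (`local_pca_denoise`, `flag_outliers`, `pca_embed`, `k`, `d`, `n_sigma`) are hypothetical.

```python
import numpy as np


def local_pca_denoise(X, k=10, d=2):
    """Project each sample onto a d-dim affine subspace fitted to its k neighbours.

    Assumption: a local PCA fit stands in for "the local manifold".
    Returns the projected samples and each sample's distance to its projection.
    """
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances (n x n).
    dist2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    X_proj = np.empty_like(X)
    for i in range(X.shape[0]):
        # Skip position 0 (the sample itself) so that an outlier
        # cannot pull the fitted subspace towards itself.
        idx = np.argsort(dist2[i])[1:k + 1]
        nbrs = X[idx]
        mu = nbrs.mean(axis=0)
        # Top-d right singular vectors span the local subspace estimate.
        _, _, vt = np.linalg.svd(nbrs - mu, full_matrices=False)
        basis = vt[:d].T
        X_proj[i] = mu + (X[i] - mu) @ basis @ basis.T
    residual = np.linalg.norm(X - X_proj, axis=1)
    return X_proj, residual


def flag_outliers(residual, n_sigma=3.0):
    """Assumed data boundary: residual above mean + n_sigma * std is an outlier."""
    return residual > residual.mean() + n_sigma * residual.std()


def pca_embed(X, d=2):
    """Global PCA, used here only as a placeholder dimension-reduction step."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:d].T


def denoise_filter_embed(X, k=10, d=2, n_sigma=3.0):
    """Run the three stages in order: denoise, drop outliers, reduce dimension."""
    X_proj, residual = local_pca_denoise(X, k=k, d=d)
    keep = ~flag_outliers(residual, n_sigma=n_sigma)
    return pca_embed(X_proj[keep], d=d), keep
```

Calling `denoise_filter_embed(X)` on an `(n_samples, n_features)` array returns the low-dimensional embedding of the retained samples together with a boolean mask that marks which samples were kept. In practice the final step could be swapped for any manifold learner, and the threshold rule would need to match whatever noise statistics the paper actually uses.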
