4.8 Article

Reducing Data Complexity Using Autoencoders With Class-Informed Loss Functions

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2021.3127698

Keywords

Complexity theory; Feature extraction; Measurement; Shape; Support vector machines; Data models; Transforms; Autoencoders; dimension reduction; data complexity

Funding

  1. Spanish Ministry of Science under the FPU Program [FPU17/04069]
  2. Spanish Ministry of Science project [PID2020-119478GB-I00, PID2019-107793GB-I00]
  3. Andalusian Excellence project [P18-FR-4961]
  4. project DeepSCOP Ayudas Fundacion BBVA a Equipos de Investigacion Cientifica en Big Data 2018

This paper proposes an autoencoder-based approach to complexity reduction, using class labels to generate more suitable variables. Experimental results show that class-informed autoencoders perform better than other unsupervised feature extraction techniques, especially in classification tasks.
Data in machine learning applications are becoming increasingly complex, owing to higher dimensionality and classes that are hard to separate. A wide variety of approaches measure the complexity of labeled data in terms of class overlap, separability, boundary shapes, or group morphology. Many techniques transform the data to find better features, but few focus specifically on reducing data complexity. Most data transformation methods address mainly dimensionality and leave aside the information in class labels, which can be useful when the classes themselves are complex. This paper proposes an autoencoder-based approach to complexity reduction that uses class labels to inform the loss function about the adequacy of the generated variables. This leads to three new feature learners: Scorer, Skaler and Slicer, based on Fisher's discriminant ratio, the Kullback-Leibler divergence and least-squares support vector machines, respectively. They can be applied as a preprocessing stage for a binary classification problem. Thorough experimentation across a collection of 27 datasets and a range of complexity and classification metrics shows that class-informed autoencoders perform better than 4 other popular unsupervised feature extraction techniques, especially when the final objective is to use the data for a classification task.
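
To make the idea of a class-informed loss concrete, the following is a minimal NumPy sketch of a reconstruction error combined with a penalty derived from Fisher's discriminant ratio on the encoded features. It is not the authors' implementation: the function names `fisher_ratio` and `class_informed_loss`, the `weight` parameter, and the choice to map the ratio to a penalty through 1 / (1 + mean ratio) are all assumptions made for illustration, and the abstract does not specify the exact form used by Scorer.

```python
import numpy as np


def fisher_ratio(z, y, eps=1e-12):
    """Per-feature Fisher discriminant ratio for binary labels y in {0, 1}.

    z: encoded features, shape (n_samples, n_features).
    """
    z0, z1 = z[y == 0], z[y == 1]
    between = (z0.mean(axis=0) - z1.mean(axis=0)) ** 2
    within = z0.var(axis=0) + z1.var(axis=0) + eps  # eps avoids division by zero
    return between / within


def class_informed_loss(x, x_rec, z, y, weight=0.1):
    """Mean squared reconstruction error plus a class-separability penalty.

    The penalty lies in (0, 1] and shrinks as the classes become better
    separated in the encoding (illustrative choice, not from the paper).
    """
    reconstruction = np.mean((x - x_rec) ** 2)
    penalty = 1.0 / (1.0 + fisher_ratio(z, y).mean())
    return reconstruction + weight * penalty


# Hypothetical data: identical reconstructions, two candidate encodings.
rng = np.random.default_rng(0)
y = np.array([0] * 50 + [1] * 50)
x = rng.normal(size=(100, 4))
z_overlap = rng.normal(size=(100, 2))        # classes overlap in the code
z_separated = z_overlap + 5.0 * y[:, None]   # class 1 shifted away from class 0

loss_overlap = class_informed_loss(x, x, z_overlap, y)
loss_separated = class_informed_loss(x, x, z_separated, y)
```

With the reconstruction held fixed, the encoding that separates the two classes receives the lower loss, which is the behaviour a class-informed penalty is meant to encourage. In an actual autoencoder the same quantity would be written in a differentiable framework and minimized jointly with the reconstruction term.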
