Article

An Overview of Unsupervised Deep Feature Representation for Text Categorization

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCSS.2019.2910599

Keywords

Autoencoder; deconvolutional network; deep belief nets; deep learning; feature representation; text categorization; unsupervised learning

Funding

  1. National Natural Science Foundation of China [U1705262, 61672159]
  2. Technology Innovation Platform Project of Fujian Province [2014H2005, 2009J1007]
  3. Fujian Collaborative Innovation Center for Big Data Application in Governments
  4. Fujian Engineering Research Center of Big Data Analysis and Processing

High-dimensional features are ubiquitous in machine learning and computer vision. Learning an efficient feature representation for a specific learning task is therefore a crucial issue, and it is especially challenging in the unsupervised setting, where class label information is absent. Over the last decade, deep learning has attracted growing attention from researchers in a broad range of areas. Most deep learning methods are supervised and must be fed a large amount of accurately labeled data. However, acquiring sufficient accurately labeled data is unaffordable in many real-world applications, which motivates unsupervised learning. To this end, quite a few unsupervised feature representation approaches based on deep learning have been proposed in recent years. In this paper, we attempt to provide a comprehensive overview of unsupervised deep learning methods and compare their performance in text categorization. Our survey starts with the autoencoder and its representative variants, including the sparse autoencoder, stacked autoencoder, contractive autoencoder, denoising autoencoder, variational autoencoder, graph autoencoder, convolutional autoencoder, adversarial autoencoder, and residual autoencoder. Aside from autoencoders, we introduce deconvolutional networks, restricted Boltzmann machines, and deep belief nets. The reviewed unsupervised feature representation methods are then compared in terms of text clustering. Extensive experiments on eight publicly available data sets of text documents provide a fair test bed for the compared methods.
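
As a rough illustration of the simplest method family named in the abstract, the sketch below trains a basic single-hidden-layer autoencoder and uses its hidden activations as an unsupervised feature representation. Everything in it is an assumption made for illustration and is not taken from the paper: the tiny term-count matrix, the function name `train_autoencoder`, the sigmoid encoder with a linear decoder, the squared-error loss, and the hyperparameters.

```python
import numpy as np


def train_autoencoder(X, hidden_dim=2, lr=0.5, epochs=2000, seed=0):
    """Fit a one-hidden-layer autoencoder with full-batch gradient descent.

    Hypothetical helper for illustration only. Returns (encode, losses),
    where encode maps rows of term features to hidden codes and losses is
    the mean squared reconstruction error recorded at every epoch.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, size=(d, hidden_dim))  # encoder weights
    b1 = np.zeros(hidden_dim)
    W2 = rng.normal(0.0, 0.1, size=(hidden_dim, d))  # decoder weights
    b2 = np.zeros(d)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)   # hidden code (the learned features)
        X_hat = H @ W2 + b2        # linear reconstruction of the input
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))

        # Backpropagate the mean squared error through both layers.
        g_out = 2.0 * err / (n * d)
        g_W2 = H.T @ g_out
        g_b2 = g_out.sum(axis=0)
        g_pre = (g_out @ W2.T) * H * (1.0 - H)  # sigmoid derivative
        g_W1 = X.T @ g_pre
        g_b1 = g_pre.sum(axis=0)

        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2

    def encode(Z):
        return sigmoid(Z @ W1 + b1)

    return encode, losses


# Made-up term counts: 6 documents x 6 vocabulary terms (no labels used).
counts = np.array([
    [2, 1, 0, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [2, 2, 0, 0, 0, 0],
    [0, 0, 0, 1, 2, 1],
    [0, 0, 0, 2, 1, 2],
    [0, 0, 1, 1, 2, 2],
], dtype=float)
X = counts / counts.max()  # scale features into [0, 1]

encode, losses = train_autoencoder(X)
codes = encode(X)  # 2-dimensional representation of each document
print("first loss:", losses[0], "last loss:", losses[-1])
print(codes)
```

The rows of `codes` could then be passed to any clustering routine, which mirrors in miniature the "learn features, then cluster" comparison the abstract describes. The variants listed above change the loss or architecture (for example, adding a sparsity penalty or corrupting the input), which this sketch does not attempt.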
