Article

An Overview of Unsupervised Deep Feature Representation for Text Categorization

IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS (2019)

Journal

IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS

Volume 6, Issue 3, Pages 504-517

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TCSS.2019.2910599

Keywords

Autoencoder; deconvolutional network; deep belief nets; deep learning; feature representation; text categorization; unsupervised learning

Categories

Computer Science, Cybernetics; Computer Science, Information Systems

Funding

National Natural Science Foundation of China [U1705262, 61672159]
Technology Innovation Platform Project of Fujian Province [2014H2005, 2009J1007]
Fujian Collaborative Innovation Center for Big Data Application in Governments
Fujian Engineering Research Center of Big Data Analysis and Processing

Abstract

High-dimensional features are ubiquitous in machine learning and computer vision. Learning an efficient feature representation for a specific learning task is a crucial issue. Without class label information, unsupervised feature representation is especially challenging. In the last decade, deep learning has attracted growing attention from researchers in a broad range of areas. Most deep learning methods are supervised and must be fed a large amount of accurately labeled data. However, acquiring sufficient accurately labeled data is unaffordable in many real-world applications, which points to the need for unsupervised learning. To this end, quite a few unsupervised feature representation approaches based on deep learning have been proposed in recent years. In this paper, we attempt to provide a comprehensive overview of unsupervised deep learning methods and compare their performance in text categorization. Our survey starts with the autoencoder and its representative variants, including the sparse autoencoder, stacked autoencoder, contractive autoencoder, denoising autoencoder, variational autoencoder, graph autoencoder, convolutional autoencoder, adversarial autoencoder, and residual autoencoder. Aside from autoencoders, deconvolutional networks, restricted Boltzmann machines, and deep belief nets are introduced. Then, the reviewed unsupervised feature representation methods are compared in terms of text clustering. Extensive experiments on eight publicly available data sets of text documents are conducted to provide a fair test bed for the compared methods.
