4.6 Review

Feature selection for text classification: A review

Journal

MULTIMEDIA TOOLS AND APPLICATIONS
Volume 78, Issue 3, Pages 3797-3816

Publisher

SPRINGER
DOI: 10.1007/s11042-018-6083-5

Keywords

Feature selection; Text classification; Text classifiers; Multimedia

Funding

  1. National Key R&D Plan of China [2017YFB0802203, 2018YFB100013]
  2. National Natural Science Foundation of China [U1736203, 61732021, 61472165, 61373158, 61363009]
  3. Guangdong Provincial Engineering Technology Research Center on Network Security Detection and Defense [2014B090904067]
  4. Guangdong Provincial Special Funds for Applied Technology Research and Development and Transformation of Important Scientific and Technological Achieve [2016B010124009]
  5. Zhuhai Top Discipline-Information Security
  6. Guangzhou Key Laboratory of Data Security and Privacy Preserving
  7. Guangdong Key Laboratory of Data Security and Privacy Preserving
  8. National Joint Engineering Research Center of Network Security Detection and Protection Technology


Big multimedia data is inherently heterogeneous: it may mix video, audio, text, and images. This heterogeneity stems from the spread of novel applications in recent years, such as social media, video sharing, and location-based services (LBS). In many multimedia applications, such as video/image tagging and multimedia recommendation, text classification techniques are used extensively to facilitate multimedia data processing. In this paper, we give a comprehensive review of feature selection techniques for text classification. We begin by introducing some popular document representation schemes and the similarity measures used in text classification. Then, we review the most popular text classifiers, including the Nearest Neighbor (NN) method, Naive Bayes (NB), Support Vector Machine (SVM), Decision Tree (DT), and Neural Networks. Next, we survey four feature selection models, namely filter, wrapper, embedded, and hybrid, and discuss the pros and cons of state-of-the-art feature selection approaches. Finally, we conclude the paper and briefly introduce some interesting feature selection work that does not belong to any of the four models.
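
To make the filter model named in the abstract concrete, here is a minimal Python sketch. The abstract does not say which scoring criteria the paper covers, so the chi-square statistic is an assumption, chosen only because it is a commonly used filter criterion. The function names (chi2_scores, select_top_k) and the toy corpus are hypothetical, and the bag-of-words representation is an illustrative choice. The defining property of a filter is that terms are scored independently of any classifier.

```python
from collections import Counter

def chi2_scores(docs, labels, positive):
    """Chi-square score of each term against one class (assumed criterion).

    docs: list of token lists; labels: parallel list of class labels.
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = n - n_pos

    # Document frequency of each term inside and outside the class.
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        target = df_pos if y == positive else df_neg
        target.update(set(tokens))

    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # in class, contains term
        b = df_neg[term]   # outside class, contains term
        c = n_pos - a      # in class, lacks term
        d = n_neg - b      # outside class, lacks term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores

def select_top_k(scores, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

# Hypothetical toy corpus, for illustration only.
docs = [
    ["goal", "match", "team"],
    ["team", "score", "goal"],
    ["stock", "market", "price"],
    ["market", "price", "team"],
]
labels = ["sport", "sport", "finance", "finance"]

scores = chi2_scores(docs, labels, positive="sport")
selected = select_top_k(scores, 3)
print(selected)
```

Because the scores depend only on term and class counts, the selected vocabulary can be computed once and reused with any of the classifiers listed above. A wrapper approach would instead evaluate candidate feature subsets by training a classifier on each.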
