4.6 Article

Large-scale support vector machine classification with redundant data reduction

Journal

NEUROCOMPUTING
Volume 172, Issue -, Pages 189-197

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2014.10.102

Keywords

Support vector machine; Classification; Clustering; Redundant data reduction

Funding

  1. National Natural Science Foundation of China [61005017]
  2. Natural Science Foundation of the Jiangsu Higher Education Institutions of China [10KJB520005]
  3. Jiangsu University [1283000347]

Large-scale image classification has become important in object recognition and image retrieval as vast amounts of multimedia are shared on social networks. However, the time and memory requirements of SVM training surge as the sample size increases, which makes SVM impractical even for moderate problems once the number of training samples reaches hundreds of thousands. To address this problem, many specially designed algorithms have been proposed, such as clustering-based SVM training, which attempts to remove the clustered data points that lie far from the support vectors. In this paper, we further observe that a cluster contains both clustered and scattered data points. The clustered data points are the dense points around the cluster centroid, in the inner layer of a cluster; they are regarded as containing no support vectors (SVs) and are removed. The scattered data points are the sparse points in the outer layer of a cluster; they are regarded as containing SVs and are therefore retained. The Fisher Discriminant Ratio, computed from the distance densities of the data points to the cluster centroid, is employed to determine the boundary between the clustered and scattered data points in a cluster. The redundant clustered data points are thus removed to speed up SVM training. Experimental results show that the proposed method achieves good classification accuracy while significantly reducing training time: on the large Covertype data set, its training time is only about 17 percent of that of LIBSVM. (C) 2015 Elsevier B.V. All rights reserved.
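
The inner/outer split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the exhaustive search over split points, the use of raw centroid distances instead of distance densities, and the names `fisher_ratio`, `split_radius`, and `reduce_cluster` are all assumptions made here for illustration.

```python
import math
from statistics import fmean, pvariance


def fisher_ratio(inner, outer, eps=1e-12):
    """Fisher Discriminant Ratio between two 1-D groups of distances."""
    gap = fmean(inner) - fmean(outer)
    # eps avoids division by zero when both groups have zero variance
    return gap * gap / (pvariance(inner) + pvariance(outer) + eps)


def split_radius(distances):
    """Radius that best separates inner (dense) from outer (sparse) points.

    Assumption: every gap between consecutive sorted distances is tried and
    the one with the largest Fisher ratio wins.
    """
    d = sorted(distances)
    best_ratio, best_radius = -1.0, float("-inf")
    for k in range(1, len(d)):
        if d[k - 1] == d[k]:
            continue  # tied distances cannot be separated by a radius
        ratio = fisher_ratio(d[:k], d[k:])
        if ratio > best_ratio:
            best_ratio, best_radius = ratio, (d[k - 1] + d[k]) / 2
    return best_radius


def reduce_cluster(points, centroid):
    """Keep only the outer-layer points of one cluster.

    If no split exists (all distances equal), every point is kept.
    """
    dists = [math.dist(p, centroid) for p in points]
    radius = split_radius(dists)
    return [p for p, dist in zip(points, dists) if dist > radius]
```

In a full pipeline, each class would first be clustered (for example with k-means), `reduce_cluster` would be applied to every cluster, and the SVM would then be trained on the retained points only.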