Article

Large-scale support vector machine classification with redundant data reduction

NEUROCOMPUTING (2016)

Journal

NEUROCOMPUTING

Volume 172, Issue -, Pages 189-197

Publisher

ELSEVIER

DOI: 10.1016/j.neucom.2014.10.102

Keywords

Support vector machine; Classification; Clustering; Redundant data reduction

Categories

Computer Science, Artificial Intelligence

Funding

National Natural Science Foundation of China [61005017]
Natural Science Foundation of the Jiangsu Higher Education Institutions of China [10KJB520005]
Jiangsu University [1283000347]

Abstract

Large-scale image classification has become increasingly important in object recognition and image retrieval, driven by the vast amount of social multimedia shared over networks. However, the time and memory requirements of SVM training surge as the sample size grows, which makes SVM impractical even for moderately sized problems once the number of training samples reaches the hundreds of thousands. To address this problem, many specially designed algorithms have been proposed, such as clustering-based SVM training, which attempts to remove the clustered data points that lie far away from the support vectors (SVs). In this paper, we further observe that a cluster contains both clustered and scattered data points. The clustered data points lie around the cluster centroid; they are the dense points in the inner layer of a cluster, and are viewed as containing no SVs and removed. The scattered data points are the sparse points in the outer layer of a cluster, and are viewed as containing SVs and therefore retained. The Fisher Discriminant Ratio is employed to determine the boundary between the clustered and scattered data points in a cluster, computed from the distance densities of the data points to the cluster centroid. The redundant clustered data points are thus removed to speed up the SVM training process. Experimental results show that the proposed method achieves good classification accuracy while significantly reducing training time. On the large Covertype data set, the training time of the proposed method is only about 17 percent of that of LIBSVM. (C) 2015 Elsevier B.V. All rights reserved.
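
The reduction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulation, so the sketch assumes the Fisher Discriminant Ratio is applied directly to the sorted point-to-centroid distances (rather than to distance densities), and the function names `fisher_ratio`, `split_radius`, and `reduce_cluster`, as well as the `min_group` parameter, are hypothetical.

```python
import math
from statistics import fmean, pvariance


def fisher_ratio(inner, outer):
    """Fisher Discriminant Ratio of two groups of 1-D values:
    squared gap between the means over the sum of the variances."""
    gap = (fmean(inner) - fmean(outer)) ** 2
    spread = pvariance(inner) + pvariance(outer)
    if gap == 0:
        return 0.0
    return gap / spread if spread > 0 else math.inf


def split_radius(distances, min_group=2):
    """Pick the distance threshold that best separates inner from outer points.

    Assumption: every split of the sorted distances is tried and the one with
    the largest Fisher ratio wins. Returns -inf when no useful split exists,
    so that no point is discarded.
    """
    d = sorted(distances)
    best_ratio, best_radius = 0.0, -math.inf
    for k in range(min_group, len(d) - min_group + 1):
        ratio = fisher_ratio(d[:k], d[k:])
        if ratio > best_ratio:
            best_ratio, best_radius = ratio, (d[k - 1] + d[k]) / 2
    return best_radius


def reduce_cluster(points, centroid):
    """Keep only the sparse outer-layer points of one cluster."""
    dists = [math.dist(p, centroid) for p in points]
    radius = split_radius(dists)
    return [p for p, r in zip(points, dists) if r > radius]


if __name__ == "__main__":
    cluster = [(0.1, 0.0), (2.0, 0.0), (0.0, 0.1), (0.0, 2.0),
               (-0.1, 0.0), (-2.0, 0.0), (0.0, -0.1), (0.0, -2.0)]
    print(reduce_cluster(cluster, (0.0, 0.0)))
```

In a full pipeline, each class would presumably be clustered first (for example with k-means), `reduce_cluster` would be applied to every cluster, and the retained points would then be passed to an SVM trainer such as LIBSVM.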
