Article

Learning k for kNN Classification

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/2990508

Keywords

kNN method; sparse learning; missing data imputation

Funding

  1. China Key Research Program [2016YFB1000905]
  2. National Natural Science Foundation of China [61263035, 61573270, 61672177]
  3. China 973 Program [2013CB329404]
  4. Guangxi Natural Science Foundation [2015GXNSFCB139011]
  5. Guangxi Higher Institutions' Program of Introducing 100 High-Level Overseas Talents
  6. Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing
  7. Guangxi Bagui Teams for Innovation and Research

Abstract

The k Nearest Neighbor (kNN) method has been widely used in data-mining and machine-learning applications due to its simple implementation and distinguished performance. However, assigning the same k value to all test data points, as previous kNN methods do, has been shown to make those methods impractical in real applications. This article proposes learning a correlation matrix that reconstructs each test data point from the training data, thereby assigning a different k value to each test data point; the approach is referred to as Correlation Matrix kNN (CM-kNN) classification. Specifically, a least-squares loss function is employed to minimize the error of reconstructing each test data point from all training data points. A graph Laplacian regularizer is then advocated to preserve the local structure of the data in the reconstruction process. Moreover, an l1-norm regularizer is applied to learn a different k value for each test data point, and an l2,1-norm regularizer induces row sparsity that removes redundant/noisy features from the reconstruction process. Beyond classification, the kNN methods (including the proposed CM-kNN) are further applied to regression and missing data imputation. Sets of experiments were conducted to illustrate the efficiency, and the results showed that the proposed method was more accurate and efficient than existing kNN methods in data-mining applications such as classification, regression, and missing data imputation.
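The core idea in the abstract — reconstructing a test point from all training points under a sparsity-inducing penalty, so that the number of nonzero weights becomes that point's k — can be sketched as follows. This is a simplified illustration, not the paper's full method: it keeps only the least-squares loss and the l1-norm term (omitting the graph Laplacian and l2,1-norm regularizers) and solves it with plain ISTA; all function names and parameter values here are illustrative assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    # Elementwise soft-thresholding: the proximal operator of the l1 norm.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def reconstruct_weights(X_train, x_test, lam=0.5, n_iter=500):
    """Sparse reconstruction of one test point from all training points.

    Solves  min_w ||X_train.T @ w - x_test||^2 + lam * ||w||_1  with ISTA.
    Simplified sketch: the paper's graph-Laplacian and l2,1-norm terms
    are omitted here.
    """
    A = X_train            # shape (n_train, n_features)
    w = np.zeros(A.shape[0])
    # Step size from the Lipschitz constant of the smooth part's gradient.
    L = 2.0 * np.linalg.norm(A @ A.T, 2)
    for _ in range(n_iter):
        grad = 2.0 * A @ (A.T @ w - x_test)   # gradient of the squared loss
        w = soft_threshold(w - grad / L, lam / L)
    return w

rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 5))
x_test = rng.normal(size=5)
w = reconstruct_weights(X_train, x_test)
# The learned k for this test point is the number of nonzero weights,
# and the corresponding training points are its neighbors.
k = int(np.count_nonzero(w))
```

In the full CM-kNN formulation, the weights for all test points form the columns of one correlation matrix, and the l2,1-norm term additionally zeroes out entire rows, discarding training points that are redundant or noisy for every test point at once.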
