Article

A boosting Self-Training Framework based on Instance Generation with Natural Neighbors for K Nearest Neighbor

Journal

APPLIED INTELLIGENCE
Volume 50, Issue 11, Pages 3535-3553

Publisher

SPRINGER
DOI: 10.1007/s10489-020-01732-1

Keywords

Semi-supervised learning (SSL); Semi-supervised classification (SSC); Self-training; Boosting; Instance generation; Natural neighbors

Funding

  1. National Natural Science Foundation of China [61272194, 61502060]
  2. Project of Chongqing Natural Science Foundation [cstc2019jcyj-msxmX0683]

Self-training is one of the successful approaches to semi-supervised classification. Mislabeling is the most challenging issue in self-training methods, and ensemble learning is a common technique for dealing with it. Specifically, ensemble learning can solve or alleviate mislabeling by constructing an ensemble classifier that improves prediction accuracy during the self-training process. However, most ensemble learning methods may not perform well in self-training because it is difficult to train an effective ensemble classifier from a small amount of labeled data. Inspired by successful boosting methods, we introduce a new boosting self-training framework based on instance generation with natural neighbors (BoostSTIG). BoostSTIG is compatible with most boosting methods and self-training methods. It can use most boosting methods to solve or alleviate the mislabeling of existing self-training methods by improving prediction accuracy in the self-training process. In addition, an instance generation method based on natural neighbors is proposed to enlarge the initial labeled data in BoostSTIG, which makes boosting methods more suitable for self-training. In experiments, we apply the BoostSTIG framework to 2 self-training methods and 4 boosting methods, and validate it by comparison with state-of-the-art techniques on real data sets. Extensive experiments show that BoostSTIG can improve the performance of the tested self-training methods and train an effective k nearest neighbor classifier.
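The two ingredients the abstract names, enlarging the initial labeled set and then self-training, can be illustrated with a small sketch. This is not the BoostSTIG algorithm. It assumes a simplified stand-in for natural neighbors (mutual r-nearest neighbors with a fixed r), generates new points by random interpolation, uses a plain k nearest neighbor vote as the base classifier, and omits the boosting step entirely. All function names and parameters (`generate_instances`, `self_train`, `r`, `threshold`) are hypothetical.

```python
import math
import random
from collections import Counter


def knn_indices(points, query, k, skip=None):
    """Indices of the k points closest to query, optionally skipping one index."""
    order = sorted(
        (i for i in range(len(points)) if i != skip),
        key=lambda i: math.dist(points[i], query),
    )
    return order[:k]


def generate_instances(X, y, r=2, seed=0):
    """Create one interpolated point per same-class pair of mutual r-nearest neighbors.

    Assumption: mutual r-NN with fixed r stands in for natural neighbors.
    """
    rng = random.Random(seed)
    neigh = [set(knn_indices(X, X[i], r, skip=i)) for i in range(len(X))]
    new_X, new_y = [], []
    for i in range(len(X)):
        for j in neigh[i]:
            # i < j visits each mutual pair once
            if i < j and i in neigh[j] and y[i] == y[j]:
                t = rng.random()
                new_X.append(tuple(a + t * (b - a) for a, b in zip(X[i], X[j])))
                new_y.append(y[i])
    return new_X, new_y


def knn_predict(X, y, query, k=3):
    """Majority label and its vote share among the k nearest labeled points."""
    votes = Counter(y[i] for i in knn_indices(X, query, k))
    label, count = votes.most_common(1)[0]
    return label, count / min(k, len(X))


def self_train(X_lab, y_lab, X_unlab, k=3, threshold=1.0, max_rounds=10):
    """Enlarge the labeled set, then repeatedly absorb confident predictions."""
    X, y = list(X_lab), list(y_lab)
    gen_X, gen_y = generate_instances(X, y)
    X += gen_X
    y += gen_y
    pool = list(X_unlab)
    for _ in range(max_rounds):
        confident, rest = [], []
        for p in pool:
            label, conf = knn_predict(X, y, p, k)
            (confident if conf >= threshold else rest).append((p, label))
        if not confident:
            break  # nothing passes the threshold; stop early
        for p, label in confident:
            X.append(p)
            y.append(label)
        pool = [p for p, _ in rest]
    return X, y, pool
```

Calling `self_train(X_lab, y_lab, X_unlab)` returns the enlarged labeled points, their labels, and any unlabeled points that never reached the confidence threshold. A boosted ensemble would replace `knn_predict` in a fuller version. The threshold on vote share is one simple way to limit mislabeling, chosen here only for brevity.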
