Article

Applying Efficient Selection Techniques of Unlabeled Instances for Wrapper-Based Semi-Supervised Methods

Journal

IEEE ACCESS
Volume 10, Pages 43535-43551

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2022.3169498

Keywords

Classification algorithms; Labeling; Prediction algorithms; Measurement; Training; Semisupervised learning; Machine learning; Artificial intelligence; machine learning; semi-supervised learning; self-training semi-supervised method; co-training semi-supervised method

Funding

  1. Federal University of Rio Grande do Norte
  2. Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior (CAPES), Brazil [001]

Semi-supervised learning (SSL) is a machine learning approach that integrates supervised and unsupervised learning mechanisms. This paper focuses on the use of a wrapper-based strategy in SSL and proposes three selection methods for efficient selection of unlabelled instances. The feasibility of these methods is evaluated through empirical analysis on two well-known SSL methods: Self-training and Co-training.
Semi-supervised learning (SSL) is a machine learning approach that integrates supervised and unsupervised learning mechanisms. This integration can be done in different ways, one of which is a wrapper-based strategy. A wrapper-based strategy uses a small number of labelled instances to create a learning model. The model is then used in a labelling process, in which some unlabelled instances are labelled and then incorporated into the labelled set. An important aspect of a wrapper-based SSL method is the selection of the unlabelled instances to be labelled: an efficient selection process can lead to an efficient labelling process and, in turn, to efficient learning models. In this paper, we propose three selection methods that can be applied to wrapper-based SSL methods. The main idea is to use two different selection criteria, prediction confidence or classification agreement with a distance metric, to perform an efficient selection of the unlabelled instances. To assess the feasibility of the proposed approach, the selection methods are applied to two well-known wrapper-based SSL methods, namely Self-training and Co-training. Additionally, an empirical analysis is conducted in which the standard Self-training and Co-training methods are compared against the proposed versions of these two methods over 35 classification datasets.
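
The wrapper-based loop described in the abstract can be illustrated with a minimal Self-training sketch. This is not the paper's algorithm: the abstract does not specify the three selection methods, so everything concrete here is an assumption made for illustration, including the nearest-centroid base classifier, the softmax-style confidence score, the `threshold` and `per_round` parameters, and the use of distance to the predicted class centroid as a tie-breaker among equally confident instances.

```python
import math


def centroids(X, y):
    """Compute the mean vector of each class in the labelled set."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict(cents, x):
    """Return (label, confidence, distance) for one instance.

    Confidence is a softmax over negative distances to the class
    centroids (an illustrative choice, not taken from the paper).
    """
    dists = {c: math.dist(x, m) for c, m in cents.items()}
    label = min(dists, key=dists.get)
    d_min = dists[label]
    # Subtract the smallest distance before exponentiating for stability.
    weights = {c: math.exp(-(d - d_min)) for c, d in dists.items()}
    return label, weights[label] / sum(weights.values()), d_min


def self_training(X_lab, y_lab, X_unl, per_round=2, threshold=0.9, max_rounds=10):
    """Self-training: repeatedly label the most confident unlabelled instances."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(max_rounds):
        if not pool:
            break
        # 1. Build the model from the current labelled set.
        cents = centroids(X_lab, y_lab)
        # 2. Score every unlabelled instance.
        scored = [(predict(cents, x), x) for x in pool]
        # 3. Selection: keep confident predictions, rank by confidence,
        #    then by distance to the predicted class centroid (assumed rule).
        candidates = [(p, x) for p, x in scored if p[1] >= threshold]
        if not candidates:
            break
        candidates.sort(key=lambda t: (-t[0][1], t[0][2]))
        # 4. Move the selected instances into the labelled set.
        for (label, _, _), x in candidates[:per_round]:
            X_lab.append(x)
            y_lab.append(label)
            pool.remove(x)
    return X_lab, y_lab, pool
```

Calling `self_training` with a few labelled points and a pool of unlabelled ones returns the enlarged labelled set, its labels, and the instances that were never confident enough to be selected. Instances that sit between classes stay in the pool, which is the behaviour a selection step is meant to enforce.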
