4.6 Article

Using Under-Trained Deep Ensembles to Learn Under Extreme Label Noise: A Case Study for Sleep Apnea Detection

Journal

IEEE ACCESS
Volume 9, Pages 45919-45934

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2021.3067455

Keywords

Labeling; Training; Sleep apnea; Task analysis; Noise measurement; Reliability; Predictive models; Biomedical informatics; supervised learning; sleep apnea; machine learning; label noise

Funding

  1. Research Council of Norway through the CESAR project [250239/O70]

Improper or erroneous labelling can hinder reliable generalization in supervised learning, especially in critical fields like healthcare. This study presents an approach for learning under extreme label noise in medical applications that uses under-trained deep ensembles to improve generalization. On sleep apnea detection and sleep apnea severity classification, kappa improved from 0.02 to as much as 0.55.
Improper or erroneous labelling can hinder reliable generalization in supervised learning. This can have negative consequences, especially in critical fields such as healthcare. We propose an effective new approach for learning under extreme label noise in medical applications like sleep apnea that is based on under-trained deep ensembles. Each ensemble member is trained on a subset of the training data to acquire a general overview of the decision boundary, without focusing on potentially erroneous details. The accumulated knowledge of the ensemble is combined to form new labels that give a better class separation than the original labels. A new model is then trained on these labels so that it generalizes reliably despite the label noise. We evaluate our approach on the tasks of sleep apnea detection and sleep apnea severity classification, and observe an improvement in kappa from 0.02 up to 0.55.
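
The relabeling pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the deep networks for small logistic regression models trained with only a few gradient steps, uses synthetic two-cluster data instead of sleep recordings, and combines members by averaging their predicted probabilities, which is an assumed combination rule. All function names and parameter values (make_noisy_blobs, relabel_with_ensemble, n_members, subset_frac, steps, the 40% flip rate) are hypothetical and chosen for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_bias(X):
    # Append a constant column so the last weight acts as the intercept.
    return np.hstack([X, np.ones((len(X), 1))])

def fit_logreg(X, y, steps, lr=0.1):
    # Plain gradient descent from zero weights; a small `steps` value
    # leaves the model deliberately under-trained.
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        w -= lr * Xb.T @ (sigmoid(Xb @ w) - y) / len(y)
    return w

def predict_proba(w, X):
    return sigmoid(add_bias(X) @ w)

def relabel_with_ensemble(X, y_noisy, n_members=15, subset_frac=0.3,
                          steps=5, seed=0):
    # Each member sees a random subset of the noisy data and is stopped early.
    # Assumption: members are combined by averaging predicted probabilities.
    rng = np.random.default_rng(seed)
    n = len(X)
    k = int(subset_frac * n)
    prob_sum = np.zeros(n)
    for _ in range(n_members):
        idx = rng.choice(n, size=k, replace=False)
        w = fit_logreg(X[idx], y_noisy[idx], steps)
        prob_sum += predict_proba(w, X)
    return (prob_sum / n_members >= 0.5).astype(int)

def cohen_kappa(a, b):
    # Cohen's kappa: observed agreement corrected for chance agreement.
    a, b = np.asarray(a), np.asarray(b)
    p_obs = np.mean(a == b)
    p_exp = sum(np.mean(a == c) * np.mean(b == c) for c in np.union1d(a, b))
    return 1.0 if p_exp == 1.0 else (p_obs - p_exp) / (1.0 - p_exp)

def make_noisy_blobs(n=2000, flip_rate=0.4, seed=42):
    # Hypothetical toy data: two Gaussian clusters, labels flipped at random.
    rng = np.random.default_rng(seed)
    y_true = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 2)) + np.where(y_true[:, None] == 1, 2.0, -2.0)
    flip = rng.random(n) < flip_rate
    return X, y_true, np.where(flip, 1 - y_true, y_true)

X, y_true, y_noisy = make_noisy_blobs()
y_new = relabel_with_ensemble(X, y_noisy)

# The final model is trained to convergence on the ensemble's labels.
w_final = fit_logreg(X, y_new, steps=200)
y_pred = (predict_proba(w_final, X) >= 0.5).astype(int)

print("kappa, noisy labels vs truth:   ", cohen_kappa(y_true, y_noisy))
print("kappa, ensemble labels vs truth:", cohen_kappa(y_true, y_new))
print("kappa, final model vs truth:    ", cohen_kappa(y_true, y_pred))
```

The small step count and the subset sampling are what keep each member coarse: a member fitted to convergence on all of the data would have more opportunity to fit individual mislabeled points, which is the behavior the abstract says the approach avoids.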
