Article

An Efficient High-Quality Medical Lesion Image Data Labeling Method Based on Active Learning

Journal

IEEE ACCESS
Volume 8, Pages 144331-144342

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2020.3014355

Keywords

Biomedical imaging; Machine learning; Labeling; Semisupervised learning; Data models; Manuals; Diseases; High-quality data; biomedical engineering; active learning; deep learning

Funding

  1. Natural Science Foundation of China [61672535, 61502540, 61977062]
  2. National Science Foundation of Hunan Province, China [2019JJ20025, 2019JJ80031, 2019JJ40468, 2020JJ2069, 2019JJ40406]
  3. Natural Science Foundation of Changsha, China [kq2001041]
  4. Earmarked Fund for the China Agriculture Research System
  5. 111 Project [B18059]
  6. Opening Foundation of All-Solid-State Energy Storage Materials and Devices Key Laboratory of Hunan Province [2017TP1024]
  7. Key Laboratory of city computing and IoT of Hunan City University

The rapid development of artificial intelligence has allowed deep learning to change our lives and bring considerable convenience, but deep learning cannot succeed without data of sufficient quantity and quality. In medical systems, because of the special nature of medical data, labeling and screening require professional input from doctors at considerable cost, and those resources are wasted if the resulting data cannot be used effectively. To solve this problem, this paper proposes an effective high-quality medical lesion image data labeling method based on active learning, which labels the most representative and highest-quality medical images with human assistance. First, subregions were generated for all unlabeled images and their classes were predicted. Second, multiple factors were calculated for every image. Finally, all images were sorted by these factor values, and the top-ranked images were selected and labeled with human assistance. These steps were repeated until a suitable amount of data had been labeled. The experimental results showed that a model trained on the labeled high-quality dataset achieved the same quality as a model trained on all the data while saving a considerable amount of manual labeling time, which demonstrates the effectiveness of the method. The method ensures that the labeled data are valuable, high quality, and rich in information, which reduces the labeling workload and avoids wasting data resources.
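The select-and-label loop described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which factors are computed, so the sketch assumes two hypothetical ones (entropy of the averaged subregion prediction, and disagreement among subregions). The names `image_score`, `select_for_labeling`, `active_labeling_loop`, and the callbacks `predict_patches`, `ask_annotator`, and `retrain` are all invented for the example.

```python
import math
from typing import Callable, Dict, Hashable, List, Sequence

Probs = Sequence[float]  # class probabilities predicted for one subregion


def entropy(p: Probs) -> float:
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def image_score(patch_probs: Sequence[Probs],
                w_uncertainty: float = 1.0,
                w_disagreement: float = 1.0) -> float:
    """Hypothetical multifactor score for one image (assumed, not from the paper).

    Factor 1: entropy of the mean prediction over all subregions.
    Factor 2: disagreement among subregions, measured as the entropy of the
              mean prediction minus the mean of the per-subregion entropies.
    """
    n = len(patch_probs)
    mean = [sum(col) / n for col in zip(*patch_probs)]
    uncertainty = entropy(mean)
    disagreement = uncertainty - sum(entropy(p) for p in patch_probs) / n
    return w_uncertainty * uncertainty + w_disagreement * disagreement


def select_for_labeling(predictions: Dict[Hashable, Sequence[Probs]],
                        k: int) -> List[Hashable]:
    """Sort images by score, highest first, and return the top k."""
    ranked = sorted(predictions,
                    key=lambda img: image_score(predictions[img]),
                    reverse=True)
    return ranked[:k]


def active_labeling_loop(unlabeled: Sequence[Hashable],
                         predict_patches: Callable[[Hashable], Sequence[Probs]],
                         ask_annotator: Callable[[Hashable], str],
                         retrain: Callable[[Dict[Hashable, str]], None],
                         k: int,
                         budget: int) -> Dict[Hashable, str]:
    """Repeat predict -> score -> select -> label until the budget is spent."""
    labeled: Dict[Hashable, str] = {}
    pool = list(unlabeled)
    while pool and len(labeled) < budget:
        # Step 1: predict classes for the subregions of every unlabeled image.
        predictions = {img: predict_patches(img) for img in pool}
        # Steps 2 and 3: compute the factors, sort, and take the top-ranked images.
        batch = select_for_labeling(predictions, min(k, budget - len(labeled)))
        for img in batch:
            labeled[img] = ask_annotator(img)  # human-assisted labeling
        pool = [img for img in pool if img not in labeled]
        retrain(labeled)  # update the model before the next round
    return labeled
```

In practice `predict_patches` would wrap the current model's subregion classifier and `ask_annotator` would route the image to a doctor; the weights in `image_score` are placeholders to be tuned.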
