4.6 Article

Ensemble Learners of Multiple Deep CNNs for Pulmonary Nodules Classification Using CT Images

Journal

IEEE ACCESS
Volume 7, Pages 110358-110371

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2019.2933670

Keywords

Lung cancer; pulmonary nodules; CT; machine learning; ensemble learning; convolutional neural networks

Funding

  1. National Natural Science Foundation of China [81671773, 61672146]
  2. Fundamental Research Funds for the Central Universities [N172008008, N180719020]
  3. Open Program of Neusoft Research of Intelligent Healthcare Technology Company Ltd. [NRIHTOP1803]

Various deep convolutional neural networks (CNNs) have been used to distinguish between benign and malignant pulmonary nodules in CT images. However, a single learner often performs unsatisfactorily because its hypothesis space is limited, because training falls into a local minimum, or because the wrong hypothesis space is selected. To tackle these issues, we propose building ensemble learners by fusing multiple deep CNN learners for pulmonary nodule classification. CT image patches of 743 nodules are extracted from the LIDC-IDRI database. First, eight deep CNN learners with different architectures are trained and evaluated by 10-fold cross-validation, so that each nodule has eight predictions, one from each primary learner. Second, we fuse these eight predictions by majority voting (VOT), averaging (AVE), or machine learning. Specifically, eight machine learning algorithms are implemented: K-Nearest-Neighbor (KNN), Support Vector Machines (SVM), Naive Bayes (NB), Decision Trees (DT), Multi-layer Perceptron (MLP), Random Forests (RF), Gradient Boosting Regression Trees (GBRT), and Adaptive Boosting (AdaBoost). Moreover, the correlation coefficients between the predictions of the 10 ensemble learners are calculated, and a hierarchical clustering dendrogram is drawn. The ensemble learners achieve higher prediction accuracy than a single CNN learner (84.0% vs. 81.7%). The overlap ratio among the 10 ensemble learners is much higher than that among the 8 primary learners (62.9% vs. 33.2%). In addition, the ensemble learners fall roughly into three categories: the first (SVM, MLP, GBRT, and RF) achieves the best performance, and the second (VOT and AVE) outperforms the third (AdaBoost, DT, NB, and KNN). VOT and AVE yield higher recall than the machine learning algorithms. These results indicate that ensemble learners based on multiple CNN learners can achieve better performance for pulmonary nodule classification using CT images, and that the preferred fusion strategies include SVM, MLP, GBRT, and RF.
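
The two simplest fusion strategies named in the abstract, majority voting (VOT) and averaging (AVE), can be illustrated with a minimal sketch. It assumes that each of the eight primary learners outputs a malignancy probability in [0, 1] and that 0.5 is the decision threshold; neither detail is stated in the abstract, and the function names, tie-breaking rule, and example values below are hypothetical, not the authors' implementation.

```python
from statistics import mean


def fuse_by_voting(probs, threshold=0.5):
    """Majority vote (VOT): each learner votes malignant (1) if its
    probability reaches the threshold; the majority label wins."""
    votes = [1 if p >= threshold else 0 for p in probs]
    # With an even number of learners a tie is possible; this sketch
    # resolves ties as benign (0), which is an arbitrary assumption.
    return 1 if sum(votes) * 2 > len(votes) else 0


def fuse_by_averaging(probs, threshold=0.5):
    """Averaging (AVE): threshold the mean of the learners' probabilities."""
    return 1 if mean(probs) >= threshold else 0


# Hypothetical outputs of the eight primary CNN learners for one nodule.
nodule_probs = [0.9, 0.8, 0.45, 0.4, 0.3, 0.95, 0.48, 0.35]

vot_label = fuse_by_voting(nodule_probs)     # 3 of 8 learners vote malignant
ave_label = fuse_by_averaging(nodule_probs)  # mean probability is 0.57875
print(vot_label, ave_label)
```

In this made-up example the two strategies disagree: only three learners vote malignant, but their high confidence pulls the mean above the threshold. The machine-learning fusion strategies (SVM, MLP, and so on) would instead train a second-level classifier on the eight predictions.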
