Article

Searching for CNN Architectures for Remote Sensing Scene Classification

Journal

IEEE Transactions on Geoscience and Remote Sensing
Publisher

Institute of Electrical and Electronics Engineers (IEEE), Inc.
DOI: 10.1109/TGRS.2021.3097938

Keywords

Computer architecture; Task analysis; Search problems; Computational modeling; Convolution; Transfer learning; Spatial resolution; Convolutional neural networks (CNNs); neural architecture search (NAS); random search; remote sensing (RS) scene classification

Funding

  1. Makiguchi Foundation

Abstract

Pretrained ImageNet-based CNN models used in transfer learning may not be well suited to remote sensing image scene classification, whereas our proposed SLGE random search method shows strong capability in classifying multispectral satellite image scenes.
Convolutional neural network (CNN) models for remote sensing (RS) scene classification are largely built on networks pretrained on the general-purpose ImageNet dataset from computer vision. Such pretrained networks can easily be adapted for transfer learning in RS scene classification. However, transfer-learning accuracy may decline because RS images differ considerably from the natural images in ImageNet, so a pretrained CNN model learned on ImageNet may not be sufficient for accurate classification of RS image scenes. Furthermore, most pretrained models have large memory footprints, which places a further burden on computational requirements. In this work, we explore SLGE-based random search with early stopping to find CNN architectures for both single-label and multilabel RS scene classification tasks. The SLGE architecture search space can represent multipath, Inception-like modular cells with skip-connections, similar to human-expert designs. Experimental results on four RS scene classification benchmarks show that the automatically discovered networks classify multispectral satellite image scenes with promising capability compared with fine-tuned pretrained CNN models. Using fewer parameters and 0.56B FLOPs, our best network achieves classification accuracies of 96.56% and 96.10% on the NWPU-RESISC45 and AID single-label RGB aerial image datasets, respectively, and 99.76% and 93.89% on the EuroSAT single-label and BigEarthNet multilabel multispectral satellite image datasets, respectively. These results place our approach among the state of the art.
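
To make the transfer-learning baseline the abstract describes concrete, here is a minimal fine-tuning sketch, assuming PyTorch and torchvision (an illustrative setup, not the paper's code): an ImageNet-pretrained ResNet-50 has its 1000-way classifier head replaced and retrained for an RS scene dataset.

    # Minimal transfer-learning sketch (illustrative, not the paper's code):
    # fine-tune an ImageNet-pretrained ResNet-50 for RS scene classification.
    import torch
    import torch.nn as nn
    from torchvision import models

    NUM_CLASSES = 45  # e.g., NWPU-RESISC45 contains 45 scene classes

    # Load ImageNet weights, then swap the 1000-way head for an RS-scene head.
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    criterion = nn.CrossEntropyLoss()

    def finetune_step(images, labels):
        """One fine-tuning step on an (N, 3, H, W) image batch."""
        model.train()
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        return loss.item()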
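
The search loop the abstract names, random search with early stopping over a cell-based space, can be sketched as below. Everything here is a hypothetical stand-in: the op set, the cell encoding, and the train_one_epoch_and_validate() placeholder are illustrative assumptions, not the SLGE search space or training pipeline from the paper.

    # Hedged sketch of random architecture search with early stopping.
    # The op set, cell encoding, and scoring stub are assumptions for
    # illustration; they are not the SLGE implementation from the paper.
    import random

    OPS = ["conv3x3", "conv5x5", "sep_conv3x3", "max_pool3x3", "skip_connect"]

    def sample_cell(num_nodes=4):
        """Sample a multipath cell: each node picks an op and an earlier
        node (or the cell input) to read from, allowing skip-connections."""
        return [(random.choice(OPS), random.randrange(node + 1))
                for node in range(num_nodes)]

    def train_one_epoch_and_validate(cell):
        """Placeholder proxy: a real implementation would build `cell` into
        a network, train one epoch, and return validation accuracy."""
        return random.random()

    def evaluate(cell, epochs=20, patience=3):
        """Train a candidate briefly; early-stop once validation accuracy
        stops improving for `patience` consecutive epochs."""
        best, stale = 0.0, 0
        for _ in range(epochs):
            val_acc = train_one_epoch_and_validate(cell)
            if val_acc > best:
                best, stale = val_acc, 0
            else:
                stale += 1
                if stale >= patience:
                    break
        return best

    def random_search(num_candidates=100):
        """Sample candidates uniformly at random and keep the best cell."""
        best_cell, best_acc = None, 0.0
        for _ in range(num_candidates):
            cell = sample_cell()
            acc = evaluate(cell)
            if acc > best_acc:
                best_cell, best_acc = cell, acc
        return best_cell, best_acc

The early-stopped validation score acts as a cheap fitness proxy, so many candidates can be screened quickly; the best-found cell would then be trained to convergence.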
