Article

A multiscale 3D convolution with context attention network for hyperspectral image classification

Journal

EARTH SCIENCE INFORMATICS
Volume 15, Issue 4, Pages 2553-2569

Publisher

SPRINGER HEIDELBERG
DOI: 10.1007/s12145-022-00858-9

Keywords

Hyperspectral image classification; Attention mechanism; Convolutional neural network; Deep learning

Funding

  1. Research Foundation of Education Department of Sichuan Province [14ZB0282]
  2. Research Foundation of Yibin University [2019PY37]


This paper proposes a multiscale 3D convolution with context attention network for HSI classification. The method uses convolution kernels of different sizes to enlarge the receptive field and adaptively detect HSI features at different scales. Two subnetworks are built to efficiently exploit hierarchical spectral and spatial features and to enhance feature transmission. Experimental results show that the proposed method outperforms state-of-the-art models in overall accuracy on four benchmark HSI datasets.
Deep learning, especially 3D convolutional neural networks (CNNs), has proven to be an excellent feature extractor for hyperspectral image (HSI) classification. However, simply stacking conventional 3D convolution units and blindly increasing network depth does not effectively improve model performance. Moreover, most deep learning models overfit severely when only a small number of training samples is available, which seriously restricts classification accuracy. To address these problems, we propose a multiscale 3D convolution with context attention network for HSI classification. Specifically, we introduce a multiscale 3D convolution, composed of convolution kernels of different sizes, to replace the conventional 3D convolution, enlarging the receptive field and adaptively detecting HSI features at different scales. Then, based on the multiscale 3D convolution, we build two subnetworks that efficiently exploit hierarchical spectral and spatial features, respectively, and enhance feature transmission. Finally, to further explore discriminative features, we design two types of attention mechanism (AM) that build compact relationships between each position/channel and an aggregation center, instead of modeling the relationships between every pair of positions/channels. After each 3D convolution layer, a compact AM refines the extracted hierarchical spectral and spatial features, respectively, and boosts model performance. Experiments were conducted on four benchmark HSI datasets; the results demonstrate that the proposed method outperforms state-of-the-art models, with overall accuracies of 96.39%, 97.83%, 98.58%, and 97.98% on the Indian Pines, Salinas Valley, Pavia University, and Botswana datasets, respectively.
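
The multiscale 3D convolution described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the kernel sizes (1, 3, 5), the fixed averaging weights, the zero padding, and the function names `conv3d_same` and `multiscale_conv3d` are all assumptions made for illustration, whereas a real layer would learn its weights. The sketch only shows how parallel branches with different kernel sizes see differently sized spectral-spatial neighbourhoods of the same HSI patch.

```python
from itertools import product


def conv3d_same(cube, k):
    """Zero-padded ('same') 3D convolution of `cube` with a k*k*k mean kernel.

    `cube` is a nested list indexed [band][row][col]; k must be odd.
    The averaging kernel is an illustrative stand-in for learned weights.
    """
    depth, height, width = len(cube), len(cube[0]), len(cube[0][0])
    r = k // 2
    weight = 1.0 / k ** 3
    out = [[[0.0] * width for _ in range(height)] for _ in range(depth)]
    for d, h, w in product(range(depth), range(height), range(width)):
        total = 0.0
        for dd, dh, dw in product(range(-r, r + 1), repeat=3):
            z, y, x = d + dd, h + dh, w + dw
            # Voxels outside the cube count as zero (zero padding).
            if 0 <= z < depth and 0 <= y < height and 0 <= x < width:
                total += cube[z][y][x]
        out[d][h][w] = total * weight
    return out


def multiscale_conv3d(cube, kernel_sizes=(1, 3, 5)):
    """Run one branch per kernel size and stack the results as channels."""
    return [conv3d_same(cube, k) for k in kernel_sizes]


# Toy patch: 5 spectral bands, 5x5 pixels, one bright voxel at the centre.
patch = [[[0.0] * 5 for _ in range(5)] for _ in range(5)]
patch[2][2][2] = 1.0

features = multiscale_conv3d(patch)

# Number of output voxels influenced by the single input voxel, per branch.
footprints = [
    sum(v != 0.0 for band in f for row in band for v in row) for f in features
]
print(footprints)  # [1, 27, 125]
```

The growing footprint per branch is the receptive-field effect the abstract refers to: the larger the kernel, the wider the neighbourhood that contributes to each output value. In practice a framework layer with learned weights would replace the explicit loops.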


