4.7 Article

Spectral–Spatial Feature Tokenization Transformer for Hyperspectral Image Classification

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TGRS.2022.3144158

Keywords

Feature extraction; Transformers; Convolution; Semantics; Principal component analysis; Data mining; Convolutional neural networks (CNNs); hyperspectral image (HSI) classification; semantic features; spectral-spatial tokenization; transformer

Funding

  1. National Natural Science Foundation of China [61971233, 62076137, U20B2065, U20B2061]
  2. Natural Science Foundation of Jiangsu Province [BK 20211539]
  3. Henan Key Laboratory of Food Safety Data Intelligence [KF2020ZD01]
  4. Postgraduate Research & Practice Innovation Program of Jiangsu Province [KYCX21_1004]

In this article, the spectral-spatial feature tokenization transformer (SSFTT) method is proposed to capture spectral-spatial and high-level semantic features. Experimental analysis confirms that this method outperforms other deep learning methods in terms of computation time and classification performance.
In hyperspectral image (HSI) classification, each pixel sample is assigned to a land-cover category. Recently, convolutional neural network (CNN)-based HSI classification methods have greatly improved performance due to their superior ability to represent features. However, these methods have limited ability to obtain deep semantic features, and computational costs rise significantly as the number of layers increases. The transformer framework can represent high-level semantic features well. In this article, a spectral–spatial feature tokenization transformer (SSFTT) method is proposed to capture spectral–spatial features and high-level semantic features. First, a spectral–spatial feature extraction module is built to extract low-level features. This module is composed of a 3-D convolution layer and a 2-D convolution layer, which are used to extract the shallow spectral and spatial features. Second, a Gaussian weighted feature tokenizer is introduced for feature transformation. Third, the transformed features are input into the transformer encoder module for feature representation and learning. Finally, a linear layer is used to classify the first learnable token to obtain the sample label. Experiments on three standard datasets confirm that the computation time is lower than that of other deep learning methods and that the classification performance exceeds that of several current state-of-the-art methods. The code of this work is available at https://github.com/zgr6010/HSI_SSFTT for the sake of reproducibility.
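The tokenization step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that). It assumes that the tokenizer projects flattened pixel features with a Gaussian-initialized weight matrix, applies a softmax over pixels, and uses the resulting weights to pool the features into a few tokens, after which a class token is prepended. The function names (`gaussian_tokenizer`, `prepend_class_token`), the sizes, and the zero-initialized class token are all hypothetical choices for illustration. Only the standard library is used; a real model would use a tensor library and learn the weights.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gaussian_tokenizer(features, num_tokens, rng):
    """Pool (n_pixels, n_channels) features into (num_tokens, n_channels) tokens.

    Assumption: the projection weights are drawn from a standard Gaussian here;
    in a trained model they would be learnable parameters.
    """
    n_channels = len(features[0])
    w_a = [[rng.gauss(0.0, 1.0) for _ in range(num_tokens)] for _ in range(n_channels)]
    scores = matmul(features, w_a)                   # (n_pixels, num_tokens)
    scores_t = [list(col) for col in zip(*scores)]   # (num_tokens, n_pixels)
    attention = [softmax(row) for row in scores_t]   # each row sums to 1 over pixels
    tokens = matmul(attention, features)             # (num_tokens, n_channels)
    return tokens, attention


def prepend_class_token(tokens):
    """Prepend a class token (zeros here; learnable in a real model)."""
    cls = [0.0] * len(tokens[0])
    return [cls] + tokens


rng = random.Random(0)
# Hypothetical input: a 3x3 spatial patch flattened to 9 pixels with 8 channels.
features = [[rng.random() for _ in range(8)] for _ in range(9)]
tokens, attention = gaussian_tokenizer(features, num_tokens=4, rng=rng)
sequence = prepend_class_token(tokens)
print(len(sequence), len(sequence[0]))  # 5 8
```

In this sketch the 9 pixel vectors are reduced to 4 tokens plus one class token, so the transformer encoder would attend over 5 positions instead of 9. After encoding, the first position (the class token) is the one a linear layer would map to class scores.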
