Article (community rating: 4.7)

Spectral-Spatial Transformer Network for Hyperspectral Image Classification: A Factorized Architecture Search Framework

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TGRS.2021.3115699

Keywords

Convolution; Transformers; Computer architecture; Task analysis; Training; Kernel; Hyperspectral imaging; Factorized architecture search (FAS); spatial attention; spectral association; spectral-spatial transformer network (SSTN)

Funding

  1. China Postdoctoral Science Foundation [2020TQ0372, 2020M672964]
  2. Guangdong Natural Science Foundation [2021A1515011843]

Abstract

This study introduces a novel spectral-spatial transformer network (SSTN) to overcome the limitations of convolution kernels and proposes a factorized architecture search (FAS) framework that finds optimal architecture settings without bilevel optimization. Experimental results demonstrate the excellent performance of SSTNs on multiple hyperspectral image (HSI) benchmark datasets.
Neural networks have dominated research on hyperspectral image (HSI) classification, owing to the feature-learning capacity of convolution operations. However, the fixed geometric structure of convolution kernels hinders long-range interaction between features at distant locations. In this article, we propose a novel spectral-spatial transformer network (SSTN), consisting of spatial attention and spectral association modules, to overcome these constraints. We also design a factorized architecture search (FAS) framework with two independent subprocedures that determine the layer-level operation choices and the block-level order of the SSTN. Unlike conventional neural architecture search (NAS), which requires bilevel optimization of both network parameters and architecture settings, FAS focuses only on finding optimal architecture settings, enabling a stable and fast architecture search. Extensive experiments on five popular HSI benchmarks demonstrate the versatility of SSTNs over other state-of-the-art (SOTA) methods and justify the FAS strategy. On the University of Houston dataset, the SSTN obtains overall accuracy comparable to SOTA methods with a small fraction (1.2%) of the multiply-and-accumulate operations (MACs) of a strong baseline, the spectral-spatial residual network (SSRN). Most importantly, SSTNs outperform other SOTA networks using only 1.2% or fewer of the SSRN's MACs on the Indian Pines, Kennedy Space Center, University of Pavia, and Pavia Center datasets.
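The abstract does not specify the exact formulation of the SSTN's spatial attention module. As a rough illustration of the idea it describes, the sketch below applies generic scaled dot-product self-attention across the spatial positions of a small HSI patch, so every pixel can interact with every other pixel regardless of distance, which a fixed-size convolution kernel cannot do. The function names, the random projection matrices, and the patch dimensions are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(cube, d=8, seed=0):
    """Toy spatial self-attention over an HSI patch (NOT the paper's module).

    cube: (H, W, B) hyperspectral patch with B spectral bands.
    Every pixel attends to every other pixel, giving the long-range
    spatial interaction that a fixed convolution kernel lacks.
    """
    H, W, B = cube.shape
    x = cube.reshape(H * W, B)             # H*W spatial tokens, B-dim features
    rng = np.random.default_rng(seed)
    # Random projections stand in for learned query/key/value weights.
    Wq, Wk, Wv = (rng.standard_normal((B, d)) for _ in range(3))
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(q @ k.T / np.sqrt(d))   # (H*W, H*W) pairwise weights
    return (attn @ v).reshape(H, W, d)

out = spatial_attention(np.random.rand(5, 5, 16))  # (5, 5, 8) attended features
```

Note that the attention matrix is quadratic in the number of pixels, which is one reason such modules are typically applied to small patches rather than whole scenes.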
