Article

AFNet: Adaptive Fusion Network for Remote Sensing Image Semantic Segmentation

Journal

IEEE Transactions on Geoscience and Remote Sensing
Volume 59, Issue 9, Pages 7871-7886

Publisher

IEEE - Institute of Electrical and Electronics Engineers Inc.
DOI: 10.1109/TGRS.2020.3034123

Keywords

Image segmentation; Simultaneous localization and mapping; Adaptive systems; Semantics; Optical imaging; Labeling; Convolutional neural networks; Attention mechanism; convolutional neural network (CNN); semantic segmentation

Funding

  1. National Natural Science Foundation of China [62036005]
  2. National Key Research and Development Program of China [2018YFB0505500, 2018YFB0505501]


A novel adaptive fusion network (AFNet) is proposed to improve the performance of very high resolution (VHR) remote sensing image segmentation, utilizing a scale-feature attention module (SFAM) and a scale-layer attention module (SLAM) in a multilevel architecture. Extensive experiments demonstrate the effectiveness of the proposed model.
Semantic segmentation of remote sensing images plays an important role in many applications. However, a remote sensing image typically comprises a complex and heterogeneous urban landscape with objects of various sizes and materials, which poses challenges to the task. In this work, a novel adaptive fusion network (AFNet) is proposed to improve the performance of very high resolution (VHR) remote sensing image segmentation. To coherently label size-varied ground objects from different categories, we design a multilevel architecture with a scale-feature attention module (SFAM). With SFAM, low-level features from the shallow layers of the convolutional neural network (CNN) are enhanced at the locations of small objects, whereas high-level features from deep layers are enhanced for large objects. The features of size-varied objects are thus preserved when fusing features from different levels, which helps to label such objects. To label categories with high intra-class variance and varied scales, a multiscale structure with a scale-layer attention module (SLAM) is used to learn representative features, and an adjacent score map refinement module (ACSR) is employed as the classifier. With SLAM, when multiscale features are fused, the feature map from the appropriate scale is given a greater weight according to the scale of the objects of interest. With such a scale-aware strategy, the learned features become more representative, which helps to distinguish objects for semantic segmentation. In addition, the performance is further improved by introducing several nonlinear layers to the ACSR. Extensive experiments conducted on two well-known public high-resolution remote sensing image data sets show the effectiveness of our proposed model. Code and predictions are available at https://github.com/athauna/AFNet/
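
To make the fusion mechanism concrete, the following is a minimal PyTorch sketch of attention-gated feature fusion in the spirit of the SFAM and SLAM modules described above. The class names, channel sizes, and exact gating form are illustrative assumptions rather than the authors' released implementation; refer to the repository linked above for the official code.

# A minimal PyTorch sketch of attention-gated feature fusion in the spirit of
# AFNet's SFAM/SLAM modules. Module names, channel sizes, and the exact gating
# form are illustrative assumptions, not the authors' released implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ScaleFeatureAttention(nn.Module):
    """SFAM-style fusion: per-pixel weights decide how much low-level
    (shallow) versus high-level (deep) feature content is kept, so small
    objects rely more on shallow details and large objects on deep semantics."""

    def __init__(self, low_ch: int, high_ch: int, out_ch: int):
        super().__init__()
        self.low_proj = nn.Conv2d(low_ch, out_ch, kernel_size=1)
        self.high_proj = nn.Conv2d(high_ch, out_ch, kernel_size=1)
        # Predict a 2-channel spatial attention map (one weight per level).
        self.attn = nn.Sequential(
            nn.Conv2d(2 * out_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, 2, kernel_size=1),
        )

    def forward(self, low: torch.Tensor, high: torch.Tensor) -> torch.Tensor:
        # Upsample the deep feature map to the shallow map's resolution.
        high = F.interpolate(high, size=low.shape[-2:], mode="bilinear",
                             align_corners=False)
        low, high = self.low_proj(low), self.high_proj(high)
        weights = torch.softmax(self.attn(torch.cat([low, high], dim=1)), dim=1)
        # Weighted sum: the first channel gates the low-level path,
        # the second gates the high-level path.
        return weights[:, :1] * low + weights[:, 1:] * high


class ScaleLayerAttention(nn.Module):
    """SLAM-style fusion: a learned weight per input scale re-weights
    feature maps computed at different scales before summing them."""

    def __init__(self, channels: int, num_scales: int):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(num_scales * channels, channels),
            nn.ReLU(inplace=True),
            nn.Linear(channels, num_scales),
        )

    def forward(self, feats: list) -> torch.Tensor:
        # feats: list of tensors with identical shape (B, C, H, W), one per scale.
        pooled = torch.cat([f.mean(dim=(2, 3)) for f in feats], dim=1)  # (B, S*C)
        weights = torch.softmax(self.fc(pooled), dim=1)                 # (B, S)
        stacked = torch.stack(feats, dim=1)                             # (B, S, C, H, W)
        return (weights[:, :, None, None, None] * stacked).sum(dim=1)


if __name__ == "__main__":
    sfam = ScaleFeatureAttention(low_ch=64, high_ch=512, out_ch=128)
    fused = sfam(torch.randn(1, 64, 128, 128), torch.randn(1, 512, 32, 32))
    slam = ScaleLayerAttention(channels=128, num_scales=3)
    out = slam([fused, torch.randn_like(fused), torch.randn_like(fused)])
    print(fused.shape, out.shape)  # both torch.Size([1, 128, 128, 128])

In this sketch, the SFAM-style module predicts a per-pixel softmax over the two feature levels, so shallow detail can dominate at small-object locations, while the SLAM-style module learns one global weight per input scale before summing the multiscale maps.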

