4.7 Article

Deep networks under scene-level supervision for multi-class geospatial object detection from remote sensing images

Journal

ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING
Volume 146, Issue -, Pages 182-196

Publisher

ELSEVIER
DOI: 10.1016/j.isprsjprs.2018.09.014

Keywords

Multi-class geospatial object detection; Deep networks; Scene-level supervision; Discriminative convolutional weights; Class-specific activation weights

Funding

  1. National Key Research and Development Program of China [2018YFB0505003]
  2. National Natural Science Foundation of China [41601352, 41322010]
  3. China Postdoctoral Science Foundation [2016M590716, 2017T100581]
  4. Hubei Provincial Natural Science Foundation of China [2018CFB501]

Ask authors/readers for more resources

Due to its many applications, multi-class geospatial object detection has attracted increasing research interest in recent years. In the literature, existing methods highly depend on costly bounding box annotations. Based on the observation that scene-level tags provide important cues for the presence of objects, this paper proposes a weakly supervised deep learning (WSDL) method for multi-class geospatial object detection using scene-level tags only. Compared to existing WSDL methods which take scenes as isolated ones and ignore the mutual cues between scene pairs when optimizing deep networks, this paper exploits both the separate scene category information and mutual cues between scene pairs to sufficiently train deep networks for pursuing the superior object detection performance. In the first stage of our training method, we leverage pair-wise scene-level similarity to learn discriminative convolutional weights by exploiting the mutual information between scene pairs. The second stage utilizes point-wise scene-level tags to learn class-specific activation weights. While considering that the testing remote sensing image generally covers a large region and may contain a large number of objects from multiple categories with large size variations, a multi-scale scene-sliding-voting strategy is developed to calculate the class-specific activation maps (CAM) based on the aforementioned weights. Finally, objects can be detected by segmenting the CAM. The deep networks are trained on a seemingly unrelated remote sensing image scene classification dataset. Additionally, the testing phase is conducted on a publicly open multi-class geospatial object detection dataset. The experimental results demonstrate that the proposed deep networks dramatically outperform the state-of-the-art methods.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available