Article

Invariant Deep Compressible Covariance Pooling for Aerial Scene Categorization

Journal

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING
Volume 59, Issue 8, Pages 6549-6561

Publisher

IEEE - Institute of Electrical and Electronics Engineers, Inc.
DOI: 10.1109/TGRS.2020.3026221

Keywords

Invariant feature representation; symmetric positive definite (SPD) manifold; Stiefel manifold; aerial scene categorization

Funding

  1. EPSRC DERC: Digital Economy Research Centre [EP/M023001/1]

Abstract

The proposed method, IDCCP, tackles nuisance variations in aerial scene categorization by transforming the input image under a finite group and transferring the group structure to the representation space. By extending the representation to the tensor space and imposing orthogonal constraints, the model substantially reduces feature dimensions without sacrificing accuracy. Experiments on publicly released aerial scene image data sets show the superiority of the IDCCP model over state-of-the-art methods.
Learning discriminative and invariant feature representations is the key to visual image categorization. In this article, we propose a novel invariant deep compressible covariance pooling (IDCCP) to handle nuisance variations in aerial scene categorization. We consider transforming the input image according to a finite transformation group consisting of multiple orthogonal matrices, such as the D4 group. We then adopt a Siamese-style network to transfer the group structure to the representation space, where we can derive a trivial representation that is invariant under the group action. A linear classifier trained on this trivial representation therefore inherits the same invariance. To further improve the discriminative power of the representation, we extend it to the tensor space while imposing orthogonal constraints on the transformation matrix to effectively reduce feature dimensions. We conduct extensive experiments on publicly released aerial scene image data sets and demonstrate the superiority of this method over state-of-the-art methods. In particular, using the ResNet architecture, our IDCCP model reduces the dimension of the tensor representation by about 98% without sacrificing accuracy (i.e., a drop of less than 0.5%).
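As a rough illustration of the pipeline described in the abstract, the PyTorch sketch below enumerates the eight D4 transformations of an input image, covariance-pools the features produced by a shared (Siamese-style) backbone, averages over the group orbit to obtain a group-invariant (trivial) representation, and compresses the resulting SPD matrix with a column-orthonormal projection in the spirit of the Stiefel-manifold constraint named in the keywords. The names d4_orbit, covariance_pool, and InvariantCovPool, as well as the choice of backbone, are illustrative assumptions and not the authors' released implementation; in particular, a plain nn.Parameter initialized by QR is not kept orthonormal during training unless a manifold-aware optimizer is used.

    # Minimal sketch, assuming a PyTorch backbone; not the authors' code.
    import torch
    import torch.nn as nn

    def d4_orbit(x):
        """Eight images of x (B, C, H, W) under the dihedral group D4:
        the four 90-degree rotations and their horizontal flips."""
        rots = [torch.rot90(x, k, dims=(2, 3)) for k in range(4)]
        return rots + [torch.flip(r, dims=(3,)) for r in rots]

    def covariance_pool(feat, eps=1e-5):
        """Second-order (covariance) pooling of a conv feature map (B, C, H, W)."""
        b, c, h, w = feat.shape
        f = feat.reshape(b, c, h * w)
        f = f - f.mean(dim=2, keepdim=True)
        cov = f @ f.transpose(1, 2) / (h * w - 1)
        # A small ridge keeps the pooled matrix symmetric positive definite (SPD).
        return cov + eps * torch.eye(c, device=feat.device)

    class InvariantCovPool(nn.Module):
        """Hypothetical Siamese-style invariant compressible covariance pooling."""

        def __init__(self, backbone, in_channels, out_channels):
            super().__init__()
            self.backbone = backbone  # same weights applied to every group copy
            # Column-orthonormal projection compressing the C x C covariance to
            # k x k; the paper constrains such matrices to the Stiefel manifold.
            q, _ = torch.linalg.qr(torch.randn(in_channels, out_channels))
            self.proj = nn.Parameter(q)

        def forward(self, x):
            covs = [covariance_pool(self.backbone(xg)) for xg in d4_orbit(x)]
            inv = torch.stack(covs, dim=0).mean(dim=0)  # average over the D4 orbit
            small = self.proj.T @ inv @ self.proj       # compressed SPD matrix
            return small.flatten(1)                     # input to a linear classifier

For example, with a hypothetical 64-channel backbone, InvariantCovPool(backbone, in_channels=64, out_channels=8) maps each image to an 8 x 8 pooled matrix (64 values) rather than the full 64 x 64 covariance (4096 values), which conveys the spirit of the roughly 98% dimension reduction reported in the abstract.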
