Journal
2019 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2019)
Volume -, Issue -, Pages 4963-4966Publisher
IEEE
DOI: 10.1109/igarss.2019.8897927
Keywords
single-view semantic 3D challenge; pyramid on pyramid; semantic segmentation; height estimation
Categories
Ask authors/readers for more resources
The single-view semantic 3D challenge in 2019 Data Fusion Contest is to predict both semantic labels and normalized digital surface model (nDSM) for urban scenes from single-view satellite images. We propose a novel pyramid on pyramid network (Pop-Net) based on Encoder-Dual Decoder framework to end-to-end multi-task learning. The encoder is a deformable ResNet-101 backbone network. Two feature pyramid networks, as decoders, are responsible for semantic segmentation and height estimation, respectively. Semantic information is crucial to estimate height. Therefore, regression pyramid on the semantic pyramid is introduced to leverage semantic features to help height estimation. To deal with outliers in heights, we leverage anchor-based regression and smooth L1 loss for optimization to obtain more robust height estimation. Without bells and whistles, our single model entry achieves 77.78% mIoU and 53.40% mIoU-3 on test set, ranking 2nd in the Single-view Semantic 3D Challenge of the 2019 IEEE GRSS Data Fusion Contest. The code is available at https://github.com/Z-Zheng/PopNet.
Authors
I am an author on this paper
Click your name to claim this paper and add it to your profile.
Reviews
Recommended
No Data Available