Journal
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING
Volume 60, Issue -, Pages -
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TGRS.2021.3072488
Keywords
Convolutional neural networks (CNNs); feature pyramid (FP) networks; Laplacian FP; object detection; very high resolution optical remote sensing (VHR-ORS) images
Funding
- Key Scientific Technological Innovation Research Project by Ministry of Education
- Foundation for Innovative Research Groups of the National Natural Science Foundation of China [61621005]
- National Natural Science Foundation of China [U1701267, 61906093, 61573267, 61906150]
- Fund for Foreign Scholars in University Research and Teaching Program's 111 Project [B07048]
- Major Research Plan of the National Natural Science Foundation of China [91438201, 91438103]
- Fundamental Research Funds for the Central Universities [30919011279, JBF201905]
- CAAI-Huawei MindSpore Open Fund
The paper introduces a novel Laplacian Feature Pyramid Network (LFPN) that combines low-frequency and high-frequency features to enhance object detection performance in VHR-ORS images. High-frequency features, crucial for distinguishing ground objects, have not been adequately addressed in previous studies.
Besides multiscale features, high-frequency features are also crucial for identifying many objects in very high resolution optical remote sensing (VHR-ORS) images, but they have not yet been considered in object detection. Because the Laplacian pyramid contains high-frequency information at each level, we propose a Laplacian feature pyramid (FP) network (LFPN) that considers both low-frequency and high-frequency features within the FP structure to improve object detection performance on VHR-ORS images. FP-based structures represent multiscale features efficiently. However, general FP-based structures do not specifically consider high-frequency features. Such features are important for distinguishing many ground objects with sufficient detail. For example, texture features are critical for distinguishing basketball_court from tennis_court. The construction of LFPN consists of a bottom-up pathway, a Laplacian pathway, and a fusion pathway, which generate the low-frequency pyramid, the high-frequency pyramid, and the compound pyramid, respectively. The bottom-up pathway follows the computation flow of the backbone convolutional neural network (CNN), as in general FP-based structures. The Laplacian pathway extracts the high-frequency features of objects through a trainable Laplacian operator. Finally, the low-frequency and high-frequency FPs are fused efficiently to generate the compound pyramid. To evaluate LFPN, we embed it into both two-stage object detection (T-LFPN) systems and single-stage object detection (S-LFPN) systems. Experiments on the challenging public ten-class data set NWPU VHR-10 demonstrate the superior performance of LFPN in both T-LFPN and S-LFPN systems and the state-of-the-art performance of LFPN-based detectors.
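The abstract's premise, that each level of a Laplacian pyramid holds high-frequency information, can be illustrated with the classical (non-trainable) construction. The sketch below is not the paper's LFPN: it assumes 2x2 average pooling and nearest-neighbour upsampling in place of the usual smoothing filters, works on a plain 2D grid rather than CNN feature maps, and all function names (`downsample`, `upsample`, `build_pyramids`, `reconstruct`) are hypothetical. Each high-frequency level is the residual between a low-frequency level and the upsampled next-coarser level, and adding the residuals back is used here as a very simple stand-in for fusion.

```python
# Classical Laplacian pyramid on a 2D grid (pure Python, no dependencies).
# Assumption: side lengths are divisible by 2**levels. Average pooling and
# nearest-neighbour upsampling are illustrative choices, not the paper's
# trainable Laplacian operator.

def downsample(grid):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    h, w = len(grid), len(grid[0])
    return [
        [(grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
         for c in range(0, w, 2)]
        for r in range(0, h, 2)
    ]

def upsample(grid):
    """Double each dimension by repeating every value in a 2x2 block."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def subtract(a, b):
    """Element-wise a - b for two grids of the same shape."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def add(a, b):
    """Element-wise a + b for two grids of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def build_pyramids(image, levels):
    """Return (low, high) where high[i] = low[i] - upsample(low[i + 1])."""
    low = [image]
    for _ in range(levels):
        low.append(downsample(low[-1]))
    high = [subtract(low[i], upsample(low[i + 1])) for i in range(levels)]
    return low, high

def reconstruct(coarsest, high):
    """Add the high-frequency residuals back, from coarsest to finest."""
    current = coarsest
    for detail in reversed(high):
        current = add(upsample(current), detail)
    return current

# A 4x4 checkerboard is pure texture: its coarse levels are flat,
# so the pattern survives only in the finest high-frequency level.
image = [[(r + c) % 2 for c in range(4)] for r in range(4)]
low, high = build_pyramids(image, 2)
restored = reconstruct(low[-1], high)
```

In this toy case every coarse level is a uniform 0.5, so a detector that saw only the low-frequency pyramid could not tell the checkerboard from a flat grey patch. That is the kind of texture cue the abstract argues for in the basketball_court versus tennis_court example.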