Journal
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
Volume 44, Issue 2, Pages 579-590
Publisher
IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2019.2933510
Keywords
Visualization; Training; Detectors; Streaming media; Measurement; Feature extraction; Convolutional neural networks; Part localization network; part classification network; duplex focal loss; fine-grained visual categorization
Funding
- National Key Research and Development Plan of China [2018YFB1402600]
- National Science Foundation of China (NSFC) [61701415, 61772425, 61773315, 61790552]
- Young Star of Science and Technology in Shaanxi Province [2018KJXX-029]
- Fundamental Research Funds for the Central Universities [3102018zy023, 3102019AX09]
- Australian Research Council Future Fellowship [FT180100116]
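This paper proposes a new end-to-end fine-grained visual categorization system called P-CNN, which consists of three modules: a Squeeze-and-Excitation (SE) block for recalibrating channel-wise feature responses, a Part Localization Network (PLN) for locating distinctive object parts, and a Part Classification Network (PCN) for part-level and joint classification. The paper also introduces a Duplex Focal Loss for metric learning and part classification, which focuses training on hard examples. Experiments on three benchmark datasets demonstrate the effectiveness of this approach.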
This paper proposes an end-to-end fine-grained visual categorization system, termed Part-based Convolutional Neural Network (P-CNN), which consists of three modules. The first module is a Squeeze-and-Excitation (SE) block, which learns to recalibrate channel-wise feature responses by emphasizing informative channels and suppressing less useful ones. The second module is a Part Localization Network (PLN) that locates distinctive object parts by learning a bank of convolutional filters as discriminative part detectors. A group of informative parts can thus be discovered by convolving the feature maps with each part detector. The third module is a Part Classification Network (PCN) that has two streams. The first stream classifies each individual object part into image-level categories. The second stream concatenates the part features and the global feature into a joint feature for the final classification. In order to learn powerful part features and boost the joint feature capability, we propose a Duplex Focal Loss for metric learning and part classification, which focuses training on hard examples. We further merge the PLN and the PCN into a unified network that is trained end to end via a simple training technique. Comprehensive experiments and comparisons with state-of-the-art methods on three benchmark datasets demonstrate the effectiveness of our proposed method.
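To make the channel recalibration performed by the first module concrete, the following is a minimal sketch of a generic Squeeze-and-Excitation step in plain Python. It is not the authors' implementation: the function name `se_recalibrate`, the bias-free weight matrices `w1` and `w2`, and the toy input are all assumptions chosen for illustration, and a real network would learn these weights and operate on tensors rather than nested lists.

```python
import math


def se_recalibrate(feature_maps, w1, w2):
    """Generic Squeeze-and-Excitation over feature maps shaped [C][H][W].

    w1: [C_reduced][C] bottleneck weights (hypothetical, no bias).
    w2: [C][C_reduced] expansion weights (hypothetical, no bias).
    Returns the rescaled feature maps and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expansion layer followed by a sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every activation in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_maps, gates)
    ]
    return scaled, gates


# Toy input (assumed values): 2 channels of 2x2 activations, 1 hidden unit.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
w1 = [[0.5, 0.5]]       # squeezed [1.0, 2.0] -> hidden [1.5]
w2 = [[2.0], [-2.0]]    # gate logits 3.0 and -3.0
scaled, gates = se_recalibrate(features, w1, w2)
```

With these assumed weights the first channel receives a gate above 0.5 and the second a gate below 0.5, so the first channel is emphasized and the second is suppressed, which is the behavior the abstract attributes to the SE block.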