Article

High-Level Semantic Networks for Multi-Scale Object Detection

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2020)

Journal

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY

Volume 30, Issue 10, Pages 3372-3386

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TCSVT.2019.2950526

Keywords

Semantics; Feature extraction; Object detection; Proposals; Face detection; Convolution; Face; multi-branch network; high-level semantic features; receptive field

Category

Engineering, Electrical & Electronic

Funding

Science and Technology Innovation 2030-Major Project of Artificial Intelligence of the Ministry of Science and Technology of China [2018AAA01028]
National Natural Science Foundation of China [61632018, 61906131, 61936014, 61871470]
Post-Doctoral Program for Innovative Talents [BX20180214]
China Post-Doctoral Science Foundation [2018M641647]

Abstract

To address the scale-variance problem, deep multi-scale methods usually detect objects of different scales from different in-network layers. However, the semantic levels of features from different layers are usually inconsistent. In this paper, we propose a multi-branch, high-level semantic network built by gradually splitting a base network into multiple branches. As a result, the branches have the same depth, and their output features have similarly high-level semantics. Because their receptive fields differ, the branches are suited to detecting objects of different scales. Meanwhile, the multi-branch network introduces no additional parameters, because the branches share convolutional weights. To further improve detection performance, skip-layer connections add context to the branch with the relatively small receptive field, and dilated convolution is incorporated to increase the resolution of the output feature maps. When the proposed networks are embedded into the Faster RCNN architecture, weighted scores of the proposal generation network and the proposal classification network are further proposed. Experiments on three pedestrian datasets (the KITTI, Caltech, and Citypersons datasets), one face dataset (the WIDER FACE dataset), and two general object datasets (the COCO benchmark and the PASCAL VOC dataset) demonstrate the effectiveness and generality of the proposed method. On these datasets, our method achieves state-of-the-art performance.
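
The abstract's claims about depth, receptive fields, shared weights, and dilation can be checked with simple arithmetic. The sketch below is illustrative only and is not the paper's architecture: the `Conv` class, the three-layer branches, and the 64-channel width are hypothetical choices. It uses the standard receptive-field recurrence for stacked convolutions to show that branches of equal depth and identical kernel shapes (so their weights could be shared) can still see different input extents, and that swapping a stride for dilation keeps the receptive field while preserving output resolution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conv:
    """A square convolution layer described only by its geometry (hypothetical helper)."""
    kernel: int
    stride: int = 1
    dilation: int = 1


def receptive_field(layers):
    """Return (receptive field in input pixels, total stride) for a conv stack."""
    rf, jump = 1, 1  # jump = spacing of adjacent output positions in input pixels
    for layer in layers:
        rf += (layer.kernel - 1) * layer.dilation * jump
        jump *= layer.stride
    return rf, jump


def weight_count(layers, channels):
    """Weights of a stack with a fixed channel width; stride and dilation add none."""
    return sum(layer.kernel ** 2 * channels ** 2 for layer in layers)


# Three hypothetical branches with the same depth and the same 3x3 kernels.
fine_branch = [Conv(3), Conv(3), Conv(3)]
coarse_branch = [Conv(3, stride=2), Conv(3), Conv(3)]
# Same receptive field as coarse_branch, but dilation replaces the stride.
coarse_dilated = [Conv(3), Conv(3, dilation=2), Conv(3, dilation=2)]

for name, branch in [("fine", fine_branch),
                     ("coarse", coarse_branch),
                     ("coarse_dilated", coarse_dilated)]:
    rf, stride = receptive_field(branch)
    print(f"{name}: receptive field={rf}, total stride={stride}, "
          f"weights={weight_count(branch, channels=64)}")
```

Under these assumptions all three branches have the same weight count, the coarse branch sees a larger input region than the fine one, and the dilated variant matches the coarse branch's receptive field at a total stride of 1, which is the sense in which dilation keeps the output feature map at a higher resolution.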
