Article

Semantic convolutional features for face detection

Journal

MACHINE VISION AND APPLICATIONS
Volume 33, Issue 1, Pages -

Publisher

SPRINGER
DOI: 10.1007/s00138-021-01245-y

Keywords

Face detection; Feature enhancement; Feature pyramid network

Funding

  1. Vietnam National Foundation for Science and Technology Development (NAFOSTED) [102.05-2020.02]


This paper proposes a novel feature pyramid design that generates semantic features at all levels of a convolutional neural network for face detection. By merging features from different layers and incorporating recently introduced objective functions, the proposed model achieves promising results in both processing time and detection accuracy.
Convolutional neural networks play a key role in many computer vision applications. Traditionally, convolutional features are learned hierarchically along the depth of the network to create multi-scale feature maps. As a result, strong semantic features are derived only at the top-level layers. This paper proposes a novel feature pyramid design that produces semantic features at all levels of the network, specifically for the problem of face detection. In particular, a Semantic Convolutional Box (SCBox) is presented that merges the features from different layers in a bottom-up fashion. The proposed lightweight detector stacks alternating SCBox and Inception residual modules to learn visual features along both the depth and the width of the network. In addition, recently introduced objective functions (e.g., focal and CIoU losses) are incorporated to address the problem of unbalanced data, resulting in stable training. The proposed model has been validated on the standard benchmarks FDDB and WIDER FACE against state-of-the-art methods. Experiments showed promising results in terms of both processing time and detection accuracy. For instance, the proposed network achieves an average precision of 96.8% on FDDB and 82.4% on WIDER FACE, with an inference speed of 106 FPS on a moderate GPU configuration or 20 FPS on a CPU machine.
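
The abstract names focal and CIoU losses but does not give the formulation or hyperparameters the authors used. As a rough illustration only, the sketch below follows the commonly published definitions of both losses for a single prediction. The function names, the (x1, y1, x2, y2) box layout, and the defaults alpha=0.25 and gamma=2.0 are assumptions made for this example, not details taken from the paper.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss for one predicted probability p and a label y in {0, 1}.

    alpha and gamma are common defaults, assumed here for illustration.
    """
    p = min(max(p, eps), 1.0 - eps)            # clip to avoid log(0)
    p_t = p if y == 1 else 1.0 - p             # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The factor (1 - p_t)^gamma shrinks the loss of well-classified samples,
    # so abundant easy negatives contribute less to the total.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def ciou_loss(box, gt, eps=1e-7):
    """CIoU loss between two boxes given as (x1, y1, x2, y2) (assumed layout)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centers.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))) ** 2
    weight = v / (1.0 - iou + v + eps)

    return 1.0 - iou + rho2 / c2 + weight * v


if __name__ == "__main__":
    print(focal_loss(0.9, 1), focal_loss(0.1, 1))   # easy vs. hard positive
    print(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10)))
    print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))
```

In this sketch the focal term down-weights confidently correct predictions, which is the usual motivation for using it with unbalanced data, while the CIoU term still penalizes non-overlapping boxes through the center-distance component.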
