Article

Feature flow: In-network feature flow estimation for video object detection

Journal

PATTERN RECOGNITION
Volume 122, Issue -, Pages -

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2021.108323

Keywords

Video object detection; Feature flow; Object detection; Video analysis; Deep convolutional neural network (DCNN)

Abstract

This paper introduces a novel network (IFF-Net) with an In-network Feature Flow estimation module (IFF module) for video object detection, which directly produces feature-level motion information without pre-training on additional datasets. The network detects objects efficiently and accurately, and a self-supervised transformation residual loss (TRL) further improves its performance.
Optical flow, which expresses pixel displacement, is widely used in many computer vision tasks to provide pixel-level motion information. However, with the remarkable progress of convolutional neural networks, recent state-of-the-art approaches solve problems directly at the feature level. Since the displacement of a feature vector is not consistent with the pixel displacement, a common approach is to feed optical flow to a neural network and fine-tune this network on the task dataset, expecting the fine-tuned network to produce tensors that encode feature-level motion information. In this paper, we rethink this de facto paradigm and analyze its drawbacks in the video object detection task. To mitigate these issues, we propose a novel network (IFF-Net) with an In-network Feature Flow estimation module (IFF module) for video object detection. Without resorting to pre-training on any additional dataset, our IFF module directly produces feature flow, which indicates the feature displacement. The IFF module is a shallow module that shares features with the detection branches. This compact design enables our IFF-Net to detect objects accurately while maintaining a fast inference speed. Furthermore, we propose a transformation residual loss (TRL) based on self-supervision, which further improves the performance of IFF-Net. IFF-Net outperforms existing methods and achieves new state-of-the-art performance on ImageNet VID. (c) 2021 Elsevier Ltd. All rights reserved.
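To make the abstract's mechanism concrete, below is a minimal PyTorch sketch of the two ideas it names: a shallow head that predicts a feature-level flow field and warps reference-frame features toward the current frame, and a self-supervised residual loss on the warped features. This is not the authors' implementation; the module name FeatureFlowWarp, the two-layer convolutional head, and the L1 form of transformation_residual_loss are illustrative assumptions, since the record here does not publish those details.

```python
# Hypothetical sketch of in-network feature-flow estimation and warping.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureFlowWarp(nn.Module):
    """Predicts a feature-level flow field from a pair of feature maps
    and warps the reference features toward the current frame."""
    def __init__(self, channels):
        super().__init__()
        # Shallow estimator: the abstract describes the IFF module as a
        # shallow module sharing features with the detection branches;
        # this exact architecture is an assumption.
        self.flow_head = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 2, 3, padding=1),  # 2 channels: (dx, dy)
        )

    def forward(self, feat_ref, feat_cur):
        # Estimate feature displacement from the concatenated pair.
        flow = self.flow_head(torch.cat([feat_ref, feat_cur], dim=1))
        return self.warp(feat_ref, flow), flow

    @staticmethod
    def warp(feat, flow):
        n, _, h, w = feat.shape
        # Base sampling grid in feature-map coordinates.
        ys, xs = torch.meshgrid(
            torch.arange(h, device=feat.device, dtype=feat.dtype),
            torch.arange(w, device=feat.device, dtype=feat.dtype),
            indexing="ij",
        )
        grid = torch.stack((xs, ys), dim=0).unsqueeze(0) + flow  # displaced grid
        # Normalize to [-1, 1] as required by grid_sample.
        grid_x = 2.0 * grid[:, 0] / max(w - 1, 1) - 1.0
        grid_y = 2.0 * grid[:, 1] / max(h - 1, 1) - 1.0
        grid = torch.stack((grid_x, grid_y), dim=3)  # (N, H, W, 2), (x, y) order
        return F.grid_sample(feat, grid, align_corners=True)

def transformation_residual_loss(warped_ref, feat_cur):
    # One plausible self-supervised reading of the TRL: penalize the
    # residual between warped reference features and current features.
    return F.l1_loss(warped_ref, feat_cur)
```

As a usage sketch, with feat_ref and feat_cur taken from a shared backbone on two frames, warped, flow = module(feat_ref, feat_cur) yields aligned reference features that could be fused into the detection branch, and the residual loss could be added to the detection loss during training; no additional optical-flow dataset is needed, matching the in-network motivation of the abstract.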
