4.6 Article

A part-based spatial and temporal aggregation method for dynamic scene recognition

Journal

NEURAL COMPUTING & APPLICATIONS
Volume 33, Issue 13, Pages 7353-7370

Publisher

SPRINGER LONDON LTD
DOI: 10.1007/s00521-020-05415-3

Keywords

Dynamic scene recognition; Feature aggregation; Deep neural networks; Part-based models

Funding

  1. Australian Research Council (ARC)
  2. University of Wollongong

Abstract

Existing methods for dynamic scene recognition mostly use global features extracted from the entire video frame or a video segment. In this paper, a part-based method is proposed to aggregate local features from video frames. A pre-trained Fast R-CNN model is used to extract local convolutional features from the regions of interest of training images. These features are clustered to locate representative parts. A set cover problem is then formulated to select the discriminative parts, which are further refined by fine-tuning the Fast R-CNN model. Local features from a video segment are extracted at different layers of the fine-tuned Fast R-CNN model and aggregated both spatially and temporally. Extensive experimental results show that the proposed method is very competitive with state-of-the-art approaches.
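As a rough illustration of the pipeline the abstract describes, the sketch below covers only the part-discovery step (clustering of ROI features followed by a greedy set-cover selection) and a simple spatial-then-temporal pooling of segment features. It is a minimal sketch under assumptions: the use of k-means, max/mean pooling, and all function names, dimensions and parameters are illustrative and are not taken from the paper; feature extraction with Fast R-CNN and the fine-tuning step are abstracted away.

```python
# Hypothetical sketch of part discovery and aggregation; not the paper's code.
import numpy as np
from sklearn.cluster import KMeans


def discover_parts(roi_features, image_ids, n_clusters=50):
    """Cluster ROI features into candidate parts and record image coverage."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = kmeans.fit_predict(roi_features)
    centres = kmeans.cluster_centers_

    # A candidate part "covers" the training images that contain at least
    # one ROI assigned to its cluster.
    image_ids = np.asarray(image_ids)
    coverage = [set(image_ids[labels == c].tolist()) for c in range(n_clusters)]
    return centres, coverage


def greedy_set_cover(coverage, universe):
    """Greedily select parts until every training image is covered."""
    uncovered = set(universe)
    selected = []
    while uncovered:
        # Pick the part that covers the most still-uncovered images.
        best = max(range(len(coverage)), key=lambda i: len(coverage[i] & uncovered))
        gain = coverage[best] & uncovered
        if not gain:                      # remaining parts add nothing new
            break
        selected.append(best)
        uncovered -= gain
    return selected


def aggregate_segment(frame_features):
    """Pool local features over space (max), then over frames (mean)."""
    # frame_features: array of shape (T, H, W, D) for one video segment.
    spatial = frame_features.max(axis=(1, 2))
    return spatial.mean(axis=0)


# Toy run with random stand-in features; in the paper these would be
# convolutional ROI features from the (fine-tuned) Fast R-CNN model.
rng = np.random.default_rng(0)
feats = rng.normal(size=(500, 256))            # 500 ROIs, 256-D features
img_ids = rng.integers(0, 100, size=500)       # which image each ROI came from
_, coverage = discover_parts(feats, img_ids, n_clusters=20)
parts = greedy_set_cover(coverage, universe=range(100))
pooled = aggregate_segment(rng.normal(size=(16, 7, 7, 256)))
```

The greedy rule (always take the part that covers the most still-uncovered images) is the standard approximation for set cover; whether the paper uses this heuristic or an exact formulation is not stated in the abstract.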
