4.5 Article

Frame-by-frame annotation of video recordings using deep neural networks

Journal

ECOSPHERE
Volume 12, Issue 3, Pages -

Publisher

WILEY
DOI: 10.1002/ecs2.3384

Keywords

animal-borne video; automated detection; deep learning; image classification; neural networks; video classification

Funding

  1. Homebrew Films
  2. Percy Fitzpatrick Institute of African Ornithology
  3. Scottish Government through the Marine Mammal Scientific Support Research Programme
  4. National Research Foundation of South Africa [90782, 105782]

The study demonstrates an improvement in video classification by combining CNN and RNN, using two different datasets for illustration. It is recommended to include temporal information whenever manual inspection suggests that movement is predictive of class membership.
Video data are widely collected in ecological studies, but manual annotation is challenging and time-consuming and has become a bottleneck for scientific research. Classification models based on convolutional neural networks (CNNs) have proved successful in annotating images, but few applications have extended these to video classification. We demonstrate an approach that combines a standard CNN summarizing each video frame with a recurrent neural network (RNN) that models the temporal component of video. The approach is illustrated using two datasets: one collected by static video cameras detecting seal activity inside coastal salmon nets, and another collected by animal-borne cameras deployed on African penguins and used to classify behavior. For penguins, the combined RNN-CNN raised test set classification accuracy over an image-only model from 80% to 85%, a 25% relative reduction in classification error, and substantially improved classification precision or recall for four of six behavior classes (12-17%). Image-only and video models classified seal activity with very similar accuracy (88% and 89%), and no seal visits were missed entirely by either model. Temporal patterns related to movement provide valuable information about animal behavior, and classifiers benefit from including these explicitly. We recommend the inclusion of temporal information whenever manual inspection suggests that movement is predictive of class membership.
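
The idea of feeding per-frame CNN summaries into an RNN can be illustrated with a toy sketch. This is not the authors' implementation: the frame feature vectors are random stand-ins for CNN output, the weights are untrained, the recurrent layer is a plain Elman RNN chosen for brevity, and all names and sizes (`classify_clip`, 8 frames, 16 features, 6 classes) are assumptions made for illustration only.

```python
import math
import random


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.1):
    """Hypothetical stand-in for trained weights: small random values."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def classify_clip(frame_features, w_in, w_rec, w_out):
    """Run a simple (Elman) RNN over per-frame feature vectors.

    frame_features holds one vector per frame, assumed to come from a CNN.
    Returns class probabilities for the whole clip.
    """
    hidden = [0.0] * len(w_rec)
    for features in frame_features:
        from_input = matvec(w_in, features)
        from_past = matvec(w_rec, hidden)
        # The hidden state carries information from earlier frames forward.
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_past)]
    return softmax(matvec(w_out, hidden))


rng = random.Random(0)
n_frames, n_features, n_hidden, n_classes = 8, 16, 4, 6  # illustrative sizes

# Stand-in for CNN output: in practice each vector would summarize one frame.
clip = [[rng.random() for _ in range(n_features)] for _ in range(n_frames)]

w_in = random_matrix(n_hidden, n_features, rng)
w_rec = random_matrix(n_hidden, n_hidden, rng)
w_out = random_matrix(n_classes, n_hidden, rng)

probs = classify_clip(clip, w_in, w_rec, w_out)
predicted_class = max(range(n_classes), key=lambda i: probs[i])
```

Because the hidden state is updated frame by frame, the same frames presented in a different order generally yield different class probabilities, which is the property an image-only classifier lacks.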
