Proceedings Paper

MEVA: A Large-Scale Multiview, Multimodal Video Dataset for Activity Detection

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/WACV48630.2021.00110

Funding

  1. Intelligence Advanced Research Projects Activity (IARPA) [2017-16110300001]


The MEVA dataset is a large-scale dataset for human activity recognition, consisting of over 9300 hours of untrimmed, continuous video containing diverse scripted activities alongside spontaneous background activity. The dataset is annotated with 37 activity types, including bounding boxes for actors and props, and spans multiple modalities: RGB and thermal IR cameras, UAV footage, and GPS locations for the actors. Collected under IRB oversight and approval, the CC-BY-4.0 licensed release also includes additional ground camera data, UAV data, and GPS logs.
We present the Multiview Extended Video with Activities (MEVA) dataset [6], a new and very large-scale dataset for human activity recognition. Existing security datasets either focus on activity counts by aggregating public video disseminated due to its content, which typically excludes same-scene background video, or they achieve persistence by observing public areas and thus cannot control for activity content. Our dataset is over 9300 hours of untrimmed, continuous video, scripted to include diverse, simultaneous activities, along with spontaneous background activity. We have annotated 144 hours for 37 activity types, marking bounding boxes of actors and props. Our collection observed approximately 100 actors performing scripted scenarios and spontaneous background activity over a three-week period at an access-controlled venue, collecting in multiple modalities with overlapping and non-overlapping indoor and outdoor viewpoints. The resulting data includes video from 38 RGB and thermal IR cameras, 42 hours of UAV footage, as well as GPS locations for the actors. 122 hours of annotation are sequestered in support of the NIST Activity in Extended Video (ActEV) challenge; the other 22 hours of annotation and the corresponding video are available on our website, along with an additional 306 hours of ground camera data, 4.6 hours of UAV data, and 9.6 hours of GPS logs. Additional derived data includes camera models geo-registering the outdoor cameras and a dense 3D point cloud model of the outdoor scene. The data was collected with IRB oversight and approval and released under a CC-BY-4.0 license.
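To make the annotation figures above concrete, the sketch below tallies annotated duration per activity type from bounding-box-style records. The record layout, field names, and frame rate here are illustrative assumptions for a MEVA-style annotation stream, not the dataset's actual schema (the released annotations are distributed as YAML files; plain dicts are used here to keep the example self-contained).

```python
# Hypothetical sketch: summing annotated seconds per activity type for a
# MEVA-style dataset. Field names ("activity", "start_frame", "end_frame")
# and the 30 fps frame rate are assumptions for illustration only.
from collections import defaultdict

FPS = 30  # assumed frame rate used to convert frame spans to seconds

# Each record covers one activity instance as a frame span in one video.
annotations = [
    {"activity": "person_opens_vehicle_door", "start_frame": 300, "end_frame": 450},
    {"activity": "person_enters_vehicle", "start_frame": 450, "end_frame": 600},
    {"activity": "person_opens_vehicle_door", "start_frame": 900, "end_frame": 1020},
]

def seconds_per_activity(records, fps=FPS):
    """Sum annotated duration in seconds for each activity type."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["activity"]] += (rec["end_frame"] - rec["start_frame"]) / fps
    return dict(totals)

print(seconds_per_activity(annotations))
# -> {'person_opens_vehicle_door': 9.0, 'person_enters_vehicle': 5.0}
```

Aggregations like this are how per-class totals (e.g. the 144 annotated hours across 37 activity types) would typically be computed from instance-level annotations.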

Authors

