Proceedings Paper

MEVA: A Large-Scale Multiview, Multimodal Video Dataset for Activity Detection

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/WACV48630.2021.00110

Funding

  1. Intelligence Advanced Research Projects Activity (IARPA) [2017-16110300001]


The MEVA dataset is a large-scale dataset for human activity recognition, consisting of over 9300 hours of untrimmed, continuous video containing diverse scripted activities alongside spontaneous background activity. The dataset is annotated with 37 activity types, including bounding boxes for actors and props, and spans multiple modalities: RGB and thermal IR cameras, UAV footage, and GPS locations for the actors. Collected under IRB oversight and approval, the CC-BY-4.0 licensed release also includes additional ground camera data, UAV data, and GPS logs.
We present the Multiview Extended Video with Activities (MEVA) dataset [6], a new and very large-scale dataset for human activity recognition. Existing security datasets either focus on activity counts by aggregating public video disseminated due to its content, which typically excludes same-scene background video, or they achieve persistence by observing public areas and thus cannot control for activity content. Our dataset is over 9300 hours of untrimmed, continuous video, scripted to include diverse, simultaneous activities, along with spontaneous background activity. We have annotated 144 hours for 37 activity types, marking bounding boxes of actors and props. Our collection observed approximately 100 actors performing scripted scenarios and spontaneous background activity over a three-week period at an access-controlled venue, collecting in multiple modalities with overlapping and non-overlapping indoor and outdoor viewpoints. The resulting data includes video from 38 RGB and thermal IR cameras, 42 hours of UAV footage, as well as GPS locations for the actors. 122 hours of annotation are sequestered in support of the NIST Activity in Extended Video (ActEV) challenge; the other 22 hours of annotation and the corresponding video are available on our website, along with an additional 306 hours of ground camera data, 4.6 hours of UAV data, and 9.6 hours of GPS logs. Additional derived data includes camera models geo-registering the outdoor cameras and a dense 3D point cloud model of the outdoor scene. The data was collected with IRB oversight and approval and released under a CC-BY-4.0 license.
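To make the annotation figures above concrete, the sketch below tallies annotated duration per activity type from bounding-box-style records. The record layout, field names, and frame rate here are illustrative assumptions for a MEVA-style annotation stream, not the dataset's actual schema (the released annotations are distributed as YAML files; plain dicts are used here to keep the example self-contained).

```python
# Hypothetical sketch: summing annotated seconds per activity type for a
# MEVA-style dataset. Field names ("activity", "start_frame", "end_frame")
# and the 30 fps frame rate are assumptions for illustration only.
from collections import defaultdict

FPS = 30  # assumed frame rate used to convert frame spans to seconds

# Each record covers one activity instance as a frame span in one video.
annotations = [
    {"activity": "person_opens_vehicle_door", "start_frame": 300, "end_frame": 450},
    {"activity": "person_enters_vehicle", "start_frame": 450, "end_frame": 600},
    {"activity": "person_opens_vehicle_door", "start_frame": 900, "end_frame": 1020},
]

def seconds_per_activity(records, fps=FPS):
    """Sum annotated duration in seconds for each activity type."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["activity"]] += (rec["end_frame"] - rec["start_frame"]) / fps
    return dict(totals)

print(seconds_per_activity(annotations))
# -> {'person_opens_vehicle_door': 9.0, 'person_enters_vehicle': 5.0}
```

Aggregations like this are how per-class totals (e.g. the 144 annotated hours across 37 activity types) would typically be computed from instance-level annotations.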

Authors

