Article

NOVAction23: Addressing the data diversity gap by uniquely generated synthetic sequences for real-world human action recognition

Journal

COMPUTERS & GRAPHICS-UK
Volume 118, Pages 1-10

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.cag.2023.10.011

Keywords

Human action recognition; Data diversity gap; Synthetic data; Procedural generation

Abstract

Using the NOVAction engine, we created the NOVAction23 dataset of highly diversified, photorealistic synthetic human action sequences. Training on this dataset can improve the performance of human action recognition models.
Recognition of human actions using machine learning requires extensive datasets to develop robust models. Nevertheless, obtaining real-world data presents challenges due to the costly and time-consuming process involved. Additionally, existing datasets mostly contain indoor videos due to the challenges of capturing pose data outdoors. Synthetic data have been used to overcome these difficulties, yet the currently available synthetic datasets for human action recognition lack photorealism and diversity in their features. Addressing these shortcomings, we develop the NOVAction engine to generate highly diversified and photorealistic synthetic human action sequences. We use NOVAction to create the NOVAction23 dataset comprising 25,415 human action sequences with corresponding poses and labels (available at https://github.com/celikcancglab/NOVAction23). In NOVAction23, the performed motions and viewpoints are varied on the fly through procedural generation, to ensure that, for a given action class, each generated sequence features a distinct motion performed by one of the 1,105 synthetic humans captured from a unique viewpoint. Moreover, each synthetic human is unique in terms of body shape (height and weight), skin tone, gender, hair, facial hair, clothing, shoes and accessories. To further increase data diversity, the motion sequences are rendered under various weather conditions and at different times of day, across three outdoor and two indoor settings. We evaluate NOVAction23 by training three state-of-the-art recognizers on it, in addition to the NTU 120 dataset, and corroborating using real-world videos from YouTube. Our results confirm that the NOVAction23 dataset can improve the performance of state-of-the-art human action recognition.
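
The per-sequence variation described in the abstract can be illustrated with a minimal sketch. This is not the NOVAction engine's code; it only shows one way to sample sequence configurations so that, within an action class, every sequence gets a distinct motion seed and a distinct viewpoint. The count of 1,105 synthetic humans and the split into three outdoor and two indoor settings come from the abstract. Everything else (the `SequenceConfig` type, the `sample_configs` function, the setting names, the weather and time-of-day values, and the viewpoint grid) is a hypothetical placeholder.

```python
import random
from dataclasses import dataclass

NUM_HUMANS = 1105  # number of synthetic humans stated in the abstract
# Three outdoor and two indoor settings, per the abstract; the names are placeholders.
SETTINGS = ["outdoor_1", "outdoor_2", "outdoor_3", "indoor_1", "indoor_2"]
WEATHER = ["clear", "rain", "snow"]                       # hypothetical values
TIMES_OF_DAY = ["morning", "noon", "evening", "night"]    # hypothetical values


@dataclass(frozen=True)
class SequenceConfig:
    """Hypothetical description of one synthetic sequence to render."""
    action: str
    human_id: int
    motion_seed: int
    azimuth_deg: int
    elevation_deg: int
    setting: str
    weather: str
    time_of_day: str


def sample_configs(action: str, count: int, rng: random.Random) -> list[SequenceConfig]:
    """Sample `count` configurations for one action class.

    Viewpoints and motion seeds are drawn without replacement, so no two
    sequences of the same class share a viewpoint or a motion seed.
    """
    # Assumed coarse camera grid: 72 azimuths x 12 elevations = 864 viewpoints.
    viewpoints = [(az, el) for az in range(0, 360, 5) for el in range(0, 60, 5)]
    if count > len(viewpoints):
        raise ValueError("not enough distinct viewpoints for the requested count")

    chosen_views = rng.sample(viewpoints, count)
    motion_seeds = rng.sample(range(10**6), count)

    configs = []
    for (azimuth, elevation), seed in zip(chosen_views, motion_seeds):
        configs.append(
            SequenceConfig(
                action=action,
                human_id=rng.randrange(NUM_HUMANS),
                motion_seed=seed,
                azimuth_deg=azimuth,
                elevation_deg=elevation,
                setting=rng.choice(SETTINGS),
                weather=rng.choice(WEATHER),        # simplification: sampled for all settings
                time_of_day=rng.choice(TIMES_OF_DAY),
            )
        )
    return configs


if __name__ == "__main__":
    for cfg in sample_configs("walking", 3, random.Random(0)):
        print(cfg)
```

Sampling without replacement is the simplest way to guarantee uniqueness within a class; the actual engine varies motions and viewpoints on the fly during procedural generation, and its mechanism may differ.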
