Journal
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021)
Volume -, Issue -, Pages 15024-15034
Publisher
IEEE
DOI: 10.1109/ICCV48922.2021.01477
Keywords
-
Funding
- NSFC [62072382]
- Fundamental Research Funds for Central Universities, China [20720190003]
This research introduces a novel end-to-end framework for video face forgery detection that utilizes temporal coherence. By combining a fully temporal convolution network and a Temporal Transformer network, the framework is able to extract temporal features and explore long-term temporal coherence effectively.
Although current face manipulation techniques achieve impressive quality and controllability, they still struggle to generate temporally coherent face videos. In this work, we explore how to take full advantage of temporal coherence for video face forgery detection. To achieve this, we propose a novel end-to-end framework that consists of two major stages. The first stage is a fully temporal convolution network (FTCN). The key insight of FTCN is to reduce the spatial convolution kernel size to 1 while keeping the temporal convolution kernel size unchanged. Surprisingly, we find that this design helps the model extract temporal features and also improves its generalization capability. The second stage is a Temporal Transformer network, which aims to explore long-term temporal coherence. The proposed framework is general and flexible, and can be trained directly from scratch without any pre-trained models or external datasets. Extensive experiments show that our framework outperforms existing methods and remains effective when applied to new kinds of face forgery videos.
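The kernel-size idea behind FTCN can be illustrated with a small receptive-field calculation. This is a minimal sketch, not the authors' implementation: the four-layer 3x3x3 stack, the helper names `to_temporal_only` and `receptive_field`, and the stride-1, no-dilation setting are all assumptions made for illustration. It shows only that shrinking the spatial kernel to 1 leaves the temporal receptive field intact while removing spatial context from the convolutions.

```python
from typing import List, Tuple

# A 3D convolution kernel size as (time, height, width).
Kernel = Tuple[int, int, int]


def to_temporal_only(kernel: Kernel) -> Kernel:
    """Keep the temporal extent and shrink the spatial extent to 1x1."""
    t, _h, _w = kernel
    return (t, 1, 1)


def receptive_field(kernels: List[Kernel]) -> Kernel:
    """Receptive field of stacked convolutions, assuming stride 1 and no dilation."""
    rf = [1, 1, 1]
    for k in kernels:
        for axis in range(3):
            rf[axis] += k[axis] - 1
    return (rf[0], rf[1], rf[2])


# Hypothetical backbone: four 3x3x3 layers (not the layer list from the paper).
baseline = [(3, 3, 3)] * 4
ftcn_like = [to_temporal_only(k) for k in baseline]

print(receptive_field(baseline))   # (9, 9, 9)
print(receptive_field(ftcn_like))  # (9, 1, 1)
```

Under these assumptions both stacks see 9 frames, but the temporal-only stack never mixes neighbouring pixels, so whatever it learns has to come from how each location changes over time.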