Video Forgery Detection with Optical Flow Residuals and Spatial-Temporal Consistency
The rapid advancement of diffusion-based video generation models has led to increasingly realistic synthetic content, presenting new challenges for video forgery detection. Existing methods often struggle to capture fine-grained temporal inconsistencies, particularly in AI-generated videos with high visual fidelity and coherent motion. In this work, we propose a detection framework that leverages spatial-temporal consistency by combining RGB appearance features with optical flow residuals. The model adopts a dual-branch architecture, where one branch analyzes RGB frames to detect appearance-level artifacts, while the other processes flow residuals to reveal subtle motion anomalies caused by imperfect temporal synthesis. By integrating these complementary features, the proposed method effectively detects a wide range of forged videos. Extensive experiments on text-to-video and image-to-video tasks across ten diverse generative models demonstrate the robustness and strong generalization ability of the proposed approach.
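The dual-branch idea above can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-neighbour warping, the hand-crafted pooling stand-ins for the CNN branches, and all weights are simplifying assumptions, chosen only to show how a flow residual is formed and fused with appearance features.

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_residual(prev, curr, flow):
    """Warp prev by the estimated flow and subtract from curr.
    Integer flow with nearest-neighbour warping is a simplification;
    real pipelines use sub-pixel interpolation."""
    h, w = prev.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(ys - flow[..., 1].round().astype(int), 0, h - 1)
    src_x = np.clip(xs - flow[..., 0].round().astype(int), 0, w - 1)
    warped = prev[src_y, src_x]
    return curr - warped

def branch_features(x, weights):
    """Stand-in for a learned CNN branch: global pooling + ReLU projection."""
    pooled = np.array([x.mean(), x.std(), np.abs(x).max()])
    return np.maximum(weights @ pooled, 0.0)

# Toy grayscale frames: a bright square translating by 2 px horizontally.
prev = np.zeros((32, 32)); prev[8:16, 8:16] = 1.0
curr = np.zeros((32, 32)); curr[8:16, 10:18] = 1.0
flow = np.zeros((32, 32, 2)); flow[..., 0] = 2.0  # estimated dx = 2 everywhere

# Motion branch input: residual is ~0 when motion is temporally consistent,
# and lights up where synthesis introduces motion anomalies.
residual = flow_residual(prev, curr, flow)

# Appearance branch (frame) and motion branch (residual), then late fusion.
w_rgb, w_flow = rng.standard_normal((4, 3)), rng.standard_normal((4, 3))
fused = np.concatenate([branch_features(curr, w_rgb),
                        branch_features(residual, w_flow)])
score = 1.0 / (1.0 + np.exp(-(rng.standard_normal(8) @ fused)))  # forgery prob.
```

With consistent motion the residual vanishes, so the motion branch contributes nothing and the score rests on appearance cues alone; an imperfectly synthesized video would instead leave nonzero residual energy for the second branch to detect.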
Kunio Suzuki, Nabarun Goswami, Takuya Shintate, Xi Xue
Computing Technology, Computer Technology
Kunio Suzuki, Nabarun Goswami, Takuya Shintate, Xi Xue. Video Forgery Detection with Optical Flow Residuals and Spatial-Temporal Consistency [EB/OL]. (2025-08-01) [2025-08-11]. https://arxiv.org/abs/2508.00397.