Towards Explainable Fake Image Detection with Multi-Modal Large Language Models

Abstract

Progress in image generation raises significant public security concerns. We argue that fake image detection should not operate as a "black box". Instead, an ideal approach must ensure both strong generalization and transparency. Recent progress in Multi-modal Large Language Models (MLLMs) offers new opportunities for reasoning-based AI-generated image detection. In this work, we evaluate the capabilities of MLLMs in comparison to traditional detection methods and human evaluators, highlighting their strengths and limitations. Furthermore, we design six distinct prompts and propose a framework that integrates these prompts to develop a more robust, explainable, and reasoning-driven detection system. The code is available at https://github.com/Gennadiyev/mllm-defake.

Authors: Yikun Ji, Yan Hong, Jiahui Zhan, Haoxing Chen, Jun Lan, Huijia Zhu, Weiqiang Wang, Liqing Zhang, Jianfu Zhang

Subject classification: Computing technology, computer technology

Recommended citation: Yikun Ji, Yan Hong, Jiahui Zhan, Haoxing Chen, Jun Lan, Huijia Zhu, Weiqiang Wang, Liqing Zhang, Jianfu Zhang. Towards Explainable Fake Image Detection with Multi-Modal Large Language Models [EB/OL]. (2025-04-19) [2025-05-04]. https://arxiv.org/abs/2504.14245.
