WorldGenBench: A World-Knowledge-Integrated Benchmark for Reasoning-Driven Text-to-Image Generation
Recent advances in text-to-image (T2I) generation have achieved impressive results, yet existing models still struggle with prompts that require rich world knowledge and implicit reasoning, both of which are critical for producing semantically accurate, coherent, and contextually appropriate images in real-world scenarios. To address this gap, we introduce WorldGenBench, a benchmark designed to systematically evaluate T2I models' world knowledge grounding and implicit inferential capabilities, covering both the humanities and nature domains. We propose the Knowledge Checklist Score, a structured metric that measures how well generated images satisfy key semantic expectations. Experiments across 21 state-of-the-art models reveal that while diffusion models lead among open-source methods, proprietary autoregressive models such as GPT-4o exhibit significantly stronger reasoning and knowledge integration. Our findings highlight the need for deeper understanding and inference capabilities in next-generation T2I systems. Project Page: https://dwanzhang-ai.github.io/WorldGenBench/
Ruoshi Xu, Daoan Zhang, Che Jiang, Biaoxiang Chen, Zijian Jin, Yutian Lu, Jianguo Zhang, Liang Yong, Jiebo Luo, Shengda Luo
Computing technology, computer technology
Ruoshi Xu, Daoan Zhang, Che Jiang, Biaoxiang Chen, Zijian Jin, Yutian Lu, Jianguo Zhang, Liang Yong, Jiebo Luo, Shengda Luo. WorldGenBench: A World-Knowledge-Integrated Benchmark for Reasoning-Driven Text-to-Image Generation [EB/OL]. (2025-05-02) [2025-06-25]. https://arxiv.org/abs/2505.01490.