
MEBench: A Novel Benchmark for Understanding Mutual Exclusivity Bias in Vision-Language Models


Abstract

This paper introduces MEBench, a novel benchmark for evaluating mutual exclusivity (ME) bias, a cognitive phenomenon observed in children during word learning. Unlike traditional ME tasks, MEBench further incorporates spatial reasoning to create more challenging and realistic evaluation settings. We assess the performance of state-of-the-art vision-language models (VLMs) on this benchmark using novel evaluation metrics that capture key aspects of ME-based reasoning. To facilitate controlled experimentation, we also present a flexible and scalable data generation pipeline that supports the construction of diverse annotated scenes.

Authors: Anh Thai, Stefan Stojanov, Zixuan Huang, Bikram Boote, James M. Rehg


Subject classification: Computing technology, computer technology

Recommended citation: Anh Thai, Stefan Stojanov, Zixuan Huang, Bikram Boote, James M. Rehg. MEBench: A Novel Benchmark for Understanding Mutual Exclusivity Bias in Vision-Language Models[EB/OL]. (2025-05-26)[2025-06-08]. https://arxiv.org/abs/2505.20122.
