
Evaluating Model Explanations without Ground Truth

Source: arXiv
Abstract

There can be many competing and contradictory explanations for a single model prediction, making it difficult to select which one to use. Current explanation evaluation frameworks measure quality by comparing against ideal "ground-truth" explanations, or by verifying model sensitivity to important inputs. We outline the limitations of these approaches, and propose three desirable principles to ground the future development of explanation evaluation strategies for local feature importance explanations. We propose a ground-truth Agnostic eXplanation Evaluation framework (AXE) for evaluating and comparing model explanations that satisfies these principles. Unlike prior approaches, AXE does not require access to ideal ground-truth explanations for comparison, or rely on model sensitivity - providing an independent measure of explanation quality. We verify AXE by comparing with baselines, and show how it can be used to detect explanation fairwashing. Our code is available at https://github.com/KaiRawal/Evaluating-Model-Explanations-without-Ground-Truth.
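The abstract does not spell out how AXE computes its scores. Purely as a hypothetical illustration (not the authors' method), the sketch below shows what a ground-truth-agnostic, sensitivity-free check of a local feature-importance explanation could look like: keep only the features the explanation ranks highest, and ask whether a simple k-NN surrogate fitted on the model's own predictions can still recover the model's prediction for the explained instance. The toy model, data, and function names are all assumptions introduced for this example.

```python
# Hypothetical sketch only -- NOT the AXE algorithm from the paper.
# It scores a local feature-importance vector without any ground-truth
# explanation: the top-k features should, on their own, let a k-NN surrogate
# trained on the model's predictions reproduce the model's output.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Toy black-box model to be explained (assumed for illustration).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

def topk_agreement(explanation, x, X_ref, model, k=3):
    """Return 1.0 if a surrogate restricted to the explanation's top-k
    features reproduces the model's prediction at x, else 0.0.

    The surrogate is fitted on the model's own predictions, so the score
    measures agreement with the model, not with true labels -- no
    ground-truth explanation is required.
    """
    top_features = np.argsort(-np.abs(explanation))[:k]
    surrogate = KNeighborsClassifier(n_neighbors=5)
    surrogate.fit(X_ref[:, top_features], model.predict(X_ref))
    pred_from_topk = surrogate.predict(x[top_features].reshape(1, -1))[0]
    return float(pred_from_topk == model.predict(x.reshape(1, -1))[0])

# Compare two competing explanation strategies over 100 instances:
# one aligned with the model's coefficients, one drawn at random.
aligned_score = np.mean(
    [topk_agreement(model.coef_[0], X[i], X, model) for i in range(100)]
)
random_score = np.mean(
    [topk_agreement(rng.normal(size=X.shape[1]), X[i], X, model) for i in range(100)]
)
print("aligned explanation:", aligned_score)
print("random explanation :", random_score)
```

In this toy setup the aligned explanation should score higher than the random one, illustrating how competing explanations for the same model can be ranked without reference explanations; the paper's AXE framework formalizes this kind of comparison under its three proposed principles.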

Kaivalya Rawal, Zihao Fu, Eoin Delaney, Chris Russell

10.1145/3715275.3732219

Computing technology, computer technology

Kaivalya Rawal, Zihao Fu, Eoin Delaney, Chris Russell. Evaluating Model Explanations without Ground Truth [EB/OL]. (2025-05-15) [2025-06-27]. https://arxiv.org/abs/2505.10399.
