Aligning Explanations with Human Communication

Source: arXiv
Abstract

Machine learning explainability aims to make the decision-making process of black-box models more transparent by finding the most important input features for a given prediction task. Recent works have proposed composing explanations from semantic concepts (e.g., colors, patterns, shapes) that are inherently interpretable to the user of a model. However, these methods generally ignore the communicative context of explanation: the ability of the user to understand the prediction of the model from the explanation. For example, while a medical doctor might understand an explanation in terms of clinical markers, a patient may need a more accessible explanation to make sense of the same diagnosis. In this paper, we address this gap with listener-adaptive explanations. We propose an iterative procedure grounded in principles of pragmatic reasoning and the rational speech act to generate explanations that maximize communicative utility. Our procedure only needs access to pairwise preferences between candidate explanations, relevant in real-world scenarios where a listener model may not be available. We evaluate our method in image classification tasks, demonstrating improved alignment between explanations and listener preferences across three datasets. Furthermore, we perform a user study that demonstrates our explanations increase communicative utility.
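
The abstract describes an iterative procedure grounded in rational speech act (RSA) reasoning that selects explanations for their communicative utility. As a rough illustration only, and not the paper's actual algorithm (which only assumes pairwise preferences between candidate explanations rather than a full listener model), the Python sketch below shows a standard RSA speaker step: given an assumed literal-listener matrix, pick the candidate explanation most likely to lead the listener back to the model's prediction. The matrix values, candidate descriptions, and the alpha rationality parameter are all hypothetical.

import numpy as np

# Toy RSA-style selection of a listener-adaptive explanation.
# NOTE: illustrative sketch only; the paper's procedure works from pairwise
# preferences between candidate explanations, whereas this toy assumes a
# full (hypothetical) literal-listener matrix is available.

# lit_listener[e, y]: assumed probability that a literal listener decodes
# candidate explanation e into prediction/label y.
lit_listener = np.array([
    [0.7, 0.2, 0.1],   # e0: explanation phrased in accessible, everyday terms
    [0.4, 0.4, 0.2],   # e1: vague, underspecified explanation
    [0.2, 0.1, 0.7],   # e2: explanation phrased in expert/clinical terms
])
alpha = 4.0  # hypothetical speaker rationality; larger -> more deterministic choice

def pragmatic_speaker(lit_listener, alpha):
    """S(e | y) proportional to exp(alpha * log L0(y | e)), normalized over explanations."""
    log_utility = alpha * np.log(lit_listener + 1e-12)      # communicative utility per (e, y)
    scores = np.exp(log_utility - log_utility.max(axis=0))  # stabilize before normalizing
    return (scores / scores.sum(axis=0)).T                  # shape: labels x explanations

speaker = pragmatic_speaker(lit_listener, alpha)
target = 0  # the model's prediction that the explanation should convey
print("Best explanation for label", target, "is candidate", int(np.argmax(speaker[target])))

In practice the listener distribution would not be observed directly; one common way to approximate it from pairwise preference judgments is a Bradley-Terry style fit, though the paper's exact estimator is not reproduced here.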

Jeremias Sulam, Jacopo Teneggi, Zhenzhen Wang, Paul H. Yi, Tianmin Shu

Subject areas: Computing technology; computer technology

Jeremias Sulam, Jacopo Teneggi, Zhenzhen Wang, Paul H. Yi, Tianmin Shu. Aligning Explanations with Human Communication [EB/OL]. (2025-05-21) [2025-07-16]. https://arxiv.org/abs/2505.15626.
