Generalizability vs. Counterfactual Explainability Trade-Off
In this work, we investigate the relationship between model generalization and counterfactual explainability in supervised learning. We introduce the notion of $\varepsilon$-valid counterfactual probability ($\varepsilon$-VCP) -- the probability of finding perturbations of a data point within its $\varepsilon$-neighborhood that result in a label change. We provide a theoretical analysis of $\varepsilon$-VCP in relation to the geometry of the model's decision boundary, showing that $\varepsilon$-VCP tends to increase with model overfitting. Our findings establish a rigorous connection between poor generalization and the ease of counterfactual generation, revealing an inherent trade-off between generalization and counterfactual explainability. Empirical results validate our theory, suggesting $\varepsilon$-VCP as a practical proxy for quantitatively characterizing overfitting.
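The abstract's central quantity, the $\varepsilon$-VCP, can be estimated empirically by sampling perturbations inside each point's $\varepsilon$-ball and measuring how often the model's prediction flips. Below is a minimal Monte Carlo sketch of that idea; the function name `epsilon_vcp`, the uniform-ball sampling scheme, and the averaging over the dataset are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def epsilon_vcp(predict, X, eps, n_samples=200, rng=None):
    """Monte Carlo sketch of an epsilon-VCP estimate (illustrative, not the
    paper's definition): for each point x, the fraction of perturbations drawn
    uniformly from its eps-ball whose predicted label differs from the label
    predicted at x, averaged over the dataset."""
    rng = np.random.default_rng(rng)
    base = predict(X)                      # labels at the unperturbed points
    flip_rates = np.zeros(len(X))
    dim = X.shape[1]
    for i, x in enumerate(X):
        # Uniform sampling in the eps-ball: random direction times a radius
        # drawn as eps * U^(1/dim), which is uniform over the ball's volume.
        d = rng.normal(size=(n_samples, dim))
        d /= np.linalg.norm(d, axis=1, keepdims=True)
        r = eps * rng.uniform(size=(n_samples, 1)) ** (1.0 / dim)
        perturbed = x + d * r
        flip_rates[i] = np.mean(predict(perturbed) != base[i])
    return flip_rates.mean()
```

Under the paper's thesis, an overfit model with a convoluted decision boundary would yield a higher estimate than a smoother, better-generalizing one at the same $\varepsilon$, since more of each point's neighborhood crosses the boundary.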
Fabiano Veglianti, Flavio Giorgi, Fabrizio Silvestri, Gabriele Tolomei
Computing Technology, Computer Technology
Fabiano Veglianti, Flavio Giorgi, Fabrizio Silvestri, Gabriele Tolomei. Generalizability vs. Counterfactual Explainability Trade-Off [EB/OL]. (2025-05-29) [2025-06-14]. https://arxiv.org/abs/2505.23225.