Pixel-level Certified Explanations via Randomized Smoothing

Source: arXiv
Abstract

Post-hoc attribution methods aim to explain deep learning predictions by highlighting influential input pixels. However, these explanations are highly non-robust: small, imperceptible input perturbations can drastically alter the attribution map while maintaining the same prediction. This vulnerability undermines their trustworthiness and calls for rigorous robustness guarantees of pixel-level attribution scores. We introduce the first certification framework that guarantees pixel-level robustness for any black-box attribution method using randomized smoothing. By sparsifying and smoothing attribution maps, we reformulate the task as a segmentation problem and certify each pixel's importance against $\ell_2$-bounded perturbations. We further propose three evaluation metrics to assess certified robustness, localization, and faithfulness. An extensive evaluation of 12 attribution methods across 5 ImageNet models shows that our certified attributions are robust, interpretable, and faithful, enabling reliable use in downstream tasks. Our code is at https://github.com/AlaaAnani/certified-attributions.
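The abstract's recipe (sparsify the attribution map, smooth it under Gaussian noise, then certify each pixel as in a segmentation task) can be illustrated with a minimal sketch. This is not the authors' implementation; their code is at the repository linked above. Everything here is an assumption made for illustration: the function names `sparsify_top_k` and `certify_pixels`, top-k binarization as the sparsifier, a two-sided Hoeffding bound with a Bonferroni correction across pixels, and the standard randomized-smoothing radius $\sigma\,\Phi^{-1}(\underline{p})$ for a binary decision. The hypothetical `attribute` callable stands in for any black-box attribution method and is assumed to return a 2D score map.

```python
import math
from statistics import NormalDist

import numpy as np


def sparsify_top_k(attribution, k):
    """Binarize a score map: 1 for the k highest-scoring pixels, 0 elsewhere."""
    flat = attribution.ravel()
    top = np.argpartition(flat, -k)[-k:]
    mask = np.zeros(flat.shape, dtype=np.int64)
    mask[top] = 1
    return mask.reshape(attribution.shape)


def certify_pixels(attribute, x, sigma=0.25, n=200, k=50, alpha=0.001, seed=0):
    """Illustrative per-pixel certification (assumed procedure, not the paper's).

    attribute: hypothetical black-box callable mapping an input to a 2D score map.
    Returns (labels, certified, radius), each with the shape of the score map.
    """
    rng = np.random.default_rng(seed)
    # Monte Carlo estimate of the smoothed, sparsified attribution map.
    maps = [
        sparsify_top_k(attribute(x + rng.normal(0.0, sigma, size=x.shape)), k)
        for _ in range(n)
    ]
    votes = np.sum(maps, axis=0)

    # Majority vote per pixel: 1 = "important", 0 = "not important".
    labels = (2 * votes > n).astype(np.int64)
    p_hat = np.maximum(votes, n - votes) / n

    # Two-sided Hoeffding bound, Bonferroni-corrected over all pixels
    # (an assumed, conservative choice of confidence bound).
    eps = math.sqrt(math.log(2 * votes.size / alpha) / (2 * n))
    p_lower = p_hat - eps

    # A pixel is certified only if its majority label provably has probability > 1/2.
    certified = p_lower > 0.5
    radius = np.zeros(p_lower.shape)
    normal = NormalDist()
    for idx in zip(*np.nonzero(certified)):
        # Standard l2 radius for a binary smoothed decision.
        radius[idx] = sigma * normal.inv_cdf(float(p_lower[idx]))
    return labels, certified, radius


if __name__ == "__main__":
    # Toy example: the "attribution" is the input itself, with one bright patch.
    image = np.zeros((8, 8))
    image[2:4, 2:4] = 10.0
    labels, certified, radius = certify_pixels(lambda z: z, image, k=4)
    print(labels)
    print(certified.all(), radius.min())
```

Pixels whose lower bound does not exceed 1/2 are left uncertified (radius 0), which plays the role of abstaining. Tighter bounds such as Clopper-Pearson, or a less conservative multiple-testing correction, would certify more pixels for the same number of samples.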

Alaa Anani, Tobias Lorenz, Mario Fritz, Bernt Schiele

Computing technology, computer technology

Alaa Anani, Tobias Lorenz, Mario Fritz, Bernt Schiele. Pixel-level Certified Explanations via Randomized Smoothing[EB/OL]. (2025-06-18)[2025-07-16]. https://arxiv.org/abs/2506.15499.
