MDD: a Mask Diffusion Detector to Protect Speaker Verification Systems from Adversarial Perturbations

Abstract

Speaker verification systems are increasingly deployed in security-sensitive applications but remain highly vulnerable to adversarial perturbations. In this work, we propose the Mask Diffusion Detector (MDD), a novel adversarial detection and purification framework based on a text-conditioned masked diffusion model. During training, MDD applies partial masking to Mel-spectrograms and progressively adds noise through a forward diffusion process, simulating the degradation of clean speech features. A reverse process then reconstructs the clean representation, conditioned on the input transcription. Unlike prior approaches, MDD requires neither adversarial examples nor large-scale pretraining. Experimental results show that MDD achieves strong adversarial detection performance and outperforms prior state-of-the-art methods, including both diffusion-based and neural codec-based approaches. Furthermore, MDD effectively purifies adversarially manipulated speech, restoring speaker verification performance to levels close to those observed under clean conditions. These findings demonstrate the potential of diffusion-based masking strategies for secure and reliable speaker verification systems.
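The training-time degradation described in the abstract (partial masking of a Mel-spectrogram followed by forward diffusion noise) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame-level masking strategy, the 30% mask ratio, the linear DDPM-style noise schedule, the spectrogram shape, and all function names (`linear_beta_schedule`, `mask_frames`, `forward_diffuse`) are assumptions chosen for illustration, and the text-conditioned reverse process is omitted entirely.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule; returns the cumulative products alpha_bar_t."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def mask_frames(mel, mask_ratio, rng):
    """Zero out a random subset of time frames (an assumed masking strategy).

    Returns the masked spectrogram and a boolean mask over frames.
    """
    n_frames = mel.shape[1]
    n_masked = int(round(mask_ratio * n_frames))
    masked_idx = rng.choice(n_frames, size=n_masked, replace=False)
    mask = np.zeros(n_frames, dtype=bool)
    mask[masked_idx] = True
    masked = mel.copy()
    masked[:, mask] = 0.0
    return masked, mask


def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(a_bar_t) * x_0, (1 - a_bar_t) * I)."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps


rng = np.random.default_rng(0)
# Placeholder for a log-Mel spectrogram: 80 Mel bins x 200 frames (hypothetical shape).
mel = rng.standard_normal((80, 200))
alpha_bar = linear_beta_schedule(1000)
masked_mel, mask = mask_frames(mel, mask_ratio=0.3, rng=rng)
x_t, eps = forward_diffuse(masked_mel, t=500, alpha_bar=alpha_bar, rng=rng)
```

In a setup like this, a denoising network would typically be trained to recover the clean spectrogram (or the noise `eps`) from `x_t`, the step index, and the transcription. How MDD actually combines the mask with the noise process is not specified in the abstract.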

Authors: Yibo Bai, Sizhou Chen, Michele Panariello, Xiao-Lei Zhang, Massimiliano Todisco, Nicholas Evans

Subject classification: Communications; Wireless Communications

Recommended citation: Yibo Bai, Sizhou Chen, Michele Panariello, Xiao-Lei Zhang, Massimiliano Todisco, Nicholas Evans. MDD: a Mask Diffusion Detector to Protect Speaker Verification Systems from Adversarial Perturbations [EB/OL]. (2025-08-26) [2025-09-10]. https://arxiv.org/abs/2508.19180.
