Integration for Voiceprint Recognition and Speech Spoofing Detection Based on Cross-Attention
Existing back-end embedding fusion methods for integrating voiceprint recognition with speech spoofing detection do not fully exploit the intrinsic relationship between the two tasks. This paper proposes an embedding fusion algorithm to address this. Based on an analysis of the relationship between voiceprint embeddings and spoofing embeddings, the conventional cross-attention mechanism is modified to capture global feature interactions, and a convolutional module is added to capture the complex correlations between the two embeddings. To further improve integration, an embedding selective fusion module adaptively adjusts the contribution weight of each embedding, avoiding the bias problem common in current integration methods. The algorithm takes a pair of utterances, one enrollment and one test, as input. First, the voiceprint recognition subsystem and the speech spoofing detection subsystem extract voiceprint embeddings and spoofing embeddings, respectively. Next, the cross-attention-based embedding fusion method fuses the two types of embeddings. Finally, a classification layer outputs the final decision, which is intended to keep the overall system stable and reliable. Experiments on the ASVspoof2019 dataset show that the proposed model achieves a significant performance improvement, supporting the effectiveness of the algorithm.
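The pipeline described in the abstract (embedding extraction, cross-attention fusion, selective weighting, classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no architectural details, so standard scaled dot-product cross-attention stands in for their modified variant, the convolutional module is omitted, and the gate is a simple softmax over two scalar scores. All names (`chunk`, `cross_attention`, `selective_fusion`, `fuse_and_score`, `params`) are hypothetical, the embeddings are random placeholders, and splitting each embedding into equal-length tokens is an assumption made only so that attention has something to attend over.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def chunk(vec, n_tokens):
    # Assumption: split a flat embedding into n_tokens equal-length tokens.
    size = len(vec) // n_tokens
    return [vec[i * size:(i + 1) * size] for i in range(n_tokens)]

def cross_attention(queries, keys_values):
    # Standard scaled dot-product cross-attention (no learned projections):
    # each query token attends over the tokens of the other embedding.
    d = len(queries[0])
    out = []
    for q in queries:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in keys_values])
        out.append([sum(w * kv[j] for w, kv in zip(weights, keys_values))
                    for j in range(d)])
    return out

def flatten(tokens):
    return [x for t in tokens for x in t]

def selective_fusion(spk_feat, cm_feat, gate_spk, gate_cm):
    # Hypothetical gate: a softmax over two scalar scores yields
    # contribution weights that sum to 1.
    w_spk, w_cm = softmax([dot(gate_spk, spk_feat), dot(gate_cm, cm_feat)])
    fused = [w_spk * s + w_cm * c for s, c in zip(spk_feat, cm_feat)]
    return fused, (w_spk, w_cm)

def fuse_and_score(spk_emb, cm_emb, params, n_tokens=4):
    # Both embeddings must have the same length, divisible by n_tokens.
    spk_tok, cm_tok = chunk(spk_emb, n_tokens), chunk(cm_emb, n_tokens)
    spk_att = flatten(cross_attention(spk_tok, cm_tok))  # voiceprint queries spoofing
    cm_att = flatten(cross_attention(cm_tok, spk_tok))   # spoofing queries voiceprint
    fused, weights = selective_fusion(
        spk_att, cm_att, params["gate_spk"], params["gate_cm"])
    logit = dot(params["w_out"], fused) + params["b_out"]
    score = 1.0 / (1.0 + math.exp(-logit))  # sigmoid classification layer
    return score, weights

# Placeholder data: random vectors stand in for real subsystem outputs.
rng = random.Random(0)
dim = 16
spk_emb = [rng.uniform(-1, 1) for _ in range(dim)]
cm_emb = [rng.uniform(-1, 1) for _ in range(dim)]
params = {
    "gate_spk": [rng.uniform(-0.1, 0.1) for _ in range(dim)],
    "gate_cm": [rng.uniform(-0.1, 0.1) for _ in range(dim)],
    "w_out": [rng.uniform(-0.1, 0.1) for _ in range(dim)],
    "b_out": 0.0,
}
score, (w_spk, w_cm) = fuse_and_score(spk_emb, cm_emb, params)
print(f"score={score:.4f}, w_spk={w_spk:.4f}, w_cm={w_cm:.4f}")
```

Because the two gate weights come from a softmax, neither embedding can be ignored outright and their contributions always sum to one, which is one simple way to express the adaptive weighting the abstract describes. In a trained system the gate and classifier parameters would be learned rather than sampled at random.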
Li Yuanwei (李媛薇), Peng Haipeng (彭海朋)
Computing technology, computer technology
Information Security; Voiceprint Recognition; Speech Spoofing Detection; Cross-Attention
Li Yuanwei, Peng Haipeng. Integration for Voiceprint Recognition and Speech Spoofing Detection Based on Cross-Attention [EB/OL]. (2025-02-25) [2025-08-02]. http://www.paper.edu.cn/releasepaper/content/202502-108.