Anti-Pruning Backdoor Attack Based on Sparse Induction
Backdoor attacks pose a significant threat to deep neural networks because they are introduced during the training phase: by embedding specific trigger patterns into the model, these attacks cause misclassification when certain inputs are presented. This vulnerability can have serious real-world consequences, particularly in safety-critical applications such as autonomous driving. Model pruning is a widely used technique for reducing the number of network parameters, making it feasible to deploy models on low-resource devices. Notably, after pruning, the effectiveness of existing backdoor attacks is often significantly weakened or even completely nullified. To address this fragility of existing backdoor attacks under model pruning, we propose a new anti-pruning backdoor attack based on sparse induction, in which the adversary accounts for the impact of the pruning operation while training the backdoored model. By introducing structured constraint terms into the standard backdoor loss function, we induce the model to exhibit a sparse structure similar to that of a pruned network, ensuring that the resulting backdoored model retains strong attack capability even after being pruned. We evaluate the anti-pruning backdoor attack with three backdoor injection methods and three pruning strategies, on two datasets and three model architectures. The results show that, compared with existing backdoor attacks, the anti-pruning backdoor attack achieves an attack success rate of up to 97.82% after pruning while maintaining prediction accuracy on clean inputs.
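The abstract describes adding "structured constraint terms" to the standard backdoor loss so that the trained model already resembles a pruned (sparse) network. The paper does not specify the exact constraint, but one common structured-sparsity choice is a group-lasso (L2,1) penalty over convolutional filters, which drives whole filters toward zero in the same units that filter pruning removes. The sketch below is a hypothetical illustration of that idea, not the paper's actual objective; the function names and the weighting factor `lam` are assumptions.

```python
import numpy as np

def group_sparsity_penalty(filters):
    """Group-lasso (L2,1) penalty: sum of per-filter L2 norms.

    Penalizing each filter's L2 norm as a group pushes entire filters
    toward zero, mimicking the structure that filter pruning imposes.
    `filters` has shape (out_channels, in_channels, kH, kW).
    Hypothetical sketch; the paper's exact constraint may differ.
    """
    flat = filters.reshape(filters.shape[0], -1)   # one row per filter
    return np.sqrt((flat ** 2).sum(axis=1)).sum()  # sum of group norms

def total_loss(clean_loss, backdoor_loss, filters, lam=1e-4):
    """Standard backdoor objective (clean-task loss + trigger-target
    loss) augmented with the sparsity-inducing structured term."""
    return clean_loss + backdoor_loss + lam * group_sparsity_penalty(filters)
```

In a real training loop the penalty would be summed over every convolutional layer and minimized jointly with the two cross-entropy terms, so that pruning the smallest-norm filters afterwards removes weights the backdoor never relied on.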
Computing technology, computer technology
deep neural networks; neural network pruning; backdoor attack; sparse training
. Anti-Pruning Backdoor Attack Based on Sparse Induction [EB/OL]. (2025-04-01) [2025-04-03]. http://www.paper.edu.cn/releasepaper/content/202504-12.