Transfer Attack for Bad and Good: Explain and Boost Adversarial Transferability across Multimodal Large Language Models
Multimodal Large Language Models (MLLMs) demonstrate exceptional performance in cross-modality interaction, yet they also suffer from adversarial vulnerabilities. In particular, the transferability of adversarial examples remains an ongoing challenge. In this paper, we analyze how adversarial transferability manifests among MLLMs and identify the key factors that influence it. We find that adversarial examples transfer across MLLMs that pair different LLMs with the same vision encoder, and we identify two key factors that may influence this transferability. We propose two semantic-level data augmentation methods, Adding Image Patch (AIP) and Typography Augment Transferability Method (TATM), which boost the transferability of adversarial examples across MLLMs. To explore the potential real-world impact, we evaluate two tasks that can have both negative and positive societal impacts: (1) Harmful Content Insertion and (2) Information Protection.
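The abstract does not describe how AIP and TATM are implemented; the following is a minimal sketch under the assumption that AIP pastes an auxiliary image patch and TATM overlays typographic text onto the input before adversarial perturbations are crafted. Function names, patch sizes, and the overlaid word are illustrative assumptions, not the paper's definitions.

```python
# Hypothetical sketch of the two semantic-level augmentations named in the
# abstract: Adding Image Patch (AIP) and Typography Augment Transferability
# Method (TATM). Details such as patch size and text placement are assumed.
import random
from PIL import Image, ImageDraw


def adding_image_patch(image: Image.Image, patch: Image.Image,
                       patch_size: int = 64) -> Image.Image:
    """Paste a resized auxiliary image patch at a random location (assumed AIP)."""
    out = image.copy()
    patch = patch.resize((patch_size, patch_size))
    x = random.randint(0, max(0, out.width - patch_size))
    y = random.randint(0, max(0, out.height - patch_size))
    out.paste(patch, (x, y))
    return out


def typography_augment(image: Image.Image, word: str = "example") -> Image.Image:
    """Overlay a typographic word at a random location (assumed TATM)."""
    out = image.copy()
    draw = ImageDraw.Draw(out)
    x = random.randint(0, max(1, out.width // 2))
    y = random.randint(0, max(1, out.height // 2))
    draw.text((x, y), word, fill=(255, 0, 0))  # default PIL bitmap font
    return out


# Usage sketch: augment the input before optimizing the adversarial
# perturbation, so the perturbation is computed over semantically varied views.
# img = Image.open("input.png").convert("RGB")
# aug = typography_augment(adding_image_patch(img, Image.open("patch.png")))
```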
Hao Cheng, Erjia Xiao, Jiayan Yang, Jinhao Duan, Yichi Wang, Jiahang Cao, Qiang Zhang, Le Yang, Kaidi Xu, Jindong Gu, Renjing Xu
Computing Technology; Computer Technology
Hao Cheng, Erjia Xiao, Jiayan Yang, Jinhao Duan, Yichi Wang, Jiahang Cao, Qiang Zhang, Le Yang, Kaidi Xu, Jindong Gu, Renjing Xu. Transfer Attack for Bad and Good: Explain and Boost Adversarial Transferability across Multimodal Large Language Models [EB/OL]. (2025-07-07) [2025-07-17]. https://arxiv.org/abs/2405.20090.