The Security Threat of Compressed Projectors in Large Vision-Language Models
The choice of a suitable vision-language projector (VLP) is critical to the successful training of large vision-language models (LVLMs). Mainstream VLPs can be broadly categorized into compressed and uncompressed projectors, each offering distinct advantages in performance and computational efficiency. However, their security implications have not been thoroughly examined. Our comprehensive evaluation reveals significant differences in their security profiles: compressed projectors exhibit substantial vulnerabilities, allowing adversaries to successfully compromise LVLMs even with minimal knowledge of structural information. In stark contrast, uncompressed projectors demonstrate robust security properties and do not introduce additional vulnerabilities. These findings provide critical guidance for researchers in selecting VLPs that enhance the security and reliability of vision-language models. The code will be released.
Yudong Zhang, Ruobing Xie, Xingwu Sun, Jiansheng Chen, Zhanhui Kang, Di Wang, Yu Wang
Computing Technology, Computer Technology
Yudong Zhang, Ruobing Xie, Xingwu Sun, Jiansheng Chen, Zhanhui Kang, Di Wang, Yu Wang. The Security Threat of Compressed Projectors in Large Vision-Language Models [EB/OL]. (2025-05-31) [2025-06-21]. https://arxiv.org/abs/2506.00534.