Enhancing LLM Watermark Resilience Against Both Scrubbing and Spoofing Attacks

Source: arXiv
Abstract

Watermarking is a promising defense against the misuse of large language models (LLMs), yet it remains vulnerable to scrubbing and spoofing attacks. This vulnerability stems from an inherent trade-off governed by the watermark window size: smaller windows resist scrubbing better but are easier to reverse-engineer, enabling low-cost statistics-based spoofing attacks. This work breaks the trade-off by introducing a novel mechanism, equivalent texture keys, in which multiple tokens within a watermark window can each independently support detection. Building on this redundancy, we propose a novel watermark scheme with Sub-vocabulary decomposed Equivalent tExture Key (SEEK). It achieves a Pareto improvement, increasing resilience against scrubbing attacks without compromising robustness to spoofing. Experiments demonstrate SEEK's superiority over prior methods, yielding spoofing robustness gains of +88.2%/+92.3%/+82.0% and scrubbing robustness gains of +10.2%/+6.4%/+24.6% across diverse dataset settings.
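
The window-size trade-off described in the abstract can be illustrated with a minimal sketch. It assumes a generic hash-based "green list" watermark, in which each token is scored against a pseudo-random partition seeded by the preceding window of tokens. This is not the SEEK scheme itself, and every name and constant below (SECRET_KEY, GAMMA, is_green, green_flags, affected_positions) is hypothetical. The sketch shows that editing one token can disturb at most window_size + 1 detection decisions, which is why smaller windows tend to survive scrubbing better. The flip side is that a smaller window yields fewer distinct contexts, so an attacker needs fewer samples to estimate which tokens are favored in each context.

```python
import hashlib

# Hypothetical parameters, chosen only for illustration.
SECRET_KEY = b"demo-key"   # detector's secret key (assumption)
GAMMA = 0.5                # expected fraction of "green" tokens (assumption)


def is_green(window, token):
    """Pseudo-randomly mark `token` as green, seeded by the preceding window."""
    payload = (
        SECRET_KEY
        + b"|" + ",".join(str(t) for t in window).encode()
        + b"|" + str(token).encode()
    )
    digest = hashlib.sha256(payload).digest()
    # Map the first 8 bytes of the hash to [0, 1) and compare with GAMMA.
    return int.from_bytes(digest[:8], "big") / 2**64 < GAMMA


def green_flags(tokens, window_size):
    """Green/red decision for every position that has a full preceding window."""
    return {
        i: is_green(tuple(tokens[i - window_size:i]), tokens[i])
        for i in range(window_size, len(tokens))
    }


def affected_positions(length, window_size, edited_index):
    """Scored positions whose decision depends on the token at `edited_index`."""
    return [
        i for i in range(window_size, length)
        if i - window_size <= edited_index <= i
    ]


if __name__ == "__main__":
    for h in (1, 4):
        n = len(affected_positions(30, h, 10))
        print(f"window_size={h}: one edit can disturb {n} decisions")
```

In this toy setting, a paraphrasing attack that rewrites a fixed share of tokens invalidates more detection decisions as window_size grows, while decisions outside the affected positions are untouched.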

Huanming Shen, Baizhou Huang, Xiaojun Wan

Computing technology, computer technology

Huanming Shen, Baizhou Huang, Xiaojun Wan. Enhancing LLM Watermark Resilience Against Both Scrubbing and Spoofing Attacks [EB/OL]. (2025-07-08) [2025-07-25]. https://arxiv.org/abs/2507.06274
