GRADA: Graph-based Reranker against Adversarial Documents Attack

Abstract

Retrieval Augmented Generation (RAG) frameworks improve the accuracy of large language models (LLMs) by integrating external knowledge from retrieved documents, thereby overcoming the limitations of the models' static intrinsic knowledge. However, these systems are susceptible to adversarial attacks that manipulate the retrieval process by introducing documents that are adversarial yet semantically similar to the query. Notably, while these adversarial documents resemble the query, they exhibit weak similarity to benign documents in the retrieval set. We therefore propose a simple yet effective Graph-based Reranking against Adversarial Document Attacks (GRADA) framework that aims to preserve retrieval quality while significantly reducing the success of adversaries. We evaluate the approach through experiments on five LLMs: GPT-3.5-Turbo, GPT-4o, Llama3.1-8b, Llama3.1-70b, and Qwen2.5-7b. We use three datasets to assess performance; results on the Natural Questions dataset show up to an 80% reduction in attack success rates with minimal loss in accuracy.
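The observation in the abstract (adversarial documents resemble the query but are only weakly similar to benign documents) can be illustrated with a small sketch. The abstract does not specify how GRADA builds or scores its graph, so everything below is an assumption made for illustration: the bag-of-words vectors stand in for a real embedding model, weighted degree centrality stands in for whatever graph scoring the authors use, and the function names `bow`, `cosine`, and `rerank_by_graph_centrality` are hypothetical. It should not be read as the authors' method.

```python
import math
from collections import Counter


def bow(text):
    """Toy bag-of-words vector (assumed stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank_by_graph_centrality(docs):
    """Rank documents by summed similarity to the other retrieved documents.

    Each document is a node and each edge is weighted by pairwise cosine
    similarity. A document that is weakly connected to the rest of the
    retrieved set (as the abstract says adversarial ones tend to be)
    receives a low score and sinks in the ranking.
    """
    vecs = [bow(d) for d in docs]
    n = len(docs)
    scores = [
        sum(cosine(vecs[i], vecs[j]) for j in range(n) if j != i)
        for i in range(n)
    ]
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [(docs[i], scores[i]) for i in order]


if __name__ == "__main__":
    retrieved = [
        "paris is the capital of france",
        "the capital city of france is paris",
        "paris is the capital and largest city of france",
        # Hypothetical adversarial passage: echoes the query, not the other docs.
        "capital of france query answer is berlin not anything else",
    ]
    for doc, score in rerank_by_graph_centrality(retrieved):
        print(f"{score:.3f}  {doc}")
```

In this toy retrieved set the injected passage shares some query terms with the benign passages but is less connected to them overall, so it ends up at the bottom of the ranking. A realistic setup would likely combine such a graph score with the original query relevance score rather than replace it.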

Authors: Jingjie Zheng, Aryo Pradipta Gema, Giwon Hong, Xuanli He, Pasquale Minervini, Youcheng Sun, Qiongkai Xu

Subject classification: Computing technology, computer technology

Recommended citation: Jingjie Zheng, Aryo Pradipta Gema, Giwon Hong, Xuanli He, Pasquale Minervini, Youcheng Sun, Qiongkai Xu. GRADA: Graph-based Reranker against Adversarial Documents Attack [EB/OL]. (2025-05-12) [2025-07-16]. https://arxiv.org/abs/2505.07546.
