How Data Inter-connectivity Shapes LLMs Unlearning: A Structural Unlearning Perspective
While unlearning knowledge from large language models (LLMs) is receiving increasing attention, one important aspect remains unexplored. Existing approaches and benchmarks assume that the data points to be forgotten are independent, ignoring their inter-connectivity, a fundamental characteristic of real-world data structures. In this paper, we propose PISTOL, a method for compiling structural datasets. PISTOL leverages the inherently structured nature of contractual relationships, offering several key benefits. First, it enables insights into the impact of structural data on unlearning effectiveness. Second, it provides precise and concise ground truths for clearer evaluation. Third, its attribute generation does not require input from pre-trained LLMs, mitigating confounding risks. Leveraging datasets synthesized using PISTOL, we demonstrate how data inter-connectivity impacts LLM unlearning. Specifically, (a) in both pre-trained and fine-tuned models, unlearning difficulty increases as data inter-connectivity grows, (b) the density of the knowledge graph correlates positively with unlearning difficulty, and (c) when the to-be-forgotten data is skewed towards one domain, balancing retention performance across all domains is challenging.
Yihong Chen, Nicholas D. Lane, Nicola Cancedda, Xinchi Qiu, William F. Shen, Pontus Stenetorp, Meghdad Kurmanji
Computing technology, computer technology
Yihong Chen, Nicholas D. Lane, Nicola Cancedda, Xinchi Qiu, William F. Shen, Pontus Stenetorp, Meghdad Kurmanji. How Data Inter-connectivity Shapes LLMs Unlearning: A Structural Unlearning Perspective [EB/OL]. (2024-06-24) [2025-05-02]. https://arxiv.org/abs/2406.16810.