Sparse Activation Editing for Reliable Instruction Following in Narratives

Abstract

Complex narrative contexts often challenge language models' ability to follow instructions, and existing benchmarks fail to capture these difficulties. To address this, we propose Concise-SAE, a training-free framework that improves instruction following by identifying and editing instruction-relevant neurons using only natural language instructions, without requiring labelled data. To thoroughly evaluate our method, we introduce FreeInstruct, a diverse and realistic benchmark of 1,212 examples that highlights the challenges of instruction following in narrative-rich settings. While initially motivated by complex narratives, Concise-SAE demonstrates state-of-the-art instruction adherence across varied tasks without compromising generation quality.

Authors: Runcong Zhao, Chengyu Cao, Qinglin Zhu, Xiucheng Lv, Shun Shao, Lin Gui, Ruifeng Xu, Yulan He

Subject classification: Linguistics

Recommended citation: Runcong Zhao, Chengyu Cao, Qinglin Zhu, Xiucheng Lv, Shun Shao, Lin Gui, Ruifeng Xu, Yulan He. Sparse Activation Editing for Reliable Instruction Following in Narratives [EB/OL]. (2025-05-22) [2025-06-21]. https://arxiv.org/abs/2505.16505.
