To See a World in a Spark of Neuron: Disentangling Multi-task Interference for Training-free Model Merging

Source: arXiv
Abstract

Fine-tuning pre-trained models on targeted datasets enhances task-specific performance but often comes at the expense of generalization. Model merging techniques, which integrate multiple fine-tuned models into a single multi-task model through task arithmetic, offer a promising solution. However, task interference remains a fundamental challenge, leading to performance degradation and suboptimal merged models. Existing approaches largely overlook the fundamental roles of neurons, their connectivity, and activation, resulting in a merging process and a merged model that do not consider how neurons relay and process information. In this work, we present the first study that relies on neuronal mechanisms for model merging. We decompose task-specific representations into two complementary neuronal subspaces that regulate neuron sensitivity and input adaptability. Leveraging this decomposition, we introduce NeuroMerging, a novel merging framework developed to mitigate task interference within neuronal subspaces, enabling training-free model fusion across diverse tasks. Through extensive experiments, we demonstrate that NeuroMerging achieves superior performance compared to existing methods on multi-task benchmarks across both natural language and vision domains. Our findings highlight the importance of aligning neuronal mechanisms in model merging, offering new insights into mitigating task interference and improving knowledge fusion. Code will be released upon acceptance.
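
As a rough illustration of the terms used in the abstract, the following minimal sketch shows plain task arithmetic for a single neuron's weight vector, followed by one possible way to split a task vector into two complementary parts. The split into a component parallel to the pre-trained weights and an orthogonal remainder is an assumption made for illustration only; the abstract does not specify how the two neuronal subspaces are defined, and this is not the NeuroMerging algorithm itself. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def task_vector(finetuned: Sequence[float], pretrained: Sequence[float]) -> Vector:
    """Task vector: element-wise difference between fine-tuned and pre-trained weights."""
    return [f - p for f, p in zip(finetuned, pretrained)]


def task_arithmetic_merge(
    pretrained: Sequence[float],
    finetuned_models: Sequence[Sequence[float]],
    scale: float = 1.0,
) -> Vector:
    """Plain task arithmetic: add the scaled sum of task vectors to the pre-trained weights."""
    merged = list(pretrained)
    for finetuned in finetuned_models:
        tau = task_vector(finetuned, pretrained)
        merged = [m + scale * t for m, t in zip(merged, tau)]
    return merged


def split_task_vector(
    pretrained: Sequence[float], tau: Sequence[float]
) -> Tuple[Vector, Vector]:
    """Split tau into a part parallel to the pre-trained weights and an orthogonal rest.

    ASSUMPTION: this parallel/orthogonal split is only one way to obtain two
    complementary subspaces; the abstract does not confirm this construction.
    """
    norm_sq = sum(p * p for p in pretrained)
    if norm_sq == 0.0:
        # No direction to project onto: everything is treated as orthogonal.
        return [0.0] * len(tau), list(tau)
    coeff = sum(t * p for t, p in zip(tau, pretrained)) / norm_sq
    parallel = [coeff * p for p in pretrained]
    orthogonal = [t - q for t, q in zip(tau, parallel)]
    return parallel, orthogonal


if __name__ == "__main__":
    w0 = [1.0, 0.0]        # pre-trained weights of one neuron (toy values)
    w_a = [2.0, 1.0]       # fine-tuned on a hypothetical task A
    w_b = [1.0, -3.0]      # fine-tuned on a hypothetical task B
    print(task_arithmetic_merge(w0, [w_a, w_b]))
    print(split_task_vector(w0, task_vector(w_a, w0)))
```

In this toy setting the two task vectors disagree in sign on the second coordinate, which is a simple instance of the task interference the abstract refers to: summing them directly lets one task's update partly cancel the other's.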

Zitao Fang, Guodong DU, Shuyang Yu, Yifei Guo, Yiwei Zhang, Yiyao Cao, Jing Li, Ho-Kin Tang, Sim Kuan Goh

Computing technology, computer technology

Zitao Fang, Guodong DU, Shuyang Yu, Yifei Guo, Yiwei Zhang, Yiyao Cao, Jing Li, Ho-Kin Tang, Sim Kuan Goh. To See a World in a Spark of Neuron: Disentangling Multi-task Interference for Training-free Model Merging [EB/OL]. (2025-03-07) [2025-06-09]. https://arxiv.org/abs/2503.05320.
