Navigating the Accuracy-Size Trade-Off with Flexible Model Merging

Source: arXiv
Abstract

Model merging has emerged as an efficient method to combine multiple single-task fine-tuned models. The merged model can enjoy multi-task capabilities without expensive training. While promising, merging into a single model often suffers from an accuracy gap with respect to individual fine-tuned models. On the other hand, deploying all individual fine-tuned models incurs high costs. We propose FlexMerge, a novel data-free model merging framework to flexibly generate merged models of varying sizes, spanning the spectrum from a single merged model to retaining all individual fine-tuned models. FlexMerge treats fine-tuned models as collections of sequential blocks and progressively merges them using any existing data-free merging method, halting at a desired size. We systematically explore the accuracy-size trade-off exhibited by different merging algorithms in combination with FlexMerge. Extensive experiments on vision and NLP benchmarks, with up to 30 tasks, reveal that even modestly larger merged models can provide substantial accuracy improvements over a single model. By offering fine-grained control over fused model size, FlexMerge provides a flexible, data-free, and high-performance solution for diverse deployment scenarios.
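The block-wise, progressive procedure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say in which order blocks are merged, so the greedy "closest pair first" rule below is an assumption, as are the plain averaging used as a stand-in for "any existing data-free merging method", the size measure (number of stored block copies), and the names `flexible_merge` and `average_merge`, which are hypothetical.

```python
from itertools import combinations
from math import dist


def average_merge(blocks):
    """Stand-in data-free merging method (assumption): element-wise mean."""
    n = len(blocks)
    return [sum(vals) / n for vals in zip(*blocks)]


def flexible_merge(models, target_size, merge_fn=average_merge):
    """Greedily merge per-block groups until at most `target_size` block copies remain.

    models: dict mapping task name -> list of blocks (each a flat list of floats).
    Returns one list per block position; each entry is (tasks, merged_params).
    """
    tasks = sorted(models)
    num_blocks = len(models[tasks[0]])
    # Start from the "keep everything" end: each task owns a copy of each block.
    groups = [[((t,), list(models[t][b])) for t in tasks] for b in range(num_blocks)]
    size = num_blocks * len(tasks)

    while size > target_size:
        # Assumed selection rule: merge the two closest groups at any position.
        best = None
        for b, block_groups in enumerate(groups):
            for i, j in combinations(range(len(block_groups)), 2):
                d = dist(block_groups[i][1], block_groups[j][1])
                if best is None or d < best[0]:
                    best = (d, b, i, j)
        if best is None:  # every position already holds a single merged block
            break
        _, b, i, j = best
        merged_tasks = groups[b][i][0] + groups[b][j][0]
        # Re-merge from the original fine-tuned blocks of all member tasks.
        merged_params = merge_fn([models[t][b] for t in merged_tasks])
        groups[b] = [g for k, g in enumerate(groups[b]) if k not in (i, j)]
        groups[b].append((merged_tasks, merged_params))
        size -= 1  # one fewer block copy to store
    return groups
```

With three tasks and two blocks each, `target_size=6` keeps all individual models, `target_size=2` yields a single fully merged model, and intermediate values give partially merged models in which some tasks share blocks.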

Akash Dhasade, Divyansh Jhunjhunwala, Milos Vujasinovic, Gauri Joshi, Anne-Marie Kermarrec

Computing technology, computer technology

Akash Dhasade, Divyansh Jhunjhunwala, Milos Vujasinovic, Gauri Joshi, Anne-Marie Kermarrec. Navigating the Accuracy-Size Trade-Off with Flexible Model Merging [EB/OL]. (2025-05-29) [2025-06-07]. https://arxiv.org/abs/2505.23209.
