Cluster-Driven Expert Pruning for Mixture-of-Experts Large Language Models
Mixture-of-Experts (MoE) architectures have emerged as a promising paradigm for scaling large language models (LLMs) with sparse activation of task-specific experts. Despite their computational efficiency during inference, the massive overall parameter footprint of MoE models (e.g., GPT-4) poses critical challenges for practical deployment. Current pruning approaches often fail to address two inherent characteristics of MoE systems: (1) intra-layer expert homogeneity, where experts within the same MoE layer exhibit functional redundancy, and (2) inter-layer similarity patterns, where deeper layers tend to contain progressively more homogeneous experts. To tackle these issues, we propose Cluster-driven Expert Pruning (C-Prune), a novel two-stage framework for adaptive task-specific compression of MoE LLMs. C-Prune operates through layer-wise expert clustering, which groups functionally similar experts within each MoE layer using parameter similarity metrics, followed by global cluster pruning, which eliminates redundant clusters across all layers through a unified importance scoring mechanism that accounts for cross-layer homogeneity. We validate C-Prune through extensive experiments on multiple MoE models and benchmarks. The results demonstrate that C-Prune effectively reduces model size while outperforming existing MoE pruning methods.
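For readers who want a concrete picture of the two-stage pipeline described above, the sketch below shows how layer-wise expert clustering followed by global cluster pruning could be organized. It is a minimal toy example, not the authors' implementation: the cosine-distance similarity metric, the agglomerative clustering choice, the norm-based placeholder importance score, and all function names are illustrative assumptions.

```python
# Minimal sketch of a two-stage cluster-then-prune procedure for MoE experts.
# The similarity metric, clustering method, and importance score below are
# illustrative assumptions, not the C-Prune paper's actual implementation.
import numpy as np
from sklearn.cluster import AgglomerativeClustering


def cluster_layer_experts(expert_weights, n_clusters):
    """Stage 1: group functionally similar experts within one MoE layer
    using a parameter-similarity metric (here: cosine distance on flattened weights)."""
    flat = np.stack([w.ravel() for w in expert_weights])
    flat = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    distance = 1.0 - flat @ flat.T  # pairwise cosine distance between experts
    labels = AgglomerativeClustering(
        n_clusters=n_clusters, metric="precomputed", linkage="average"
    ).fit_predict(distance)
    return labels


def prune_clusters_globally(cluster_scores, keep_ratio):
    """Stage 2: rank all clusters across layers by an importance score
    and keep only the top fraction; the rest would be pruned."""
    ranked = sorted(cluster_scores, key=lambda c: c["score"], reverse=True)
    n_keep = max(1, int(len(ranked) * keep_ratio))
    return {(c["layer"], c["cluster"]) for c in ranked[:n_keep]}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy model: 4 MoE layers, 8 experts per layer, each expert a small weight matrix.
    layers = [[rng.normal(size=(16, 16)) for _ in range(8)] for _ in range(4)]

    cluster_scores = []
    for layer_idx, experts in enumerate(layers):
        labels = cluster_layer_experts(experts, n_clusters=4)
        for cluster_id in set(labels):
            members = [i for i, lbl in enumerate(labels) if lbl == cluster_id]
            # Placeholder importance score: mean weight norm of the cluster's experts.
            score = float(np.mean([np.linalg.norm(experts[i]) for i in members]))
            cluster_scores.append(
                {"layer": layer_idx, "cluster": cluster_id, "score": score}
            )

    kept = prune_clusters_globally(cluster_scores, keep_ratio=0.5)
    print(f"Keeping {len(kept)} of {len(cluster_scores)} clusters")
```

In an actual MoE model, the pruned clusters would then be collapsed (e.g., by keeping one representative expert per retained cluster and rerouting tokens accordingly); the paper's unified importance score additionally accounts for cross-layer homogeneity, which this toy score does not capture.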
Zhoujun Li, Hongcheng Guo, Juntao Yao, Boyang Wang, Junjia Du, Shaosheng Cao, Donglin Di, Shun Zhang
Computing Technology, Computer Technology
Zhoujun Li, Hongcheng Guo, Juntao Yao, Boyang Wang, Junjia Du, Shaosheng Cao, Donglin Di, Shun Zhang. Cluster-Driven Expert Pruning for Mixture-of-Experts Large Language Models [EB/OL]. (2025-04-10) [2025-07-16]. https://arxiv.org/abs/2504.07807.