A Sparsity Predicting Approach for Large Language Models via Activation Pattern Clustering

Source: arXiv
Abstract

Large Language Models (LLMs) exhibit significant activation sparsity, where only a subset of neurons are active for a given input. Although this sparsity presents opportunities to reduce computational cost, efficiently utilizing it requires predicting activation patterns in a scalable manner. However, direct prediction at the neuron level is computationally expensive due to the vast number of neurons in modern LLMs. To enable efficient prediction and utilization of activation sparsity, we propose a clustering-based activation pattern compression framework. Instead of treating each neuron independently, we group similar activation patterns into a small set of representative clusters. Our method achieves up to 79.34% clustering precision, outperforming standard binary clustering approaches while maintaining minimal degradation in perplexity (PPL) scores. With a sufficiently large number of clusters, our approach attains a PPL score as low as 12.49, demonstrating its effectiveness in preserving model quality while reducing computational overhead. By predicting cluster assignments rather than individual neuron states, future models can efficiently infer activation patterns from pre-computed centroids. We detail the clustering algorithm, analyze its effectiveness in capturing meaningful activation structures, and demonstrate its potential to improve sparse computation efficiency. This clustering-based formulation serves as a foundation for future work on activation pattern prediction, paving the way for efficient inference in large-scale language models.
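The abstract's core idea, grouping binary activation patterns into a few representative clusters and then working with cluster assignments instead of per-neuron states, can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the distance measure, the centroid update, or how "clustering precision" is computed. The sketch assumes Hamming distance, majority-vote centroids, and a simple precision definition (the share of neurons the assigned centroid marks active that really are active). All function names and the synthetic data are hypothetical.

```python
import numpy as np

def cluster_binary_patterns(patterns, k, iters=20, seed=0):
    """k-means-style clustering of binary patterns (assumed: Hamming distance)."""
    rng = np.random.default_rng(seed)
    patterns = np.asarray(patterns, dtype=np.uint8)
    # Initialize centroids from k rows drawn at distinct random indices.
    idx = rng.choice(len(patterns), size=k, replace=False)
    centroids = patterns[idx].copy()
    for _ in range(iters):
        # Hamming distance from every pattern to every centroid: shape (n, k).
        dists = (patterns[:, None, :] != centroids[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = patterns[labels == c]
            if len(members):
                # Majority vote per neuron yields a binary centroid (assumption).
                centroids[c] = (members.mean(axis=0) >= 0.5).astype(np.uint8)
    # Final assignment against the last set of centroids.
    dists = (patterns[:, None, :] != centroids[None, :, :]).sum(axis=2)
    return dists.argmin(axis=1), centroids

def centroid_precision(patterns, labels, centroids):
    """Share of centroid-predicted active neurons that are truly active.

    This definition is an assumption, not the paper's metric.
    """
    predicted = centroids[labels].astype(bool)
    actual = np.asarray(patterns).astype(bool)
    return (predicted & actual).sum() / max(predicted.sum(), 1)

# Synthetic stand-in for recorded activation masks: 4 noisy prototypes, 64 neurons.
rng = np.random.default_rng(1)
protos = rng.random((4, 64)) < 0.2
data = np.repeat(protos, 50, axis=0) ^ (rng.random((200, 64)) < 0.02)

labels, centroids = cluster_binary_patterns(data, k=4)
precision = centroid_precision(data, labels, centroids)
print(f"precision of centroid-based masks: {precision:.3f}")
```

In this framing, a downstream predictor would only need to output one of `k` cluster indices per input and look up the pre-computed centroid as the activation mask, rather than predicting the state of each neuron separately.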

Nobel Dhar, Bobin Deng, Md Romyull Islam, Xinyue Zhang, Kazi Fahim Ahmad Nasif, Kun Suo

Computing technology, computer technology

Nobel Dhar, Bobin Deng, Md Romyull Islam, Xinyue Zhang, Kazi Fahim Ahmad Nasif, Kun Suo. A Sparsity Predicting Approach for Large Language Models via Activation Pattern Clustering [EB/OL]. (2025-07-11) [2025-08-16]. https://arxiv.org/abs/2507.14179.
