
Compressing Large Language Models with PCA Without Performance Loss


Source: arXiv

Abstract

We demonstrate that Principal Component Analysis (PCA), when applied in a structured manner, either to polar-transformed images or segment-wise to token sequences, enables extreme compression of neural models without sacrificing performance. Across three case studies, we show that a one-layer classifier trained on PCA-compressed polar MNIST achieves over 98 percent accuracy using only 840 parameters. A two-layer transformer trained on 70-dimensional PCA-reduced MiniLM embeddings reaches 76.62 percent accuracy on the 20 Newsgroups dataset with just 81000 parameters. A decoder-only transformer generates coherent token sequences from 70-dimensional PCA embeddings while preserving over 97 percent cosine similarity with full MiniLM representations, using less than 17 percent of the parameter count of GPT-2. These results highlight PCA-based input compression as a general and effective strategy for aligning model capacity with information content, enabling lightweight architectures across multiple modalities.
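The following is a minimal sketch of the second case study's pipeline (PCA-compressed MiniLM embeddings for 20 Newsgroups classification), not the authors' code: the sentence-transformers model "all-MiniLM-L6-v2" is assumed to be the MiniLM encoder, and a small scikit-learn MLP stands in for the paper's two-layer transformer.

```python
# Sketch: PCA-based input compression for 20 Newsgroups classification.
# Assumptions (not from the paper): "all-MiniLM-L6-v2" as the MiniLM encoder,
# an MLPClassifier as a stand-in for the two-layer transformer.
from sentence_transformers import SentenceTransformer
from sklearn.datasets import fetch_20newsgroups
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Load the standard 20 Newsgroups train/test splits.
train = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes"))
test = fetch_20newsgroups(subset="test", remove=("headers", "footers", "quotes"))

# Encode documents with MiniLM (384-dimensional sentence embeddings).
encoder = SentenceTransformer("all-MiniLM-L6-v2")
X_train = encoder.encode(train.data)
X_test = encoder.encode(test.data)

# Compress inputs to 70 dimensions with PCA fitted on the training set only.
pca = PCA(n_components=70)
X_train_70 = pca.fit_transform(X_train)
X_test_70 = pca.transform(X_test)

# Train a small classifier on the compressed embeddings and evaluate.
clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=300, random_state=0)
clf.fit(X_train_70, train.target)
print("test accuracy:", accuracy_score(test.target, clf.predict(X_test_70)))
```

The key design point illustrated here is that dimensionality reduction happens at the input, before the model, so the downstream classifier's parameter count scales with the 70-dimensional compressed representation rather than with the full embedding size.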

Magnus Bengtsson

Subject: Computing Technology; Computer Technology

Magnus Bengtsson. Compressing Large Language Models with PCA Without Performance Loss [EB/OL]. (2025-08-06) [2025-08-16]. https://arxiv.org/abs/2508.04307.
