
Harnessing On-Device Large Language Model: Empirical Results and Implications for AI PC

Source: arXiv
English Abstract

The increasing deployment of Large Language Models (LLMs) on edge devices, driven by model advancements and hardware improvements, offers significant privacy benefits. However, these on-device LLMs inherently face performance limitations due to reduced model capacity and necessary compression techniques. To address this, we introduce a systematic methodology -- encompassing model capability, development efficiency, and system resources -- for evaluating on-device LLMs. Our comprehensive evaluation, covering models from 0.5B to 14B parameters and seven post-training quantization (PTQ) methods on commodity laptops, yields several critical insights: 1) System-level metrics exhibit near-linear scaling with effective bits-per-weight (BPW). 2) A practical threshold exists around $\sim$3.5 effective BPW; above it, larger models subjected to low-bit quantization consistently outperform smaller models running at higher bit-precision. 3) Quantization at low BPW incurs marginal accuracy loss but yields significant memory savings. 4) Power consumption on CPU is determined by low-level implementation specifics, with computation-intensive operations drawing more power than memory-intensive ones. These findings offer crucial insights and practical guidelines for the efficient deployment and optimized configuration of LLMs on resource-constrained edge devices. Our codebase is available at https://github.com/simmonssong/LLMOnDevice.
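To make the effective-BPW metric and the $\sim$3.5 threshold concrete, here is a minimal Python sketch. It assumes effective BPW is computed as total stored model size in bits divided by parameter count; the helper names and the example numbers are illustrative and not taken from the paper's codebase.

```python
# Illustrative sketch (not the paper's implementation): estimating effective
# bits-per-weight (BPW) for a quantized model and applying the ~3.5 effective
# BPW rule of thumb reported in the abstract.

def effective_bpw(model_size_bytes: int, num_parameters: int) -> float:
    """Effective BPW = total stored bits / number of weights.

    Counting the full file size folds in quantization overhead (scales,
    zero-points, group metadata), which is why a nominal 4-bit format
    usually lands somewhat above 4.0 effective BPW.
    """
    return model_size_bytes * 8 / num_parameters


def prefer_larger_low_bit_model(bpw: float, threshold: float = 3.5) -> bool:
    """Heuristic from the abstract: above ~3.5 effective BPW, a larger
    low-bit model tends to outperform a smaller higher-precision one."""
    return bpw >= threshold


if __name__ == "__main__":
    # Hypothetical example: a 7B-parameter model packed into ~3.8 GB.
    bpw = effective_bpw(model_size_bytes=3_800_000_000,
                        num_parameters=7_000_000_000)
    print(f"effective BPW: {bpw:.2f}")  # ~4.34, above the ~3.5 threshold
    print("prefer larger low-bit model:", prefer_larger_low_bit_model(bpw))
```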

Qingyu Song, Peiyu Liao, Wenqian Zhao, Yiwen Wang, Shoubo Hu, Hui-Ling Zhen, Ning Jiang, Mingxuan Yuan

Computing Technology; Computer Technology

Qingyu Song, Peiyu Liao, Wenqian Zhao, Yiwen Wang, Shoubo Hu, Hui-Ling Zhen, Ning Jiang, Mingxuan Yuan. Harnessing On-Device Large Language Model: Empirical Results and Implications for AI PC [EB/OL]. (2025-05-20) [2025-07-21]. https://arxiv.org/abs/2505.15030.
