70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float

Tianyi Zhang, Mohsen Hariri, Shaochen Zhong, Vipin Chaudhary, Yang Sui, Xia Hu, Anshumali Shrivastava

Cite this article

Tianyi Zhang, Mohsen Hariri, Shaochen Zhong, Vipin Chaudhary, Yang Sui, Xia Hu, Anshumali Shrivastava. 70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float [EB/OL]. (2025-09-19) [2025-12-14]. https://arxiv.org/abs/2504.11651.

Subject classification

Computing technology, computer technology

First published: 2025-09-19