
Scalable Complexity Control Facilitates Reasoning Ability of LLMs


Source: arXiv
Abstract

The reasoning ability of large language models (LLMs) has been rapidly advancing in recent years, attracting interest in more fundamental approaches that can reliably enhance their generalizability. This work demonstrates that model complexity control, conveniently implementable by adjusting the initialization rate and weight decay coefficient, improves the scaling law of LLMs consistently over varying model sizes and data sizes. This gain is further illustrated by comparing the benchmark performance of 2.4B models pretrained on 1T tokens with different complexity hyperparameters. Instead of fixing the initialization std, we find that a constant initialization rate (the exponent of the std) enables the scaling law to descend faster in both model and data sizes. These results indicate that complexity control is a promising direction for the continual advancement of LLMs.
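As a minimal sketch of what tuning these two complexity hyperparameters could look like in practice, the snippet below initializes a layer's weights with std = fan_in ** (-gamma), where the fixed exponent `gamma` plays the role of the "initialization rate" described in the abstract, and applies weight decay through the optimizer. The power-law parameterization, the function name `init_linear`, and the specific values of `gamma` and `weight_decay` are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn


def init_linear(layer: nn.Linear, gamma: float) -> None:
    """Initialize weights with std = fan_in ** (-gamma).

    Keeping gamma (the exponent of the std) constant across model sizes,
    rather than fixing the std itself, is the idea sketched here.
    """
    fan_in = layer.in_features
    std = fan_in ** (-gamma)
    nn.init.normal_(layer.weight, mean=0.0, std=std)
    if layer.bias is not None:
        nn.init.zeros_(layer.bias)


# Illustrative usage: gamma > 0.5 yields a smaller-than-standard init,
# i.e. a lower-complexity starting point.
layer = nn.Linear(1024, 1024)
init_linear(layer, gamma=0.75)

# The second complexity knob, the weight decay coefficient, is applied
# via the optimizer (value here is a placeholder, not from the paper).
optimizer = torch.optim.AdamW(layer.parameters(), lr=1e-3, weight_decay=0.1)
```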

Liangkai Hang, Junjie Yao, Zhiwei Bai, Tianyi Chen, Yang Chen, Rongjie Diao, Hezhou Li, Pengxiao Lin, Zhiwei Wang, Cheng Xu, Zhongwang Zhang, Zhangchen Zhou, Zhiyu Li, Zehao Lin, Kai Chen, Feiyu Xiong, Yaoyu Zhang, Weinan E, Hongkang Yang, Zhi-Qin John Xu

Subject: Computing technology, computer technology

Liangkai Hang, Junjie Yao, Zhiwei Bai, Tianyi Chen, Yang Chen, Rongjie Diao, Hezhou Li, Pengxiao Lin, Zhiwei Wang, Cheng Xu, Zhongwang Zhang, Zhangchen Zhou, Zhiyu Li, Zehao Lin, Kai Chen, Feiyu Xiong, Yaoyu Zhang, Weinan E, Hongkang Yang, Zhi-Qin John Xu. Scalable Complexity Control Facilitates Reasoning Ability of LLMs [EB/OL]. (2025-05-28) [2025-06-18]. https://arxiv.org/abs/2505.23013.
