On the Mechanisms of Weak-to-Strong Generalization: A Theoretical Perspective

Abstract

Weak-to-strong generalization, where a student model trained on imperfect labels generated by a weaker teacher nonetheless surpasses that teacher, has been widely observed, but the mechanisms that enable it have remained poorly understood. In this paper, through a theoretical analysis of simple models, we uncover three core mechanisms that can drive this phenomenon. First, by analyzing ridge regression, we study the interplay between teacher and student regularization and prove that a student can compensate for a teacher's under-regularization and achieve lower test error. We also analyze the role of the parameterization regime of the models. Second, by analyzing weighted ridge regression, we show that a student model whose regularization structure is better aligned with the target can outperform its teacher. Third, in a nonlinear multi-index setting, we demonstrate that a student can learn easy, task-specific features from the teacher while leveraging its own broader pre-training to learn hard-to-learn features that the teacher cannot capture.
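The first mechanism can be illustrated with a small numerical sketch. The setup below is a toy construction chosen for illustration and is not taken from the paper: it assumes isotropic Gaussian inputs, a teacher and student that share the same raw features, and hypothetical values for the sample sizes, noise level, and regularization strengths. The function names `ridge_fit` and `weak_to_strong_demo` are likewise illustrative.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge estimator: (X^T X + n*lam*I)^{-1} X^T y."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)


def weak_to_strong_demo(n_teacher=100, n_student=400, d=50, noise=1.0,
                        lam_teacher=1e-4, lam_student=0.5, seed=0):
    """Return (teacher_error, student_error) for one random draw.

    All default values are assumptions made for this sketch.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical ground-truth coefficients with squared norm close to 1.
    beta = rng.normal(size=d) / np.sqrt(d)

    # Teacher: fit on noisy ground-truth labels with very weak regularization.
    X_t = rng.normal(size=(n_teacher, d))
    y_t = X_t @ beta + noise * rng.normal(size=n_teacher)
    beta_teacher = ridge_fit(X_t, y_t, lam_teacher)

    # Student: sees only teacher-generated labels on fresh inputs,
    # but applies a stronger ridge penalty.
    X_s = rng.normal(size=(n_student, d))
    y_s = X_s @ beta_teacher
    beta_student = ridge_fit(X_s, y_s, lam_student)

    # For isotropic Gaussian inputs, the excess test risk of a linear
    # predictor equals the squared distance to the true coefficients.
    err_teacher = float(np.sum((beta_teacher - beta) ** 2))
    err_student = float(np.sum((beta_student - beta) ** 2))
    return err_teacher, err_student


if __name__ == "__main__":
    err_t, err_s = weak_to_strong_demo()
    print(f"teacher excess risk: {err_t:.3f}")
    print(f"student excess risk: {err_s:.3f}")
```

In a noisy, weakly regularized regime like this one, the student's extra shrinkage can be expected to reduce the variance inherited from the teacher, though whether the student comes out ahead depends on the chosen parameters and the random draw.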

Authors: Behrad Moniri, Hamed Hassani

Subject classification: Computing technology, computer technology

Recommended citation: Behrad Moniri, Hamed Hassani. On the Mechanisms of Weak-to-Strong Generalization: A Theoretical Perspective [EB/OL]. (2025-05-23) [2025-06-24]. https://arxiv.org/abs/2505.18346.
