
Propagation of Chaos in One-hidden-layer Neural Networks beyond Logarithmic Time

Source: arXiv
Abstract

We study the approximation gap between the dynamics of a polynomial-width neural network and its infinite-width counterpart, both trained using projected gradient descent in the mean-field scaling regime. We demonstrate how to tightly bound this approximation gap through a differential equation governed by the mean-field dynamics. A key factor influencing the growth of the solution to this ODE is the local Hessian of each particle, defined as the derivative of the particle's velocity in the mean-field dynamics with respect to its position. We apply our results to the canonical feature learning problem of estimating a well-specified single-index model; we permit the information exponent to be arbitrarily large, leading to convergence times that grow polynomially in the ambient dimension $d$. We show that, due to a certain "self-concordance" property in these problems, namely that the local Hessian of a particle is bounded by a constant times the particle's velocity, polynomially many neurons are sufficient to closely approximate the mean-field dynamics throughout training.
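To fix notation for the objects named above (this shorthand is ours, not taken from the paper), think of each neuron as a particle $w \in \mathbb{R}^d$ evolving under a velocity field determined by the particle distribution $\mu_t$ of the infinite-width dynamics:

$$\frac{\mathrm{d} w_t}{\mathrm{d} t} = v(w_t; \mu_t), \qquad H(w; \mu_t) := \nabla_w\, v(w; \mu_t).$$

The "self-concordance" property then reads, schematically, $\|H(w; \mu_t)\| \le C\, \|v(w; \mu_t)\|$ for some constant $C$: the velocity field varies slowly wherever a particle itself moves slowly, which keeps finite-width deviations from compounding over long training horizons.

The sketch below illustrates the training setup the abstract describes: a one-hidden-layer network in mean-field scaling, fit to a well-specified single-index target by projected gradient descent with the neurons constrained to the unit sphere. Every concrete choice here (tanh activation, dimensions, step size, fresh-batch gradient estimates) is an illustrative assumption rather than the paper's setting; in particular, the paper allows activations with arbitrarily large information exponent, whereas tanh has information exponent 1.

import numpy as np

rng = np.random.default_rng(0)
d, m = 32, 512                      # ambient dimension, network width
lr, steps, batch = 0.1, 2000, 256   # step size, iterations, fresh samples per step

theta = np.zeros(d); theta[0] = 1.0  # hidden direction of the single-index target
act = np.tanh                        # placeholder activation (information exponent 1)
dact = lambda z: 1.0 - np.tanh(z) ** 2

# Initialize m neurons ("particles") uniformly on the unit sphere.
W = rng.standard_normal((m, d))
W /= np.linalg.norm(W, axis=1, keepdims=True)

for t in range(steps):
    X = rng.standard_normal((batch, d))
    y = act(X @ theta)                   # well-specified single-index labels
    pre = X @ W.T                        # (batch, m) pre-activations
    resid = act(pre).mean(axis=1) - y    # mean-field prediction: 1/m * sum_j sigma(w_j . x)
    # Per-particle velocity estimate: -E[(f(x) - y) * sigma'(w_j . x) * x];
    # the 1/m factor from the mean-field scaling is absorbed into the step size.
    V = -(dact(pre) * resid[:, None]).T @ X / batch
    W += lr * V
    W /= np.linalg.norm(W, axis=1, keepdims=True)   # projection step: back onto the sphere

print("mean overlap with theta:", float((W @ theta).mean()))

In a run like this, the mean overlap with $\theta$ drifting toward 1 marks the feature-learning phase; the paper's question is how large the width $m$ must be for this finite-particle system to track its $m \to \infty$ limit over horizons that may grow polynomially in $d$.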

Margalit Glasgow, Denny Wu, Joan Bruna

Computing Technology, Computer Technology

Margalit Glasgow, Denny Wu, Joan Bruna. Propagation of Chaos in One-hidden-layer Neural Networks beyond Logarithmic Time [EB/OL]. (2025-04-17) [2025-06-15]. https://arxiv.org/abs/2504.13110.
