Generalization Through Growth: Hidden Dynamics Controls Depth Dependence
Recent theory has reduced the depth dependence of generalization bounds from exponential to polynomial and even depth-independent rates, yet these results remain tied to specific architectures and Euclidean inputs. We present a unified framework for arbitrary pseudo-metric spaces in which a depth-\(k\) network is the composition of continuous hidden maps \(f:\mathcal{X}\to \mathcal{X}\) and an output map \(h:\mathcal{X}\to \mathbb{R}\). The resulting bound \(O(\sqrt{(\alpha + \log \beta(k))/n})\) isolates the sole depth contribution in \(\beta(k)\), the word-ball growth of the semigroup generated by the hidden layers. By Gromov's theorem, polynomial (resp. exponential) growth corresponds to virtually nilpotent (resp. expanding) dynamics, revealing a geometric dichotomy behind existing \(O(\sqrt{k})\) (sublinear depth) and \(\tilde{O}(1)\) (depth-independent) rates. We further provide covering-number estimates showing that expanding dynamics yield an exponential parameter saving via compositional expressivity. Our results decouple specification from implementation, offering architecture-agnostic and dynamical-systems-aware guarantees applicable to modern deep-learning paradigms such as test-time inference and diffusion models.
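The depth-controlling quantity \(\beta(k)\) counts the distinct maps expressible as compositions of at most \(k\) hidden layers. As an illustrative sketch (not the paper's construction), the following Python snippet computes word-ball sizes for a semigroup generated by maps on a finite set, standing in for hidden layers; the generators (a cyclic shift versus a pair of affine doubling maps modulo 64) are hypothetical examples chosen to exhibit linear versus exponential growth.

```python
def word_ball_growth(generators, k):
    """Return [beta(1), ..., beta(k)], where beta(j) is the number of
    distinct maps expressible as compositions of at most j generators
    (the empty word, i.e. the identity, is included in every ball)."""
    n = len(generators[0])
    identity = tuple(range(n))
    ball = {identity}       # all maps reached so far
    frontier = {identity}   # maps first reached at the previous length
    sizes = []
    for _ in range(k):
        new = set()
        for w in frontier:
            for g in generators:
                comp = tuple(g[w[x]] for x in range(n))  # g composed after w
                if comp not in ball:
                    new.add(comp)
        ball |= new
        frontier = new
        sizes.append(len(ball))
    return sizes

m = 64
# One rotation generator: the ball only gains one new power per step (linear growth).
shift = tuple((x + 1) % m for x in range(m))
# Two affine doubling maps: length-j words realize x -> 2^j x + c for all
# 2^j values of c, so the ball roughly doubles per step (exponential growth).
double = tuple((2 * x) % m for x in range(m))
double1 = tuple((2 * x + 1) % m for x in range(m))

print(word_ball_growth([shift], 8))             # [2, 3, 4, 5, 6, 7, 8, 9]
print(word_ball_growth([double, double1], 6))   # [3, 7, 15, 31, 63, 127]
```

On a finite state space the growth eventually saturates, but the pre-saturation regime mirrors the dichotomy in the bound: a logarithmic \(\log \beta(k)\) term stays \(O(\log k)\) under polynomial growth and becomes \(O(k)\) under exponential growth.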
Sho Sonoda, Yuka Hashimoto, Isao Ishikawa, Masahiro Ikeda
Computing Technology, Computer Technology
Sho Sonoda, Yuka Hashimoto, Isao Ishikawa, Masahiro Ikeda. Generalization Through Growth: Hidden Dynamics Controls Depth Dependence [EB/OL]. (2025-05-20) [2025-08-02]. https://arxiv.org/abs/2505.15064.