
From Spikes to Heavy Tails: Unveiling the Spectral Evolution of Neural Networks

Source: arXiv
Abstract

Training strategies for modern deep neural networks (NNs) tend to induce a heavy-tailed (HT) empirical spectral density (ESD) in the layer weights. While previous efforts have shown that the HT phenomenon correlates with good generalization in large NNs, a theoretical explanation of its occurrence is still lacking. In particular, understanding the conditions that lead to this phenomenon can shed light on the interplay between generalization and weight spectra. Our work aims to bridge this gap by presenting a simple yet rich setting for modeling the emergence of HT ESDs. Specifically, we present a theory-informed setup for 'crafting' heavy tails in the ESD of two-layer NNs and provide a systematic analysis of HT ESD emergence without any gradient noise. This is the first work to analyze a noise-free setting, and we also incorporate optimizer-dependent (GD/Adam) large learning rates into the HT ESD analysis. Our results highlight the role of learning rates in shaping the Bulk+Spike and HT forms of the ESDs in the early phase of training, which can facilitate generalization in the two-layer NN. These observations shed light on the behavior of large-scale NNs, albeit in a much simpler setting.
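
For readers unfamiliar with the term, the ESD of a weight matrix can be illustrated with a minimal NumPy sketch. It is not the paper's experimental setup: the helper name `empirical_spectral_density`, the matrix sizes, and the hand-made rank-one perturbation used to imitate a spike are all assumptions chosen for illustration, and no training is performed.

```python
import numpy as np

def empirical_spectral_density(W, bins=50):
    """Return the eigenvalues of W^T W / n and a histogram estimate of their density.

    W is an (n, d) weight matrix. The helper name is hypothetical.
    """
    n, _ = W.shape
    # The squared singular values of W are the eigenvalues of W^T W.
    singular_values = np.linalg.svd(W, compute_uv=False)
    eigs = singular_values**2 / n
    density, edges = np.histogram(eigs, bins=bins, density=True)
    return eigs, density, edges

rng = np.random.default_rng(0)
n, d = 512, 256  # illustrative sizes, not taken from the paper

# Random Gaussian weights: the spectrum forms a compact "bulk".
W_init = rng.standard_normal((n, d))
eigs_init, density_init, edges_init = empirical_spectral_density(W_init)

# Assumption: a hand-made rank-one perturbation stands in for a "spike".
# It only shows what a Bulk+Spike spectrum looks like numerically.
u = rng.standard_normal(n)
u /= np.linalg.norm(u)
v = rng.standard_normal(d)
v /= np.linalg.norm(v)
W_spiked = W_init + 5.0 * np.sqrt(n) * np.outer(u, v)
eigs_spiked, _, _ = empirical_spectral_density(W_spiked)

print("largest eigenvalue (random weights):", eigs_init.max())
print("largest eigenvalue (with rank-one spike):", eigs_spiked.max())
```

In this sketch the largest eigenvalue of the perturbed matrix separates clearly from the bulk of the random matrix. A heavy-tailed ESD would instead show many eigenvalues spread far beyond the bulk edge, which a single rank-one perturbation does not reproduce.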

Yaoqing Yang, Vignesh Kothapalli, Tianyu Pang, Shenyang Deng, Zongmin Liu

Computing technology, computer technology

Yaoqing Yang, Vignesh Kothapalli, Tianyu Pang, Shenyang Deng, Zongmin Liu. From Spikes to Heavy Tails: Unveiling the Spectral Evolution of Neural Networks [EB/OL]. (2025-08-10) [2025-08-24]. https://arxiv.org/abs/2406.04657.
