
Random Matrix Theory for Deep Learning: Beyond Eigenvalues of Linear Models


Source: arXiv
English Abstract

Modern Machine Learning (ML) and Deep Neural Networks (DNNs) often operate on high-dimensional data and rely on overparameterized models, where classical low-dimensional intuitions break down. In particular, the proportional regime, where the data dimension, sample size, and number of model parameters are all large and comparable, gives rise to novel and sometimes counterintuitive behaviors. This paper extends traditional Random Matrix Theory (RMT) beyond eigenvalue-based analysis of linear models to address the challenges posed by nonlinear ML models such as DNNs in this regime. We introduce the concept of High-dimensional Equivalent, which unifies and generalizes both Deterministic Equivalent and Linear Equivalent, to systematically address three technical challenges: high dimensionality, nonlinearity, and the need to analyze generic eigenspectral functionals. Leveraging this framework, we provide precise characterizations of the training and generalization performance of linear models, nonlinear shallow networks, and deep networks. Our results capture rich phenomena, including scaling laws, double descent, and nonlinear learning dynamics, offering a unified perspective on the theoretical understanding of deep learning in high dimensions.
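To make the abstract's "proportional regime" concrete (this illustration is not from the paper itself): when the data dimension p and sample size n are both large and comparable, the eigenvalues of a sample covariance matrix no longer concentrate around those of the population covariance but spread over the Marchenko-Pastur bulk, a basic instance of the eigenvalue-based RMT analysis the abstract refers to. The sketch below is a minimal numerical check, with p = 500 and n = 1000 chosen here as illustrative assumptions.

# Minimal sketch (not from the paper): empirical eigenvalues of a sample
# covariance matrix versus the Marchenko-Pastur density in the proportional
# regime, where dimension p and sample size n grow with p/n -> c fixed.
# The values p = 500, n = 1000 are illustrative assumptions.
import numpy as np

p, n = 500, 1000                      # dimension and sample size, c = p/n = 0.5
c = p / n

X = np.random.randn(p, n)             # i.i.d. standard Gaussian data
S = X @ X.T / n                       # sample covariance matrix (p x p)
eigvals = np.linalg.eigvalsh(S)       # its empirical eigenvalues

# Marchenko-Pastur density on its support [(1 - sqrt(c))^2, (1 + sqrt(c))^2]
lam_minus, lam_plus = (1 - np.sqrt(c))**2, (1 + np.sqrt(c))**2

# Quick check: histogram the empirical spectrum over the MP support and
# compare with the limiting density at the bin centers.
hist, edges = np.histogram(eigvals, bins=20, range=(lam_minus, lam_plus), density=True)
centers = (edges[:-1] + edges[1:]) / 2
mp_density = np.sqrt((lam_plus - centers) * (centers - lam_minus)) / (2 * np.pi * c * centers)
print("empirical support:", eigvals.min(), eigvals.max())
print("MP support:       ", lam_minus, lam_plus)
print("max |histogram - MP density| over bins:", np.abs(hist - mp_density).max())

In the classical regime p/n -> 0 the spectrum would instead collapse to a point mass at 1, which is exactly the low-dimensional intuition the abstract says breaks down.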

Zhenyu Liao, Michael W. Mahoney

Computing Technology, Computer Technology

Zhenyu Liao, Michael W. Mahoney. Random Matrix Theory for Deep Learning: Beyond Eigenvalues of Linear Models [EB/OL]. (2025-06-16) [2025-06-30]. https://arxiv.org/abs/2506.13139.
