Dual Natural Gradient Descent for Scalable Training of Physics-Informed Neural Networks
Natural-gradient methods markedly accelerate the training of Physics-Informed Neural Networks (PINNs), yet their Gauss--Newton update must be solved in parameter space, incurring a prohibitive $O(n^3)$ time complexity, where $n$ is the number of trainable network weights. We show that exactly the same step can instead be formulated in a generally smaller residual space of size $m = \sum_{\gamma} N_{\gamma} d_{\gamma}$, where each residual class $\gamma$ (e.g., PDE interior, boundary, initial data) contributes $N_{\gamma}$ collocation points of output dimension $d_{\gamma}$. Building on this insight, we introduce \textit{Dual Natural Gradient Descent} (D-NGD). D-NGD computes the Gauss--Newton step in residual space, augments it with a geodesic-acceleration correction at negligible extra cost, and provides both a dense direct solver for modest $m$ and a Nystr\"om-preconditioned conjugate-gradient solver for larger $m$. Experimentally, D-NGD scales second-order PINN optimization to networks with up to 12.8 million parameters, delivers final $L^2$ errors one to three orders of magnitude lower than first-order methods (Adam, SGD) and quasi-Newton methods, and -- crucially -- enables natural-gradient training of PINNs at this scale on a single GPU.
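To make the residual-space reformulation concrete, the sketch below illustrates the standard push-through identity $(J^\top J + \lambda I)^{-1} J^\top r = J^\top (J J^\top + \lambda I)^{-1} r$, which lets the Gauss--Newton step be obtained from an $m \times m$ solve instead of an $n \times n$ one. This is a minimal JAX illustration under assumed names (`residual_fn`, `dual_gauss_newton_step`, `damping`), not the paper's implementation; it materializes the full Jacobian and uses a dense solve, corresponding only loosely to the direct-solver regime for modest $m$, and omits the geodesic-acceleration correction and the Nystr\"om-preconditioned CG variant.

\begin{verbatim}
import jax
import jax.numpy as jnp

def dual_gauss_newton_step(residual_fn, theta, damping=1e-6):
    """Gauss-Newton step computed in residual space (size m) rather than
    parameter space (size n), via the identity
        (J^T J + lam I)^{-1} J^T r = J^T (J J^T + lam I)^{-1} r.
    Hypothetical sketch: residual_fn stacks all residual classes
    (PDE interior, boundary, initial data) at their collocation points
    into a single vector of size m; theta is the flattened parameter
    vector of size n. Forming J densely is only viable for small examples.
    """
    r = residual_fn(theta)              # stacked residual vector, shape (m,)
    J = jax.jacfwd(residual_fn)(theta)  # residual Jacobian, shape (m, n)
    gram = J @ J.T                      # m x m Gram matrix in residual space
    coeffs = jnp.linalg.solve(gram + damping * jnp.eye(r.shape[0]), r)
    return -J.T @ coeffs                # parameter-space update, shape (n,)
\end{verbatim}

When $m$ grows too large for a dense factorization, the paper's approach replaces this direct solve with a Nystr\"om-preconditioned conjugate-gradient solver; the identity above is unchanged, only the inner linear solve differs.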
Anas Jnini, Flavio Vella
Computational physics; computer science and technology
Anas Jnini, Flavio Vella. Dual Natural Gradient Descent for Scalable Training of Physics-Informed Neural Networks [EB/OL]. (2025-05-27) [2025-07-03]. https://arxiv.org/abs/2505.21404