Stochastic gradient with least-squares control variates
The stochastic gradient descent (SGD) method is a widely used approach for solving stochastic optimization problems, but its convergence is typically slow. Existing variance reduction techniques, such as SAGA, improve convergence by leveraging stored gradient information; however, they are restricted to settings where the objective function is a finite sum, and their performance degrades when the number of terms in the sum is large. In this work, we propose a novel approach that is well suited when the objective is given by an expectation over random variables with a continuous probability distribution. Our method constructs a control variate by fitting a linear model to past gradient evaluations using weighted discrete least-squares, effectively reducing variance while preserving computational efficiency. We establish theoretical sublinear convergence guarantees for strongly convex objectives and demonstrate the method's effectiveness through numerical experiments on random PDE-constrained optimization problems.
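To illustrate the general idea of a least-squares control variate for SGD, the following is a minimal sketch, not the paper's actual construction. It assumes a toy quadratic objective F(x) = E[0.5‖x − ξ‖²] with ξ Gaussian of known mean, linear features [ξ, 1] for the fitted model, a sliding window of past gradient samples, and exponentially decaying least-squares weights; all of these choices are illustrative assumptions.

```python
# Illustrative sketch only: SGD with a control variate built from a weighted
# least-squares linear fit to past gradient samples. The objective, features,
# window size, and weighting scheme below are assumptions, not the paper's method.
import numpy as np

rng = np.random.default_rng(0)
d = 5                      # dimension of the decision variable x and of xi
mu = np.ones(d)            # assumed known mean of xi, so E[h(xi)] is exact
eta = 0.05                 # step size
n_iter = 500
window = 50                # number of past samples kept for the least-squares fit
decay = 0.95               # weight decay for older (stale) gradient samples

def stoch_grad(x, xi):
    """Unbiased gradient sample of F(x) = E[0.5*||x - xi||^2]: grad = x - xi."""
    return x - xi

x = np.zeros(d)
xis, grads = [], []        # stored past samples (xi_i, g_i)

for k in range(n_iter):
    xi = mu + rng.standard_normal(d)
    g = stoch_grad(x, xi)

    if len(xis) >= d + 1:
        # Weighted least-squares fit of a linear model h(xi) = A @ xi + b
        # to the stored gradient samples; older samples get smaller weights.
        Xi = np.array(xis)                            # (m, d)
        G = np.array(grads)                           # (m, d)
        w = decay ** np.arange(len(xis) - 1, -1, -1)  # newest sample has weight 1
        Phi = np.hstack([Xi, np.ones((len(xis), 1))]) # features [xi, 1]
        W = np.sqrt(w)[:, None]
        coef, *_ = np.linalg.lstsq(W * Phi, W * G, rcond=None)
        A, b = coef[:d].T, coef[d]                    # h(xi) = A @ xi + b

        # Control-variate correction: h(xi) - E[h(xi)] = A @ (xi - mu) has zero
        # mean given the past samples, so the corrected estimator stays unbiased.
        g_hat = g - A @ (xi - mu)
    else:
        g_hat = g                                     # not enough samples yet

    x = x - eta * g_hat

    xis.append(xi)
    grads.append(g)
    if len(xis) > window:
        xis.pop(0)
        grads.pop(0)

print("final iterate (should be close to mu):", x)
```

In this toy problem the gradient samples depend linearly on ξ, so the fitted model captures almost all of the noise and the corrected estimator is nearly deterministic; for general objectives the linear fit only removes the part of the variance explained by the chosen features.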
Fabio Nobile, Matteo Raviola, Nathan Schaeffer
Computing technology; computer technology
Fabio Nobile, Matteo Raviola, Nathan Schaeffer. Stochastic gradient with least-squares control variates [EB/OL]. (2025-07-28) [2025-08-10]. https://arxiv.org/abs/2507.20981.