Efficient Stochastic BFGS methods Inspired by Bayesian Principles
Quasi-Newton methods are ubiquitous in deterministic local search due to their efficiency and low computational cost. This class of methods uses the history of gradient evaluations to approximate second-order derivatives. However, only noisy gradient observations are accessible in stochastic optimization; thus, deriving quasi-Newton methods in this setting is challenging. Whereas most existing quasi-Newton methods for stochastic optimization rely on deterministic equations that are modified to circumvent noise, we propose a new approach inspired by Bayesian inference to assimilate noisy gradient information and derive stochastic counterparts of standard quasi-Newton methods. We focus on deriving stochastic BFGS and L-BFGS, but our methodology can also be employed to derive stochastic analogs of other quasi-Newton methods. The resulting stochastic BFGS (S-BFGS) and stochastic L-BFGS (L-S-BFGS) can effectively learn an inverse-Hessian approximation even with small batch sizes. For a problem of dimension $d$, the iteration cost of S-BFGS is $\mathcal{O}(d^2)$, and the cost of L-S-BFGS is $\mathcal{O}(d)$. Numerical experiments on problems with dimension up to $30{,}720$ demonstrate the efficiency and robustness of the proposed method.
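To make the setting concrete, below is a minimal, illustrative Python sketch of a classical BFGS inverse-Hessian update driven by mini-batch (noisy) gradients. It is not the authors' Bayesian assimilation scheme (whose details are in the paper); it only shows the $\mathcal{O}(d^2)$ per-iteration structure the abstract refers to, and why noise is problematic: with noisy gradient differences, the curvature condition $s^\top y > 0$ often fails and the update must be skipped or damped. All function and variable names here are hypothetical.

```python
import numpy as np

def bfgs_inverse_update(H, s, y, eps=1e-10):
    """Classical BFGS update of the inverse-Hessian approximation H.

    s = x_{k+1} - x_k (step), y = g_{k+1} - g_k (gradient difference).
    Skips the update when the curvature condition s^T y > 0 fails,
    which happens frequently when gradients are noisy.
    """
    sy = s @ y
    if sy <= eps:               # curvature condition violated; keep H unchanged
        return H
    rho = 1.0 / sy
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)   # O(d^2) memory and work

def stochastic_bfgs(grad_fn, x0, n_iters=100, lr=0.1, batch_size=32, rng=None):
    """Toy stochastic quasi-Newton loop: x_{k+1} = x_k - lr * H_k g_k."""
    rng = rng or np.random.default_rng(0)
    x = x0.copy()
    H = np.eye(len(x0))
    g = grad_fn(x, batch_size, rng)
    for _ in range(n_iters):
        x_new = x - lr * (H @ g)
        g_new = grad_fn(x_new, batch_size, rng)
        H = bfgs_inverse_update(H, x_new - x, g_new - g)
        x, g = x_new, g_new
    return x

# Hypothetical usage: mini-batch gradients of a least-squares objective
# f(x) = 0.5 * ||A x - b||^2 / n, subsampled over rows of A.
d = 10
A = np.random.default_rng(1).standard_normal((50, d))
b = np.ones(50)

def noisy_grad(x, batch_size, rng):
    idx = rng.choice(len(b), size=batch_size, replace=False)
    return A[idx].T @ (A[idx] @ x - b[idx]) / batch_size

x_opt = stochastic_bfgs(noisy_grad, np.zeros(d))
```

The L-BFGS variant replaces the dense matrix H with a short history of (s, y) pairs and a two-loop recursion, which is what reduces the per-iteration cost from $\mathcal{O}(d^2)$ to $\mathcal{O}(d)$.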
André Carlon, Luis Espath, Raúl Tempone
Computing technology, computer technology
André Carlon, Luis Espath, Raúl Tempone. Efficient Stochastic BFGS methods Inspired by Bayesian Principles [EB/OL]. (2025-07-10) [2025-07-21]. https://arxiv.org/abs/2507.07729.