Convergence Analysis of the Last Iterate in Distributed Stochastic Gradient Descent with Momentum
Distributed stochastic gradient methods are widely used to preserve data privacy and ensure scalability in large-scale learning tasks. While existing theory on distributed momentum stochastic gradient descent (mSGD) focuses mainly on time-averaged convergence, the more practically relevant last-iterate convergence remains underexplored. In this work, we analyze the last-iterate convergence behavior of distributed mSGD in non-convex settings under the classical Robbins-Monro step-size schedule. We prove both almost sure and $L_2$ convergence of the last iterate and derive the corresponding convergence rates. We further show that momentum can accelerate early-stage convergence, and we provide experiments supporting our theory.
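For context, a minimal sketch of the distributed mSGD update in its standard formulation (the notation here, with $n$ nodes holding local objectives $f_i$, momentum parameter $\beta \in [0,1)$, and step sizes $\epsilon_k$, is illustrative and need not match the paper's exact scheme):
$$m_{k+1}^{(i)} = \beta\, m_k^{(i)} + \nabla f_i\big(x_k;\, \xi_k^{(i)}\big), \qquad x_{k+1} = x_k - \frac{\epsilon_k}{n} \sum_{i=1}^{n} m_{k+1}^{(i)},$$
where $\xi_k^{(i)}$ is the random sample drawn at node $i$ in iteration $k$. The Robbins-Monro schedule requires $\sum_k \epsilon_k = \infty$ and $\sum_k \epsilon_k^2 < \infty$, satisfied, e.g., by $\epsilon_k = c/k$.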
Difei Cheng, Ruinan Jin, Hong Qiao, Bo Zhang
Computing technology; computer technology
Difei Cheng, Ruinan Jin, Hong Qiao, Bo Zhang. Convergence Analysis of the Last Iterate in Distributed Stochastic Gradient Descent with Momentum [EB/OL]. (2025-05-16) [2025-07-16]. https://arxiv.org/abs/2505.10889.