
On the Robustness of Derivative-free Methods for Linear Quadratic Regulator

Source: arXiv
English Abstract

Policy optimization has drawn increasing attention in reinforcement learning, particularly in the context of derivative-free methods for linear quadratic regulator (LQR) problems with unknown dynamics. This paper characterizes the robustness of derivative-free methods for solving an infinite-horizon LQR problem. Specifically, we estimate policy gradients from cost values and study the effect of perturbations on these estimates, where the perturbations may arise from function approximation, measurement noise, etc. We show that under sufficiently small perturbations, the derivative-free methods converge to any pre-specified neighborhood of the optimal policy. Furthermore, we establish explicit bounds on the admissible perturbations and provide the sample complexity of the perturbed derivative-free methods.
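To illustrate the kind of scheme the abstract describes, below is a minimal Python sketch of a derivative-free policy gradient method for a discrete-time LQR problem, using the standard one-point smoothing estimator that builds a gradient estimate purely from cost values. Everything here is an assumption for illustration: the system matrices A, B, Q, R, the finite-horizon cost proxy, the estimator variant, and all step sizes and function names are not taken from the paper.

```python
import numpy as np

# Illustrative 2-state, 1-input system; A, B, Q, R are assumptions, not from the paper.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)
R = np.eye(1)

def lqr_cost(K, x0s, horizon=200):
    """Average finite-horizon cost of u = -K x, a proxy for the infinite-horizon LQR cost."""
    total = 0.0
    for x0 in x0s:
        x, c = x0.copy(), 0.0
        for _ in range(horizon):
            u = -K @ x
            c += x @ Q @ x + u @ R @ u
            x = A @ x + B @ u
        total += c
    return total / len(x0s)

def zeroth_order_gradient(K, x0s, radius=0.05, samples=100, rng=None):
    """One-point smoothing estimator: (d / r^2) * E[C(K + r U) U], U uniform on the sphere.

    Only cost values are used, so any perturbation of the evaluated costs
    (function approximation error, measurement noise) enters the gradient
    estimate directly -- the object whose effect the paper analyzes.
    """
    rng = rng or np.random.default_rng()
    d = K.size
    g = np.zeros_like(K)
    for _ in range(samples):
        U = rng.standard_normal(K.shape)
        U /= np.linalg.norm(U)           # project onto the unit sphere
        g += lqr_cost(K + radius * U, x0s) * U
    return (d / (radius**2 * samples)) * g

rng = np.random.default_rng(0)
x0s = [rng.standard_normal(2) for _ in range(10)]
K = np.array([[0.5, 0.5]])               # assumed initial stabilizing gain
for _ in range(50):
    K -= 1e-5 * zeroth_order_gradient(K, x0s, rng=rng)
```

The one-point estimator above is a common model-free choice but has high variance; two-point (antithetic) variants evaluate the cost at K + rU and K - rU to reduce it. Which variant the paper analyzes is not stated in this abstract.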

Weijian Li, Panagiotis Kounatidis, Zhong-Ping Jiang, Andreas A. Malikopoulos

Subject areas: Fundamental Theory of Automation; Computing Technology and Computer Technology

Weijian Li, Panagiotis Kounatidis, Zhong-Ping Jiang, Andreas A. Malikopoulos. On the Robustness of Derivative-free Methods for Linear Quadratic Regulator [EB/OL]. (2025-06-14) [2025-07-25]. https://arxiv.org/abs/2506.12596.
