Neural Co-state Projection Regulator: A Model-free Paradigm for Real-time Optimal Control with Input Constraints
Learning-based approaches, notably reinforcement learning (RL), have shown promise for solving optimal control tasks without explicit system models. However, these approaches are often sample-inefficient, sensitive to reward design and hyperparameters, and prone to poor generalization, especially under input constraints. To address these challenges, we introduce the neural co-state projection regulator (NCPR), a model-free learning-based optimal control framework that is grounded in Pontryagin's Minimum Principle (PMP) and capable of solving quadratic regulator problems in nonlinear control-affine systems with input constraints. In this framework, a neural network (NN) is trained in a self-supervised setting to take the current state of the system as input and predict a finite-horizon trajectory of projected co-states (i.e., the co-state weighted by the system's input gain). Subsequently, only the first element of the NN's prediction is extracted to solve a lightweight quadratic program (QP). This workflow is executed in a feedback control setting, allowing real-time computation of control actions that satisfy both input constraints and first-order optimality conditions. We test the proposed learning-based model-free quadratic regulator on (1) a reference-tracking problem for a unicycle-model robot and (2) a pendulum swing-up task. For comparison, RL is applied to both tasks, and for context, a model-based controller is applied to the unicycle example. Our method demonstrates superior generalizability with respect to both unseen system states and varying input constraints, and it also shows improved sample efficiency.
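As a rough illustration of the last step described above (turning the first predicted projected co-state into a constrained control action), the sketch below makes several assumptions that the abstract does not state: the running cost contains an input term u^T R u with diagonal, positive R; the input constraints are simple box bounds; and the projected co-state is mu = g(x)^T lambda. Under those assumptions the QP min_u u^T R u + mu^T u separates per input channel, so its solution is the clipped unconstrained minimizer. The function name and arguments are hypothetical and are not taken from the paper.

```python
import numpy as np


def control_from_projected_costate(mu, r_diag, u_min, u_max):
    """Solve min_u  u^T R u + mu^T u  subject to  u_min <= u <= u_max.

    Assumptions (not from the paper): R = diag(r_diag) with positive
    entries, and box input constraints. In that case the problem is
    separable, so clipping the unconstrained minimizer is exact.
    """
    mu = np.asarray(mu, dtype=float)
    r_diag = np.asarray(r_diag, dtype=float)
    if np.any(r_diag <= 0):
        raise ValueError("r_diag must contain strictly positive weights")
    # Stationarity of the input-dependent part: 2 * R * u + mu = 0
    u_unconstrained = -mu / (2.0 * r_diag)
    # Project onto the box; exact for a separable convex quadratic
    return np.clip(u_unconstrained, u_min, u_max)


# Hypothetical first element of a predicted projected co-state trajectory
mu_first = np.array([2.0, -8.0])
u = control_from_projected_costate(mu_first, r_diag=[1.0, 1.0],
                                   u_min=-1.0, u_max=1.0)
```

For a non-diagonal R or more general polytopic constraints, the closed-form clip no longer applies and a generic QP solver would be needed instead.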
Lihan Lian, Uduak Inyang-Udoh
Automation technology and automation equipment; fundamental theory of automation
Lihan Lian, Uduak Inyang-Udoh. Neural Co-state Projection Regulator: A Model-free Paradigm for Real-time Optimal Control with Input Constraints [EB/OL]. (2025-08-01) [2025-08-11]. https://arxiv.org/abs/2508.00283.