|国家预印本平台
首页|Quantum Algorithms for Finite-horizon Markov Decision Processes

Quantum Algorithms for Finite-horizon Markov Decision Processes

Quantum Algorithms for Finite-horizon Markov Decision Processes

来源:Arxiv_logoArxiv
英文摘要

In this work, we design quantum algorithms that are more efficient than classical algorithms to solve time-dependent and finite-horizon Markov Decision Processes (MDPs) in two distinct settings: (1) In the exact dynamics setting, where the agent has full knowledge of the environment's dynamics (i.e., transition probabilities), we prove that our $\textbf{Quantum Value Iteration (QVI)}$ algorithm $\textbf{QVI-1}$ achieves a quadratic speedup in the size of the action space $(A)$ compared with the classical value iteration algorithm for computing the optimal policy ($π^{*}$) and the optimal V-value function ($V_{0}^{*}$). Furthermore, our algorithm $\textbf{QVI-2}$ provides an additional speedup in the size of the state space $(S)$ when obtaining near-optimal policies and V-value functions. Both $\textbf{QVI-1}$ and $\textbf{QVI-2}$ achieve quantum query complexities that provably improve upon classical lower bounds, particularly in their dependences on $S$ and $A$. (2) In the generative model setting, where samples from the environment are accessible in quantum superposition, we prove that our algorithms $\textbf{QVI-3}$ and $\textbf{QVI-4}$ achieve improvements in sample complexity over the state-of-the-art (SOTA) classical algorithm in terms of $A$, estimation error $(ε)$, and time horizon $(H)$. More importantly, we prove quantum lower bounds to show that $\textbf{QVI-3}$ and $\textbf{QVI-4}$ are asymptotically optimal, up to logarithmic factors, assuming a constant time horizon.

Bin Luo、Yuwen Huang、Jonathan Allcock、Xiaojun Lin、Shengyu Zhang、John C. S. Lui

计算技术、计算机技术

Bin Luo,Yuwen Huang,Jonathan Allcock,Xiaojun Lin,Shengyu Zhang,John C. S. Lui.Quantum Algorithms for Finite-horizon Markov Decision Processes[EB/OL].(2025-08-07)[2025-08-24].https://arxiv.org/abs/2508.05712.点此复制

评论