Non-Asymptotic Guarantees for Average-Reward Q-Learning with Adaptive
Stepsizes
Cite this article
Zaiwei Chen. Non-Asymptotic Guarantees for Average-Reward Q-Learning with Adaptive Stepsizes [EB/OL]. (2025-04-25) [2025-12-13]. https://arxiv.org/abs/2504.18743.
Subject classification: Computing technology, computer technology