FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control

Source: arXiv
Abstract

Reinforcement learning (RL) has driven significant progress in robotics, but its complexity and long training times remain major bottlenecks. In this report, we introduce FastTD3, a simple, fast, and capable RL algorithm that significantly speeds up training for humanoid robots in popular suites such as HumanoidBench, IsaacLab, and MuJoCo Playground. Our recipe is remarkably simple: we train an off-policy TD3 agent with several modifications -- parallel simulation, large-batch updates, a distributional critic, and carefully tuned hyperparameters. FastTD3 solves a range of HumanoidBench tasks in under 3 hours on a single A100 GPU, while remaining stable during training. We also provide a lightweight and easy-to-use implementation of FastTD3 to accelerate RL research in robotics.
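To make the TD3 starting point of the recipe concrete, the following is a minimal NumPy sketch of the standard TD3 bootstrapped target (target-policy smoothing plus the clipped double-Q minimum), computed over a whole batch at once. It is an illustration under stated assumptions, not the authors' implementation: the function name `td3_target`, the callable arguments, and the default hyperparameters are hypothetical, and the critics here are scalar, whereas the abstract describes a distributional critic.

```python
import numpy as np


def td3_target(reward, done, next_obs, actor_target, q1_target, q2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, act_limit=1.0,
               rng=None):
    """Standard TD3 target for a batch of transitions.

    All names and default values are illustrative assumptions; they are not
    the tuned hyperparameters mentioned in the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Target-policy smoothing: perturb the target action with clipped noise.
    next_action = actor_target(next_obs)
    noise = rng.normal(0.0, noise_std, size=next_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -act_limit, act_limit)

    # Clipped double-Q: take the element-wise minimum of the two target critics.
    q_next = np.minimum(q1_target(next_obs, next_action),
                        q2_target(next_obs, next_action))

    # One-step bootstrapped target; terminal transitions do not bootstrap.
    return reward + gamma * (1.0 - done) * q_next
```

Because every operation is vectorized over the leading batch dimension, the same function applies unchanged to the large batches collected from many parallel simulated environments; only the array shapes grow.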

Younggyo Seo, Carmelo Sferrazza, Haoran Geng, Michal Nauman, Zhao-Heng Yin, Pieter Abbeel

Subject: Computing Technology, Computer Technology

Younggyo Seo, Carmelo Sferrazza, Haoran Geng, Michal Nauman, Zhao-Heng Yin, Pieter Abbeel. FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control [EB/OL]. (2025-05-28) [2025-06-18]. https://arxiv.org/abs/2505.22642.
