Unsupervised Meta-Testing with Conditional Neural Processes for Hybrid Meta-Reinforcement Learning

Source: arXiv
Abstract

We introduce Unsupervised Meta-Testing with Conditional Neural Processes (UMCNP), a novel hybrid few-shot meta-reinforcement learning (meta-RL) method that combines, yet keeps distinctly separate, parameterized policy gradient-based (PPG) and task inference-based few-shot meta-RL. Tailored for settings where the reward signal is missing during meta-testing, our method increases sample efficiency without requiring additional samples in meta-training. UMCNP leverages the efficiency and scalability of Conditional Neural Processes (CNPs) to reduce the number of online interactions required in meta-testing. During meta-training, samples previously collected through PPG meta-RL are reused offline to learn task inference. UMCNP infers the latent representation of the transition dynamics model from a single rollout of a test task with unknown parameters. This allows us to generate rollouts for self-adaptation by interacting with the learned dynamics model. We demonstrate that our method can adapt to an unseen test task using significantly fewer samples during meta-testing than the baselines in the 2D-Point Agent and continuous control meta-RL benchmarks, namely cartpole with unknown angle sensor bias and a walker agent with randomized dynamics parameters.
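The abstract's core mechanism (infer a task representation from one rollout, then roll out inside the learned dynamics model) can be illustrated with a minimal Conditional Neural Process sketch. This is not the authors' implementation: the class `TinyCNPDynamics`, the helper `imagined_rollout`, the layer sizes, and the randomly initialized (untrained) weights are all assumptions chosen only to show the data flow, and the training loop and policy adaptation step are omitted.

```python
import math
import random


def init_linear(n_in, n_out, rng):
    """Random weights for one dense layer (untrained; illustration only)."""
    scale = 1.0 / math.sqrt(n_in)
    w = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


def linear(layer, x):
    w, b = layer
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def relu(x):
    return [max(0.0, v) for v in x]


class TinyCNPDynamics:
    """Hypothetical CNP over transitions (s, a, s'); not the paper's architecture."""

    def __init__(self, state_dim, action_dim, latent_dim=8, hidden=16, seed=0):
        rng = random.Random(seed)
        self.state_dim = state_dim
        self.enc1 = init_linear(2 * state_dim + action_dim, hidden, rng)
        self.enc2 = init_linear(hidden, latent_dim, rng)
        self.dec1 = init_linear(latent_dim + state_dim + action_dim, hidden, rng)
        self.dec2 = init_linear(hidden, 2 * state_dim, rng)

    def encode(self, context):
        """Embed each context transition, then mean-aggregate (order-invariant)."""
        reps = [linear(self.enc2, relu(linear(self.enc1, s + a + s_next)))
                for s, a, s_next in context]
        return [sum(col) / len(reps) for col in zip(*reps)]

    def predict(self, r, s, a):
        """Predict mean and std of the next state for query (s, a) given r."""
        out = linear(self.dec2, relu(linear(self.dec1, r + s + a)))
        mean = out[:self.state_dim]
        # Softplus plus a small floor keeps the predicted std strictly positive.
        std = [0.01 + math.log1p(math.exp(v)) for v in out[self.state_dim:]]
        return mean, std


def imagined_rollout(model, r, s0, policy, horizon):
    """Roll the policy forward inside the model instead of the real environment."""
    states, s = [s0], s0
    for _ in range(horizon):
        s, _ = model.predict(r, s, policy(s))  # follow the predicted mean
        states.append(s)
    return states


# Usage: transitions from a single (made-up) test-task rollout form the context.
model = TinyCNPDynamics(state_dim=2, action_dim=1)
context = [([0.0, 0.0], [1.0], [0.1, 0.0]), ([0.1, 0.0], [1.0], [0.2, 0.1])]
r = model.encode(context)
trajectory = imagined_rollout(model, r, [0.0, 0.0], lambda s: [0.5], horizon=5)
```

In this sketch the aggregated vector `r` plays the role of the latent task representation, and no reward signal is used anywhere, which matches the unsupervised meta-testing setting described above. How the generated rollouts are turned into policy updates is left out because the abstract does not specify it.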

Suzan Ece Ada, Emre Ugur

DOI: 10.1109/LRA.2024.3443496

Computing technology, computer technology

Suzan Ece Ada, Emre Ugur. Unsupervised Meta-Testing with Conditional Neural Processes for Hybrid Meta-Reinforcement Learning [EB/OL]. (2025-06-04) [2025-07-16]. https://arxiv.org/abs/2506.04399.
