Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning
Effective visual representation learning is crucial for reinforcement learning (RL) agents to extract task-relevant information from raw sensory inputs and generalize across diverse environments. However, existing RL benchmarks lack the ability to systematically evaluate representation learning capabilities in isolation from other learning challenges. To address this gap, we introduce the Sliding Puzzles Gym (SPGym), a novel benchmark that transforms the classic 8-tile puzzle into a visual RL task with images drawn from arbitrarily large datasets. SPGym's key innovation lies in its ability to precisely control representation learning complexity through adjustable grid sizes and image pools, while maintaining fixed environment dynamics, observation, and action spaces. This design enables researchers to isolate and scale the visual representation challenge independently of other learning components. Through extensive experiments with model-free and model-based RL algorithms, we uncover fundamental limitations in current methods' ability to handle visual diversity. As we increase the pool of possible images, all algorithms exhibit in- and out-of-distribution performance degradation, with sophisticated representation learning techniques often underperforming simpler approaches like data augmentation. These findings highlight critical gaps in visual representation learning for RL and establish SPGym as a valuable tool for driving progress in robust, generalizable decision-making systems.
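The design described in the abstract (tiles cut from an image, a configurable grid size, and a pool of images sampled per episode) can be illustrated with a minimal sketch. This is not SPGym's actual API: the class `SlidingPuzzle`, its parameters (`grid_size`, `image_pool`, `obs_px`, `seed`), the action encoding, and the random placeholder image are all hypothetical names and choices made for illustration only.

```python
import numpy as np


class SlidingPuzzle:
    """Hypothetical minimal image-based sliding puzzle (not SPGym's real API)."""

    # Assumed action encoding: move the blank up, down, left, right.
    MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}

    def __init__(self, grid_size=3, image_pool=None, obs_px=24, seed=0):
        assert obs_px % grid_size == 0, "obs_px must be divisible by grid_size"
        self.n = grid_size
        self.obs_px = obs_px
        self.tile = obs_px // grid_size
        self.rng = np.random.default_rng(seed)
        if image_pool is None:
            # Placeholder pool: one random RGB image standing in for a dataset.
            image_pool = [self.rng.integers(0, 256, size=(obs_px, obs_px, 3),
                                            dtype=np.uint8)]
        self.image_pool = image_pool
        self.reset()

    def reset(self, shuffle_moves=50):
        # Sample one image from the pool for this episode.
        self.image = self.image_pool[self.rng.integers(len(self.image_pool))]
        # Tile ids in solved order; the largest id is the blank.
        self.board = np.arange(self.n * self.n).reshape(self.n, self.n)
        # Shuffling with legal moves keeps the puzzle solvable.
        for _ in range(shuffle_moves):
            self.step(int(self.rng.integers(4)))
        return self.render()

    def step(self, action):
        dr, dc = self.MOVES[action]
        r, c = map(int, np.argwhere(self.board == self.n * self.n - 1)[0])
        nr, nc = r + dr, c + dc
        if 0 <= nr < self.n and 0 <= nc < self.n:  # illegal moves are no-ops
            self.board[r, c], self.board[nr, nc] = (self.board[nr, nc],
                                                    self.board[r, c])
        return self.render(), self.is_solved()

    def is_solved(self):
        return bool(np.array_equal(self.board.ravel(),
                                   np.arange(self.n * self.n)))

    def render(self):
        # The observation is the image with its tiles permuted; blank is black.
        t = self.tile
        obs = np.zeros((self.obs_px, self.obs_px, 3), dtype=np.uint8)
        for r in range(self.n):
            for c in range(self.n):
                tid = int(self.board[r, c])
                if tid == self.n * self.n - 1:
                    continue
                sr, sc = divmod(tid, self.n)
                obs[r * t:(r + 1) * t, c * t:(c + 1) * t] = \
                    self.image[sr * t:(sr + 1) * t, sc * t:(sc + 1) * t]
        return obs
```

In this sketch, enlarging `image_pool` changes only which pictures the agent sees, while the puzzle dynamics, the observation shape, and the four actions stay the same; that is the kind of isolation of the visual challenge the abstract describes.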
Bryan L. M. de Oliveira, Luana G. B. Martins, Bruno Brandão, Murilo L. da Luz, Telma W. de L. Soares, Luckeciano C. Melo
Computing technology, computer technology
Bryan L. M. de Oliveira, Luana G. B. Martins, Bruno Brandão, Murilo L. da Luz, Telma W. de L. Soares, Luckeciano C. Melo. Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning [EB/OL]. (2025-07-01) [2025-07-18]. https://arxiv.org/abs/2410.14038.