WebSynthesis: World-Model-Guided MCTS for Efficient WebUI-Trajectory Synthesis
Recent advancements in large language models (LLMs) have significantly improved the capabilities of web agents. However, effectively navigating complex and dynamic web environments still requires more advanced trajectory-level planning and execution. Prior studies have addressed self-improving agents by collecting extensive GUI trajectories from real-environment interactions. Despite their effectiveness, these approaches encounter two critical challenges: (1) Uncontrollable environment states, where real or sandboxed web environments often yield unstable and non-deterministic feedback, complicating the reproduction and debugging of agent behaviors; and (2) High API costs, as generating even a single interaction trajectory can involve hundreds of queries, leading to considerable API usage and computational expenses. To address these limitations and enable scalable self-improvement for agents, we propose WebSynthesis, a novel framework for trajectory synthesis and training. WebSynthesis leverages a learned world model to simulate virtual web environments, allowing a policy agent to perform efficient and reversible tree-based planning. This approach supports the large-scale generation of diverse and high-quality trajectories, which are subsequently utilized to refine the agent's policy. Experimental results demonstrate that an agent trained using WebSynthesis on a small-scale synthetic dataset achieves performance comparable to or even surpassing that of models trained on large-scale real-world data.
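To make the idea of world-model-guided tree search concrete, here is a minimal sketch of MCTS in which every state transition comes from a simulated model rather than a live website. It is an illustration under stated assumptions, not the authors' implementation: the three-page toy site, the `TRANSITIONS` lookup table standing in for a learned world model, the action names, the reward of 1.0 on reaching `GOAL`, and the UCB1 constant are all hypothetical.

```python
import math
import random
from dataclasses import dataclass, field

# Hypothetical toy site. In the paper the world model is learned; here a
# lookup table plays that role so the example stays self-contained.
TRANSITIONS = {
    ("home", "click_search"): "results",
    ("results", "click_item"): "item",
    ("results", "click_back"): "home",
    ("home", "click_about"): "about",
}
ACTIONS = ["click_search", "click_item", "click_back", "click_about"]
GOAL = "item"


def world_model(state, action):
    """Predict the next page without touching a real environment."""
    return TRANSITIONS.get((state, action), state)  # unknown action: page unchanged


@dataclass
class Node:
    state: str
    parent: "Node | None" = None
    action: "str | None" = None
    children: dict = field(default_factory=dict)
    visits: int = 0
    value: float = 0.0


def ucb(child, parent_visits, c=1.4):
    """UCB1 score: mean reward plus an exploration bonus."""
    if child.visits == 0:
        return float("inf")
    return child.value / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)


def rollout(state, rng, depth=4):
    """Random simulated rollout; reward 1.0 if the goal page is reached."""
    for _ in range(depth):
        if state == GOAL:
            return 1.0
        state = world_model(state, rng.choice(ACTIONS))
    return 1.0 if state == GOAL else 0.0


def mcts(root_state, iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # Selection: descend through fully expanded, non-terminal nodes.
        while node.state != GOAL and len(node.children) == len(ACTIONS):
            parent = node
            node = max(parent.children.values(), key=lambda ch: ucb(ch, parent.visits))
        # Expansion: try one untried action, simulated by the world model.
        if node.state != GOAL:
            action = rng.choice([a for a in ACTIONS if a not in node.children])
            child = Node(world_model(node.state, action), parent=node, action=action)
            node.children[action] = child
            node = child
        # Simulation and backpropagation.
        reward = rollout(node.state, rng)
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return root


def best_trajectory(root):
    """Follow the most-visited child at each level to extract one trajectory."""
    path, node = [], root
    while node.children and node.state != GOAL:
        node = max(node.children.values(), key=lambda ch: ch.visits)
        path.append(node.action)
    return path


if __name__ == "__main__":
    tree = mcts("home")
    print(best_trajectory(tree))
```

Because the search only ever calls `world_model`, backing up to an earlier node costs nothing and has no side effects, which is the "reversible" property the abstract refers to. Trajectories extracted from such a tree could then serve as synthetic training data for the policy.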
Yifei Gao, Junhong Ye, Jiaqi Wang, Jitao Sang
Subject: Computing technology, computer technology
Yifei Gao, Junhong Ye, Jiaqi Wang, Jitao Sang. WebSynthesis: World-Model-Guided MCTS for Efficient WebUI-Trajectory Synthesis [EB/OL]. (2025-07-06) [2025-07-21]. https://arxiv.org/abs/2507.04370.