MapAgent: Trajectory-Constructed Memory-Augmented Planning for Mobile Task Automation

Source: arXiv
Abstract

The recent advancement of autonomous agents powered by Large Language Models (LLMs) has demonstrated significant potential for automating tasks on mobile devices through graphical user interfaces (GUIs). Despite initial progress, these agents still face challenges when handling complex real-world tasks. These challenges arise from a lack of knowledge about real-life mobile applications in LLM-based agents, which may lead to ineffective task planning and even cause hallucinations. To address these challenges, we propose a novel LLM-based agent framework called MapAgent that leverages memory constructed from historical trajectories to augment current task planning. Specifically, we first propose a trajectory-based memory mechanism that transforms task execution trajectories into a reusable and structured page-memory database. Each page within a trajectory is extracted as a compact yet comprehensive snapshot, capturing both its UI layout and functional context. Secondly, we introduce a coarse-to-fine task planning approach that retrieves relevant pages from the memory database based on similarity and injects them into the LLM planner to compensate for potential deficiencies in understanding real-world app scenarios, thereby achieving more informed and context-aware task planning. Finally, planned tasks are transformed into executable actions through a task executor supported by a dual-LLM architecture, ensuring effective tracking of task progress. Experimental results in real-world scenarios demonstrate that MapAgent achieves superior performance to existing methods. The code will be open-sourced to support further research.
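The retrieval step described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the abstract does not say how pages are represented or how similarity is computed, so the `PageSnapshot` fields, the `PageMemory` class, the `build_planner_prompt` helper, and the bag-of-words cosine similarity used here are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class PageSnapshot:
    """Hypothetical record for one page taken from an execution trajectory."""
    app: str
    ui_layout: str   # short text description of the visible UI elements
    function: str    # what the page is used for


def _bag_of_words(text: str) -> Counter:
    # Lowercased word counts; a stand-in for whatever embedding the paper uses.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class PageMemory:
    """Assumed in-memory store of page snapshots (not the paper's database)."""

    def __init__(self) -> None:
        self._pages: List[PageSnapshot] = []

    def add_trajectory(self, pages: List[PageSnapshot]) -> None:
        # Each page visited during a past task becomes a reusable memory entry.
        self._pages.extend(pages)

    def retrieve(self, task: str, k: int = 2) -> List[PageSnapshot]:
        # Rank stored pages by similarity to the task text; drop zero matches.
        query = _bag_of_words(task)
        scored = [
            (_cosine(query, _bag_of_words(f"{p.app} {p.ui_layout} {p.function}")), p)
            for p in self._pages
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [page for score, page in scored[:k] if score > 0]


def build_planner_prompt(task: str, pages: List[PageSnapshot]) -> str:
    # Inject the retrieved pages as extra context ahead of the planning request.
    lines = [f"Task: {task}", "Relevant pages from memory:"]
    for p in pages:
        lines.append(f"- [{p.app}] layout: {p.ui_layout}; purpose: {p.function}")
    lines.append("Plan the task as a sequence of sub-tasks.")
    return "\n".join(lines)
```

In this sketch, the string returned by `build_planner_prompt` would be sent to the planner LLM. A realistic system would likely replace the word-count similarity with learned embeddings and persist the snapshots in a real database.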

Yi Kong, Dianxi Shi, Guoli Yang, Zhang ke-di, Chenlin Huang, Xiaopeng Li, Songchang Jin

Subjects: Automation technology and equipment; computing and computer technology

Yi Kong, Dianxi Shi, Guoli Yang, Zhang ke-di, Chenlin Huang, Xiaopeng Li, Songchang Jin. MapAgent: Trajectory-Constructed Memory-Augmented Planning for Mobile Task Automation [EB/OL]. (2025-07-29) [2025-08-11]. https://arxiv.org/abs/2507.21953.
