
FlowReasoner: Reinforcing Query-Level Meta-Agents

Source: arXiv
Abstract

This paper proposes FlowReasoner, a query-level meta-agent that automates the design of query-level multi-agent systems, i.e., one system per user query. Our core idea is to incentivize a reasoning-based meta-agent via external execution feedback. Concretely, by distilling DeepSeek R1, we first endow FlowReasoner with the basic reasoning ability needed to generate multi-agent systems. We then further enhance it via reinforcement learning (RL) with external execution feedback. A multi-purpose reward guides the RL training with respect to performance, complexity, and efficiency. In this manner, FlowReasoner can generate a personalized multi-agent system for each user query via deliberative reasoning. Experiments on both engineering and competition code benchmarks demonstrate the superiority of FlowReasoner. Remarkably, it surpasses o1-mini by 10.52% in accuracy across three benchmarks. The code is available at https://github.com/sail-sg/FlowReasoner.
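As a rough illustration of the multi-purpose reward mentioned in the abstract, the sketch below folds performance, complexity, and efficiency signals into a single scalar. The abstract does not give the reward's form, so everything here is an assumption made for illustration: the `ExecutionFeedback` fields, the linear combination, and the weights are hypothetical and are not taken from the paper or its repository.

```python
from dataclasses import dataclass


@dataclass
class ExecutionFeedback:
    """Hypothetical summary of executing one generated multi-agent system."""
    pass_rate: float    # fraction of test cases passed, in [0, 1] (performance)
    num_agents: int     # number of agents in the system (complexity proxy)
    num_llm_calls: int  # LLM calls made during execution (efficiency proxy)


def multi_purpose_reward(
    fb: ExecutionFeedback,
    w_perf: float = 1.0,
    w_complexity: float = 0.05,
    w_efficiency: float = 0.01,
) -> float:
    """Combine the three aspects into one scalar reward.

    The linear form and the default weights are illustrative assumptions,
    not values reported by the authors.
    """
    return (
        w_perf * fb.pass_rate
        - w_complexity * fb.num_agents
        - w_efficiency * fb.num_llm_calls
    )


# Two candidate systems that solve the query equally well:
simple = ExecutionFeedback(pass_rate=1.0, num_agents=2, num_llm_calls=10)
heavy = ExecutionFeedback(pass_rate=1.0, num_agents=6, num_llm_calls=40)
```

Under these assumed weights, `simple` scores higher than `heavy`, which is the kind of preference such a reward would encode: equal task performance, but a smaller and cheaper system.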

Hongcheng Gao, Yue Liu, Yufei He, Longxu Dou, Chao Du, Zhijie Deng, Bryan Hooi, Min Lin, Tianyu Pang

Computing technology, computer technology

Hongcheng Gao, Yue Liu, Yufei He, Longxu Dou, Chao Du, Zhijie Deng, Bryan Hooi, Min Lin, Tianyu Pang. FlowReasoner: Reinforcing Query-Level Meta-Agents [EB/OL]. (2025-04-21) [2025-05-28]. https://arxiv.org/abs/2504.15257.
