M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning

Source: arXiv
Abstract

Recent advancements in Multimodal Large Language Models (MLLMs), particularly through Reinforcement Learning with Verifiable Rewards (RLVR), have significantly enhanced their reasoning abilities. However, a critical gap persists: these models struggle with dynamic spatial interactions, a capability essential for real-world applications. To bridge this gap, we introduce M2-Reasoning-7B, a model designed to excel in both general and spatial reasoning. Our approach integrates two key innovations: (1) a novel data pipeline that generates 294.2K high-quality data samples (168K for cold-start fine-tuning and 126.2K for RLVR), which feature logically coherent reasoning trajectories and have undergone comprehensive assessment; and (2) a dynamic multi-task training strategy with step-wise optimization to mitigate conflicts between data, and task-specific rewards for delivering tailored incentive signals. This combination of curated data and advanced training allows M2-Reasoning-7B to set a new state-of-the-art (SOTA) across 8 benchmarks, showcasing superior performance in both general and spatial reasoning domains.

Inclusion AI: Fudong Wang, Jiajia Liu, Jingdong Chen, Jun Zhou, Kaixiang Ji, Lixiang Ru, Qingpei Guo, Ruobing Zheng, Tianqi Li, Yi Yuan, Yifan Mao, Yuting Xiao, Ziping Ma

Computing technology, computer technology

Inclusion AI: Fudong Wang, Jiajia Liu, Jingdong Chen, Jun Zhou, Kaixiang Ji, Lixiang Ru, Qingpei Guo, Ruobing Zheng, Tianqi Li, Yi Yuan, Yifan Mao, Yuting Xiao, Ziping Ma. M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning [EB/OL]. (2025-07-11) [2025-07-24]. https://arxiv.org/abs/2507.08306.
