Wireless Agentic AI with Retrieval-Augmented Multimodal Semantic Perception
The rapid development of multimodal AI and Large Language Models (LLMs) has greatly enhanced real-time interaction, decision-making, and collaborative tasks. However, in wireless multi-agent scenarios, limited bandwidth poses significant challenges to exchanging semantically rich multimodal information efficiently. Traditional semantic communication methods, though effective, struggle with redundancy and loss of crucial details. To overcome these challenges, we propose a Retrieval-Augmented Multimodal Semantic Communication (RAMSemCom) framework. RAMSemCom incorporates iterative, retrieval-driven semantic refinement tailored for distributed multi-agent environments, enabling efficient exchange of critical multimodal elements through local caching and selective transmission. Our approach dynamically optimizes retrieval using deep reinforcement learning (DRL) to balance semantic fidelity with bandwidth constraints. A comprehensive case study on multi-agent autonomous driving demonstrates that our DRL-based retrieval strategy significantly improves task completion efficiency and reduces communication overhead compared to baseline methods.
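The abstract describes caching-aware selective transmission steered by a learned retrieval policy. The following is a minimal sketch of that idea, not the authors' implementation: all names (SemanticElement, choose_action, refine_round) are hypothetical, and a greedy fidelity-versus-bandwidth score stands in for the paper's DRL policy.

```python
"""Illustrative sketch of RAMSemCom-style selective transmission.

Assumptions: a sender holds a set of semantic elements, knows which
element keys the receiver has cached, and spends a per-round bit
budget on the most task-relevant uncached elements. A greedy score
replaces the DRL retrieval policy described in the paper.
"""
import random
from dataclasses import dataclass


@dataclass
class SemanticElement:
    key: str              # content identifier (e.g., hash of a visual feature)
    size_bits: int        # cost of transmitting the full element
    fidelity_gain: float  # estimated task-level value to the receiver


def choose_action(elem, receiver_cache, budget_bits, weight=0.5):
    """Greedy stand-in for the DRL policy: trade semantic fidelity
    against bandwidth. Returns 'reference', 'transmit', or 'skip'."""
    if elem.key in receiver_cache:
        return "reference"   # cached at receiver: send a short ID, not the payload
    if elem.size_bits > budget_bits:
        return "skip"        # over budget: defer to a later refinement round
    score = weight * elem.fidelity_gain - (1 - weight) * elem.size_bits / budget_bits
    return "transmit" if score > 0 else "skip"


def refine_round(elements, receiver_cache, budget_bits):
    """One iteration of retrieval-driven semantic refinement: spend the
    bit budget on the highest-value uncached elements first."""
    sent, remaining = [], budget_bits
    for elem in sorted(elements, key=lambda e: -e.fidelity_gain):
        if choose_action(elem, receiver_cache, remaining) == "transmit":
            remaining -= elem.size_bits
            receiver_cache.add(elem.key)  # receiver can reference it next round
            sent.append(elem.key)
    return sent, remaining


if __name__ == "__main__":
    random.seed(0)
    cache = {"lane_marking_07"}  # element the receiver already holds
    elems = [SemanticElement(f"obj_{i}", random.randint(200, 2000), random.random())
             for i in range(8)]
    sent, left = refine_round(elems, cache, budget_bits=4000)
    print("transmitted:", sent, "| bits left:", left)
```

In the paper's setting, the DRL agent would learn the transmit/reference/skip decision from task feedback rather than use this fixed score; the sketch only shows where caching and the bandwidth budget enter the loop.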
Guangyuan Liu, Yinqiu Liu, Ruichen Zhang, Hongyang Du, Dusit Niyato, Zehui Xiong, Sumei Sun, Abbas Jamalipour
Subjects: Wireless Communication and Computing Technology; Computer Technology
Guangyuan Liu, Yinqiu Liu, Ruichen Zhang, Hongyang Du, Dusit Niyato, Zehui Xiong, Sumei Sun, Abbas Jamalipour. Wireless Agentic AI with Retrieval-Augmented Multimodal Semantic Perception [EB/OL]. (2025-05-29) [2025-07-02]. https://arxiv.org/abs/2505.23275