
ENWAR: A RAG-empowered Multi-Modal LLM Framework for Wireless Environment Perception

Source: arXiv
Abstract

Large language models (LLMs) hold significant promise in advancing network management and orchestration in 6G and beyond networks. However, existing LLMs are limited in domain-specific knowledge and their ability to handle multi-modal sensory data, which is critical for real-time situational awareness in dynamic wireless environments. This paper addresses this gap by introducing ENWAR, an ENvironment-aWARe retrieval augmented generation-empowered multi-modal LLM framework. ENWAR seamlessly integrates multi-modal sensory inputs to perceive, interpret, and cognitively process complex wireless environments to provide human-interpretable situational awareness. ENWAR is evaluated on the GPS, LiDAR, and camera modality combinations of the DeepSense6G dataset with state-of-the-art LLMs such as Mistral-7b/8x7b and LLaMa3.1-8/70/405b. Compared to the general and often superficial environmental descriptions of these vanilla LLMs, ENWAR delivers richer spatial analysis, accurately identifies positions, analyzes obstacles, and assesses line-of-sight between vehicles. Results show that ENWAR achieves key performance indicators of up to 70% relevancy, 55% context recall, 80% correctness, and 86% faithfulness, demonstrating its efficacy in multi-modal perception and interpretation.
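The abstract describes retrieving textualized multi-modal sensory context (GPS, LiDAR, camera) and feeding it to an LLM for situational reasoning. The snippet below is a minimal, self-contained sketch of that retrieval-augmented step under stated assumptions: the textualize_* helpers and ToyRetriever are hypothetical illustrations, the bag-of-words embedding is a stand-in for a real embedding model, and the printed prompt stands in for the call to an LLM such as Mistral or LLaMA. It is not the authors' ENWAR implementation.

import math
from collections import Counter

def textualize_gps(lat, lon):
    # Hypothetical helper: render a GPS reading as retrievable text.
    return f"Vehicle GPS position: latitude {lat:.5f}, longitude {lon:.5f}."

def textualize_lidar(num_points, nearest_obstacle_m):
    # Hypothetical helper: summarize a LiDAR scan as text.
    return (f"LiDAR scan with {num_points} points; "
            f"nearest obstacle at {nearest_obstacle_m:.1f} m.")

def textualize_camera(caption):
    # Hypothetical helper: wrap an image caption as text.
    return f"Camera view: {caption}"

def embed(text):
    # Toy bag-of-words embedding; a real system would use a sentence-embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two bag-of-words vectors.
    num = sum(a[t] * b[t] for t in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

class ToyRetriever:
    # Minimal in-memory retriever over textualized sensor snippets.
    def __init__(self):
        self.docs = []

    def add(self, text):
        self.docs.append((text, embed(text)))

    def top_k(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.docs, key=lambda p: cosine(q, p[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

if __name__ == "__main__":
    retriever = ToyRetriever()
    retriever.add(textualize_gps(33.42550, -111.94000))
    retriever.add(textualize_lidar(24000, 7.3))
    retriever.add(textualize_camera("a truck partially blocking the lane ahead"))

    query = "Is there line-of-sight between the two vehicles?"
    context = "\n".join(retriever.top_k(query, k=2))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    print(prompt)  # In an ENWAR-like pipeline, this prompt would be passed to an LLM.

In a full pipeline, the generated answers would then be scored with the abstract's reported indicators (relevancy, context recall, correctness, faithfulness); those metrics are not computed in this sketch.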

Abdulkadir Celik, Ahmed M. Eltawil, Mohamed Y. Selim, Daji Qiao, Ahmad M. Nazar, Asmaa Abdallah

Subjects: Wireless Communication; Communication; Remote Sensing Technology

Abdulkadir Celik, Ahmed M. Eltawil, Mohamed Y. Selim, Daji Qiao, Ahmad M. Nazar, Asmaa Abdallah. ENWAR: A RAG-empowered Multi-Modal LLM Framework for Wireless Environment Perception [EB/OL]. (2024-10-08) [2025-08-02]. https://arxiv.org/abs/2410.18104.
