TrajSceneLLM: A Multimodal Perspective on Semantic GPS Trajectory Analysis

Source: arXiv
Abstract

GPS trajectory data reveals valuable patterns of human mobility and urban dynamics, supporting a variety of spatial applications. However, traditional methods often struggle to extract deep semantic representations and to incorporate contextual map information. We propose TrajSceneLLM, a multimodal perspective for enhancing the semantic understanding of GPS trajectories. The framework integrates visualized map images, which encode spatial context, with textual descriptions generated through LLM reasoning, which capture temporal sequences and movement dynamics. A separate embedding is generated for each modality, and the two are concatenated to produce semantically rich trajectory scene embeddings, which are then paired with a simple MLP classifier. We validate the proposed framework on Travel Mode Identification (TMI), a critical task for analyzing travel choices and understanding mobility behavior. Our experiments show that these embeddings yield a significant performance improvement, highlighting the advantage of our LLM-driven method in capturing deep spatio-temporal dependencies and reducing reliance on handcrafted features. This semantic enhancement holds significant potential for diverse downstream applications and future research in geospatial artificial intelligence. The source code and dataset are publicly available at: https://github.com/februarysea/TrajSceneLLM.
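The fusion step described in the abstract (one embedding per modality, concatenated, then classified by a simple MLP) can be illustrated with a minimal sketch. This is not the authors' implementation: the embedding sizes (16 and 24), the hidden width, the five travel-mode classes, the random untrained weights, and the names `fuse_embeddings` and `TinyMLP` are all hypothetical, and the random vectors only stand in for real map-image and text embeddings.

```python
import math
import random


def fuse_embeddings(image_emb, text_emb):
    """Concatenate the two modality embeddings of each trajectory."""
    if len(image_emb) != len(text_emb):
        raise ValueError("both modalities must cover the same trajectories")
    return [img + txt for img, txt in zip(image_emb, text_emb)]


def random_matrix(rng, rows, cols, scale=0.1):
    """Build a rows x cols matrix of small Gaussian values."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def linear(x, weights, bias):
    """Compute x @ weights + bias for a single input vector x."""
    return [
        sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
        for j in range(len(bias))
    ]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TinyMLP:
    """One-hidden-layer MLP, forward pass only (untrained, illustrative)."""

    def __init__(self, in_dim, hidden_dim, n_classes, seed=0):
        rng = random.Random(seed)
        self.w1 = random_matrix(rng, in_dim, hidden_dim)
        self.b1 = [0.0] * hidden_dim
        self.w2 = random_matrix(rng, hidden_dim, n_classes)
        self.b2 = [0.0] * n_classes

    def predict_proba(self, x):
        hidden = [max(h, 0.0) for h in linear(x, self.w1, self.b1)]  # ReLU
        return softmax(linear(hidden, self.w2, self.b2))


# Placeholder embeddings for 4 trajectories (assumed sizes, random values).
rng = random.Random(42)
image_emb = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(4)]
text_emb = [[rng.gauss(0.0, 1.0) for _ in range(24)] for _ in range(4)]

scene_emb = fuse_embeddings(image_emb, text_emb)  # 4 vectors of length 40
clf = TinyMLP(in_dim=len(scene_emb[0]), hidden_dim=8, n_classes=5)
probs = [clf.predict_proba(vec) for vec in scene_emb]
predicted_mode = [p.index(max(p)) for p in probs]  # class index per trajectory
```

Because the weights are random, the predicted class indices carry no meaning here; the sketch only shows how concatenated scene embeddings would flow into a classifier. In practice the MLP would be trained on labeled trajectories.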

Chunhou Ji, Qiumeng Li

Transportation economics; computing technology and computer technology

Chunhou Ji, Qiumeng Li. TrajSceneLLM: A Multimodal Perspective on Semantic GPS Trajectory Analysis[EB/OL]. (2025-06-19)[2025-07-21]. https://arxiv.org/abs/2506.16401.