A Dataset for Spatiotemporal-Sensitive POI Question Answering
Spatiotemporal relationships are critical in data science, as many prediction and reasoning tasks require analysis across both spatial and temporal dimensions. For instance, navigating an unfamiliar city involves planning itineraries that sequence locations and time cultural experiences. However, existing Question-Answering (QA) datasets contain too few spatiotemporal-sensitive questions, making them inadequate benchmarks for evaluating models' spatiotemporal reasoning capabilities. To address this gap, we introduce POI-QA, a novel spatiotemporal-sensitive QA dataset centered on Points of Interest (POIs), constructed through three key steps: mining and aligning open-source vehicle trajectory data from GAIA with high-precision geographic POI data, rigorous manual validation of noisy spatiotemporal facts, and generating bilingual (Chinese/English) QA pairs that reflect human-understandable spatiotemporal reasoning tasks. Our dataset challenges models to parse complex spatiotemporal dependencies, and evaluations of state-of-the-art multilingual LLMs (e.g., Qwen2.5-7B, Llama3.1-8B) reveal stark limitations: even the top-performing model (Qwen2.5-7B fine-tuned with RAG+LoRA) achieves a top-10 Hit Ratio (HR@10) of only 0.41 on the easiest task, far below human performance at 0.56. This underscores persistent weaknesses in LLMs' ability to perform consistent spatiotemporal reasoning, while highlighting POI-QA as a robust benchmark for advancing algorithms sensitive to spatiotemporal dynamics. The dataset is publicly available at https://www.kaggle.com/ds/7394666.
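The HR@10 figures quoted above refer to a top-k Hit Ratio. As a minimal sketch, assuming the common definition (a question counts as a hit when its ground-truth POI appears among the model's top k ranked candidates), the metric could be computed as follows. The function name `hit_ratio_at_k` and the POI identifiers are hypothetical and chosen for illustration; the paper's exact evaluation protocol is not described in this abstract.

```python
from typing import Hashable, Sequence


def hit_ratio_at_k(
    ranked_predictions: Sequence[Sequence[Hashable]],
    ground_truths: Sequence[Hashable],
    k: int = 10,
) -> float:
    """Fraction of questions whose true answer is in the top-k predictions.

    Assumes one ground-truth POI per question and that each prediction
    list is already sorted from most to least likely.
    """
    if len(ranked_predictions) != len(ground_truths):
        raise ValueError("one prediction list is required per ground truth")
    if not ground_truths:
        return 0.0
    # Count questions whose true POI appears within the first k candidates.
    hits = sum(
        1
        for preds, truth in zip(ranked_predictions, ground_truths)
        if truth in preds[:k]
    )
    return hits / len(ground_truths)


# Hypothetical POI identifiers, for illustration only.
predictions = [["poi_3", "poi_7", "poi_1"], ["poi_2", "poi_9", "poi_4"]]
answers = ["poi_7", "poi_5"]
score = hit_ratio_at_k(predictions, answers, k=10)
```

Under this definition, an HR@10 of 0.41 would mean the correct POI appears in the top ten candidates for 41% of the questions.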
Xiao Han, Dayan Pan, Xiangyu Zhao, Xuyuan Hu, Zhaolin Deng, Xiangjie Kong, Guojiang Shen
Computing technology, computer technology
Xiao Han, Dayan Pan, Xiangyu Zhao, Xuyuan Hu, Zhaolin Deng, Xiangjie Kong, Guojiang Shen. A Dataset for Spatiotemporal-Sensitive POI Question Answering [EB/OL]. (2025-05-16) [2025-06-15]. https://arxiv.org/abs/2505.10928.