Exploring the Capabilities of LLMs for IMU-based Fine-grained Human Activity Understanding

Source: arXiv
Abstract

Human activity recognition (HAR) using inertial measurement units (IMUs) increasingly leverages large language models (LLMs), yet existing approaches focus on coarse activities like walking or running. Our preliminary study indicates that pretrained LLMs fail catastrophically on fine-grained HAR tasks such as air-written letter recognition, achieving only near-random guessing accuracy. In this work, we first bridge this gap for flat-surface writing scenarios: by fine-tuning LLMs with a self-collected dataset and few-shot learning, we achieve up to a 129x improvement on 2D data. To extend this to 3D scenarios, we design an encoder-based pipeline that maps 3D data into 2D equivalents, preserving the spatiotemporal information for robust letter prediction. Our end-to-end pipeline achieves 78% accuracy on word recognition with up to 5 letters in mid-air writing scenarios, establishing LLMs as viable tools for fine-grained HAR.

Lilin Xu, Kaiyuan Hou, Xiaofan Jiang

Subject: Computing technology, computer technology

Lilin Xu, Kaiyuan Hou, Xiaofan Jiang. Exploring the Capabilities of LLMs for IMU-based Fine-grained Human Activity Understanding [EB/OL]. (2025-04-01) [2025-05-22]. https://arxiv.org/abs/2504.02878.
