Large Language Models for Crash Detection in Video: A Survey of Methods, Datasets, and Challenges

Abstract

Crash detection from video feeds is a critical problem in intelligent transportation systems. Recent developments in large language models (LLMs) and vision-language models (VLMs) have transformed how we process, reason about, and summarize multimodal information. This paper surveys recent methods leveraging LLMs for crash detection from video data. We present a structured taxonomy of fusion strategies, summarize key datasets, analyze model architectures, compare performance benchmarks, and discuss ongoing challenges and opportunities. Our review provides a foundation for future research in this fast-growing intersection of video understanding and foundation models.

Authors: Sanjeda Akter, Ibne Farabi Shihab, Anuj Sharma

Subject classification: Integrated Transportation

Recommended citation: Sanjeda Akter, Ibne Farabi Shihab, Anuj Sharma. Large Language Models for Crash Detection in Video: A Survey of Methods, Datasets, and Challenges [EB/OL]. (2025-07-02) [2025-07-24]. https://arxiv.org/abs/2507.02074.
