Video Foundation Models for Animal Behavior Analysis
Computational approaches leveraging computer vision and machine learning have transformed the quantification of animal behavior from video. However, existing methods often rely on task-specific features or models, which struggle to generalize across diverse datasets and tasks. Recent advances in machine learning, particularly the emergence of vision foundation models (large-scale models pre-trained on massive, diverse visual repositories), offer a way to tackle these challenges. Here, we investigate the potential of frozen video foundation models across a range of behavior analysis tasks, including classification, retrieval, and localization. We use a single, frozen model to extract general-purpose representations from video data, and perform extensive evaluations on diverse open-source animal behavior datasets. Our results demonstrate that features from foundation models, with minimal adaptation, achieve competitive performance compared to existing methods specifically designed for each dataset, across species, behaviors, and experimental contexts. This highlights the potential of frozen video foundation models as a powerful and accessible backbone for automated behavior analysis, with the ability to accelerate research across diverse fields from neuroscience to ethology to ecology.
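The abstract does not say which form of "minimal adaptation" is used, so the following is only a rough sketch of one common reading: a linear probe trained on fixed clip embeddings, plus cosine-similarity retrieval over the same embeddings. Everything here is an assumption made for illustration. The function names (`fit_linear_probe`, `predict`, `retrieve`) are hypothetical, the ridge-regression probe is a stand-in rather than the authors' method, and the random `features` array takes the place of embeddings that a real frozen video model would produce.

```python
import numpy as np


def fit_linear_probe(features, labels, num_classes, l2=1e-2):
    """Fit a linear probe by ridge regression onto one-hot labels.

    Only these weights are learned; the (hypothetical) backbone that
    produced `features` stays frozen and is never updated.
    """
    X = np.hstack([features, np.ones((features.shape[0], 1))])  # bias column
    Y = np.eye(num_classes)[labels]                              # one-hot targets
    A = X.T @ X + l2 * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ Y)


def predict(weights, features):
    """Return the highest-scoring class index for each clip embedding."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return np.argmax(X @ weights, axis=1)


def retrieve(query, gallery, k=5):
    """Return indices of the k gallery embeddings most cosine-similar to query."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return np.argsort(-(g @ q))[:k]


# Synthetic stand-in for frozen-model clip embeddings (assumption: one
# fixed-length vector per video clip, one behavior label per clip).
rng = np.random.default_rng(0)
num_classes, dim, clips_per_class = 3, 16, 20
centers = 3.0 * rng.normal(size=(num_classes, dim))
labels = np.repeat(np.arange(num_classes), clips_per_class)
features = centers[labels] + rng.normal(size=(labels.size, dim))

weights = fit_linear_probe(features, labels, num_classes)
accuracy = float(np.mean(predict(weights, features) == labels))
neighbors = retrieve(features[0], features, k=5)
print("training accuracy:", accuracy)
print("nearest clips to clip 0:", neighbors)
```

In this setup the embeddings are computed once and can be reused for classification and retrieval alike, which is the practical appeal of a frozen backbone; localization would additionally need per-frame or per-window embeddings, which this sketch does not cover.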
Liu Ting, Zhao Long, Yuan Liangzhe, Seybold Bryan, Ross David A, Hu Bo, Adam Hartwig, Sun Jennifer J, Hendon David, Zhou Hao, Schroff Florian
Biological science research methods and techniques; computing and computer technology; zoology
Liu Ting, Zhao Long, Yuan Liangzhe, Seybold Bryan, Ross David A, Hu Bo, Adam Hartwig, Sun Jennifer J, Hendon David, Zhou Hao, Schroff Florian. Video Foundation Models for Animal Behavior Analysis [EB/OL]. (2025-03-28) [2025-05-31]. https://www.biorxiv.org/content/10.1101/2024.07.30.605655.