UAVScenes: A Multi-Modal Dataset for UAVs
Multi-modal perception is essential for unmanned aerial vehicle (UAV) operations because it gives a UAV a comprehensive understanding of its surrounding environment. However, most existing multi-modal UAV datasets are biased toward localization and 3D reconstruction tasks, or support only map-level semantic segmentation because they lack frame-wise annotations for both camera images and LiDAR point clouds. This limitation prevents them from being used for high-level scene understanding tasks. To address this gap and advance multi-modal UAV perception, we introduce UAVScenes, a large-scale dataset designed to benchmark various tasks across both 2D and 3D modalities. Our benchmark dataset is built upon the well-calibrated multi-modal UAV dataset MARS-LVIG, originally developed only for simultaneous localization and mapping (SLAM). We enhance this dataset by providing manually labeled semantic annotations for both frame-wise images and LiDAR point clouds, along with accurate 6-degree-of-freedom (6-DoF) poses. These additions enable a wide range of UAV perception tasks, including segmentation, depth estimation, 6-DoF localization, place recognition, and novel view synthesis (NVS). Our dataset is available at https://github.com/sijieaaa/UAVScenes
Sijie Wang, Siqi Li, Yawei Zhang, Shangshu Yu, Shenghai Yuan, Rui She, Quanjiang Guo, JinXuan Zheng, Ong Kang Howe, Leonrich Chandra, Shrivarshann Srijeyan, Aditya Sivadas, Toshan Aggarwal, Heyuan Liu, Hongming Zhang, Chujie Chen, Junyu Jiang, Lihua Xie, Wee Peng Tay
Aviation
Sijie Wang, Siqi Li, Yawei Zhang, Shangshu Yu, Shenghai Yuan, Rui She, Quanjiang Guo, JinXuan Zheng, Ong Kang Howe, Leonrich Chandra, Shrivarshann Srijeyan, Aditya Sivadas, Toshan Aggarwal, Heyuan Liu, Hongming Zhang, Chujie Chen, Junyu Jiang, Lihua Xie, Wee Peng Tay. UAVScenes: A Multi-Modal Dataset for UAVs [EB/OL]. (2025-07-30) [2025-08-14]. https://arxiv.org/abs/2507.22412