"Hi AirStar, Guide Me to the Badminton Court."
"Hi AirStar, Guide Me to the Badminton Court."
Unmanned Aerial Vehicles (UAVs), operating in environments with relatively few obstacles, offer high maneuverability and full three-dimensional mobility. This allows them to rapidly approach objects and perform a wide range of tasks that are often challenging for ground robots, making them ideal for exploration, inspection, aerial imaging, and everyday assistance. In this paper, we introduce AirStar, a UAV-centric embodied platform that turns a UAV into an intelligent aerial assistant: a large language model acts as the cognitive core for environmental understanding, contextual reasoning, and task planning. AirStar accepts natural interaction through voice commands and gestures, removing the need for a remote controller and significantly broadening its user base. It combines geospatial-knowledge-driven long-distance navigation with contextual reasoning for fine-grained short-range control, resulting in an efficient and accurate vision-and-language navigation (VLN) capability. Furthermore, the system offers built-in capabilities such as cross-modal question answering, intelligent filming, and target tracking. With a highly extensible framework, it supports seamless integration of new functionalities, paving the way toward a general-purpose, instruction-driven intelligent UAV agent. The supplementary PPT is available at \href{https://buaa-colalab.github.io/airstar.github.io}{https://buaa-colalab.github.io/airstar.github.io}.
Ziqin Wang, Jinyu Chen, Xiangyi Zheng, Qinan Liao, Linjiang Huang, Si Liu
Aerospace Technology
Ziqin Wang, Jinyu Chen, Xiangyi Zheng, Qinan Liao, Linjiang Huang, Si Liu. "Hi AirStar, Guide Me to the Badminton Court." [EB/OL]. (2025-07-06) [2025-07-25]. https://arxiv.org/abs/2507.04430.