Intelligent UAVs Empowered by Large Foundation Models: Progress, Applications and Perspectives

  • SHAO Dian,
  • TANG Chu,
  • CHANG Min,
  • LIU Li-Ke,
  • WANG Yu-Le,
  • LI Hao,
  • BAI Jun-Qiang
  • 1. Northwestern Polytechnical University
    2. AVIC Chengdu Aircraft Design and Research Institute

Received date: 2025-11-27

  Revised date: 2026-02-10

  Online published: 2026-02-27

Funding

National Natural Science Foundation of China (Young Scientists Fund); Open Fund of the National Key Laboratory of Unmanned Aerial Vehicle Technology

Abstract

Large Foundation Models (LFMs), represented by Large Language Models, Vision-Language Models, and Vision Foundation Models, are driving a new wave of intelligent evolution in Unmanned Aerial Vehicles (UAVs). Following this trend, this paper first summarizes the key characteristics and general capabilities of the relevant models, categorizes the mainstream embodied architectures they drive, and compares the adaptability and trade-offs of these architectures in the highly dynamic, strongly constrained scenarios typical of UAVs. Second, it analyzes how LFMs reshape the four core functional elements of UAVs (perception, planning, control, and interaction) through open-world understanding, task-level semantic planning, embodied reasoning and control, and multimodal interaction. Third, focusing on the high-level cognitive functions enabled by LFMs, it discusses the mechanisms, implementation pathways, technical limitations, and evaluation paradigms of reasoning, memory, reflection, and imagination in complex UAV scenarios. It then summarizes how LFMs empower four typical decision-making tasks, namely vision-language navigation, active target search, semantic logistics delivery, and swarm intelligent coordination, along with the frontier progress in each. Finally, it discusses core challenges and countermeasures in safety risks and protection mechanisms, engineering implementation, and edge deployment, and outlines future directions in efficient foundation intelligence, the perception-to-cognition transition, and the industrial integration of ubiquitous intelligence.

Cite this article

SHAO Dian, TANG Chu, CHANG Min, LIU Li-Ke, WANG Yu-Le, LI Hao, BAI Jun-Qiang. Intelligent UAVs Empowered by Large Foundation Models: Progress, Applications and Perspectives[J]. Acta Aeronautica et Astronautica Sinica, 0: 1-0. DOI: 10.7527/S1000-6893.2026.33148