Xizhen GAO1,2, Liang TANG1,2, Huang HUANG1,2
Received:
2021-12-07
Revised:
2022-01-06
Accepted:
2022-03-24
Online:
2023-03-25
Published:
2022-03-30
Contact:
Xizhen GAO
E-mail: gaoxizhen_qd@126.com
Abstract:
Addressing the demand for fully autonomous manipulation in extraterrestrial exploration missions, this paper discusses the significance of introducing intelligent techniques into extraterrestrial exploration manipulation. Based on the development status and characteristics of extraterrestrial exploration manipulation tasks, the challenges and difficulties facing autonomous manipulation are summarized, and existing manipulation algorithms based on deep reinforcement learning are outlined. Driven by the difficulties of autonomous manipulation tasks in extraterrestrial exploration, the applications and achievements of Deep Reinforcement Learning (DRL) in extraterrestrial exploration manipulation are reviewed and analyzed, and the key technical problems involved in the future development of autonomous intelligent manipulation for extraterrestrial exploration are summarized.
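The manipulation algorithms surveyed build on model-free DRL methods such as DQN[22], DDPG[23], TD3[24], SAC[27], and PPO[32]. As a minimal, illustrative sketch of this family, the Python snippet below runs a DDPG-style actor-critic update loop on a toy arm-reaching task. The ArmReachEnv environment, network sizes, and hyperparameters are hypothetical choices for illustration only (target networks are omitted for brevity); this is not any implementation described in the paper.

```python
# Minimal DDPG-style actor-critic loop (cf. Lillicrap et al., Ref. 23).
# Everything below (ArmReachEnv, network sizes, hyperparameters) is a
# hypothetical illustration; target networks are omitted for brevity.
import numpy as np
import torch
import torch.nn as nn

class ArmReachEnv:
    """Toy stand-in for a sampling-arm reaching task (not a real simulator)."""
    def reset(self):
        self.q = np.zeros(5, dtype=np.float32)
        self.goal = np.random.uniform(-1, 1, 5).astype(np.float32)
        return np.concatenate([self.q, self.goal])

    def step(self, action):
        self.q = np.clip(self.q + 0.1 * action, -np.pi, np.pi).astype(np.float32)
        dist = float(np.linalg.norm(self.q - self.goal))
        return np.concatenate([self.q, self.goal]), -dist, dist < 0.05

env, obs_dim, act_dim, gamma = ArmReachEnv(), 10, 5, 0.99
actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                      nn.Linear(64, act_dim), nn.Tanh())        # a = pi(s) in [-1, 1]
critic = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(),
                       nn.Linear(64, 1))                        # Q(s, a)
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)
buffer = []                                                     # replay buffer

for episode in range(50):
    obs, done = env.reset(), False
    for _ in range(100):
        with torch.no_grad():
            a = actor(torch.as_tensor(obs)).numpy()
        a = np.clip(a + 0.1 * np.random.randn(act_dim), -1, 1).astype(np.float32)
        obs2, r, done = env.step(a)                             # exploration noise added
        buffer.append((obs, a, r, obs2, float(done)))
        obs = obs2
        if len(buffer) >= 64:                                   # one gradient step
            idx = np.random.randint(len(buffer), size=64)
            o, a_b, r_b, o2, d = map(np.array, zip(*[buffer[i] for i in idx]))
            o, o2 = torch.as_tensor(o), torch.as_tensor(o2)
            a_t = torch.as_tensor(a_b)
            r_t = torch.as_tensor(r_b, dtype=torch.float32).unsqueeze(1)
            d_t = torch.as_tensor(d, dtype=torch.float32).unsqueeze(1)
            with torch.no_grad():                               # bootstrapped TD target
                y = r_t + gamma * (1 - d_t) * critic(torch.cat([o2, actor(o2)], 1))
            q = critic(torch.cat([o, a_t], 1))
            loss_c = ((q - y) ** 2).mean()                      # critic: TD error
            opt_c.zero_grad(); loss_c.backward(); opt_c.step()
            loss_a = -critic(torch.cat([o, actor(o)], 1)).mean()  # deterministic PG
            opt_a.zero_grad(); loss_a.backward(); opt_a.step()
        if done:
            break
```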
Xizhen GAO, Liang TANG, Huang HUANG. Deep reinforcement learning in autonomous manipulation for celestial bodies exploration: Applications and challenges[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(6): 26762-026762.
Table 1
Current status of manipulation autonomy in extraterrestrial surface exploration

| Country/Agency | Unmanned surface probe | Traverse distance/km | Mission duration | Autonomous manipulation capability |
|---|---|---|---|---|
| USA | Sojourner | 0.1 | 83 d | Autonomous terrain recognition and collision avoidance using laser stripers |
| | Opportunity/Spirit | 45.16/8.6 | 5 352 sols/2 208 sols | Local autonomous obstacle-avoidance path planning |
| | Curiosity | 20.31 | 2 708 sols | Local autonomous obstacle-avoidance path planning; autonomous recognition of designated science targets; semi-autonomous drilling and analysis of Martian rock |
| | InSight | | | Ground-based planning; in-situ investigation, Martian surface drilling and sample analysis |
| | Perseverance | 2.67 | 248 sols | A degree of intelligent planning and decision-making; intelligent selection of science targets and autonomous choice of sampling method |
| Japan | Hayabusa/Hayabusa2 | | | Ground-based planning; asteroid landing, sampling, and return |
| ESA | Rosetta | | | Ground-based planning; comet landing |
| China | Yutu/Yutu-2 | 0.11/0.4 | 3 months/15 months | Local autonomous obstacle-avoidance path planning; ground-based mission planning and global path planning |
| | Chang’e landers | | | Autonomous hovering-based hazard avoidance |
| | Zhurong | 1.064 | 100 d | Ground-based mission planning and global path planning |
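Several entries in Table 1 list "local autonomous obstacle-avoidance path planning" as the main on-board autonomy. One common realization is graph search over a local hazard map; the sketch below runs A* on a small occupancy grid. The grid contents, start, and goal are hypothetical examples, not flight data.

```python
# A* over a local occupancy grid: one common realization of the "local
# autonomous obstacle-avoidance path planning" listed in Table 1.
# The 5x5 grid, start, and goal below are hypothetical examples.
import heapq

def astar(grid, start, goal):
    """Return a 4-connected shortest path that avoids cells marked 1."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    open_set = [(h(start), 0, start, [start])]               # (f, g, cell, path)
    seen = set()
    while open_set:
        _, g, cur, path = heapq.heappop(open_set)
        if cur == goal:
            return path
        if cur in seen:
            continue
        seen.add(cur)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = cur[0] + dr, cur[1] + dc
            if 0 <= r < rows and 0 <= c < cols and grid[r][c] == 0 and (r, c) not in seen:
                heapq.heappush(open_set, (g + 1 + h((r, c)), g + 1, (r, c), path + [(r, c)]))
    return None  # no obstacle-free path inside this local window

grid = [[0, 0, 0, 1, 0],
        [1, 1, 0, 1, 0],
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]
print(astar(grid, (0, 0), (4, 4)))
```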
Table 2
Typical design schemes of sampling manipulator arms

| Year | Probe | Arm specifications | Mission design |
|---|---|---|---|
| 2003 | Spirit/Opportunity[10-11] | 5 DOF; length about 1 m; mass 4.2 kg; carries 4 science instruments with a total mass of about 2 kg; "shoulder 2 + elbow 1 + wrist 2" configuration: 2 shoulder joints (pitch and yaw), 1 elbow pitch joint, 2 DOF at the end sampling mechanism; actuated by brushed DC motors | The arm carries the Rock Abrasion Tool (RAT), which grinds rock with a wheel bearing 2 diamond-tipped teeth rotating at high speed (300 r/min), removing the weathered surface to expose interior material for study by the rover-borne instruments |
| 2008 | Phoenix[12] | 4 DOF; length 2.35 m; low-mass aluminum-titanium alloy arm; "shoulder 2 + elbow 1 + wrist 1" configuration: 2 shoulder joints (pitch and yaw), 1 elbow pitch joint, 1 DOF at the end sampling mechanism; actuated by brushed DC motors | A scoop at the end digs and collects loose soil, and a scraping blade can scrape samples of icy soil |
| 2012 | Curiosity[13] | 5 DOF; length 2.2 m; mass 67 kg; 5 science instruments (34 kg) mounted at the end; "shoulder 2 + elbow 1 + wrist 2" configuration: 2 shoulder joints (pitch and yaw), 1 elbow pitch joint, 2 DOF at the end sampling mechanism; actuated by brushless DC motors | The arm accurately places the 5 turret-mounted science instruments on target points to acquire samples. After the drill or scoop acquires a sample, it is transferred to the CHIMRA processing unit; through gravity, applied vibration, and arm motion, the solid sample is finally delivered into the science instruments on the rover |
| 2020 | Perseverance[14] | 5 DOF; length 2.2 m; mass 40.1 kg; end turret carries 2 science instruments and 1 camera for close-range imaging; "shoulder 2 + elbow 1 + wrist 2" configuration: 2 shoulder joints (pitch and yaw), 1 elbow pitch joint, 2 DOF at the end sampling mechanism; actuated by brushless DC motors | The end-effector samples mainly by drilling; it can collect subsurface samples to a depth of 1 m and can drill hard rock. Rock samples are collected in sample tubes; the sealed tubes are placed in a sample caching assembly, which is then stored in an orbiting container to be returned to Earth by a follow-on mission |
| 2020 | Chang’e-5[15] | 4 DOF; length 3.6 m; mass 22 kg; payload capacity exceeding 30 kg on the lunar surface; "shoulder 2 + elbow 1 + wrist 1" configuration: 2 shoulder joints (pitch and yaw), 1 elbow pitch joint, 1 DOF at the end sampling mechanism; actuated by brushless DC motors | Comprises 2 samplers with different sampling modes, able to sample shallow lunar regolith in designated areas by scooping, digging, shallow drilling, and other means |
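The "shoulder 2 + elbow 1 + wrist 2" layout recurring in Table 2 chains two shoulder joints (yaw and pitch), one elbow pitch joint, and two distal joints. A minimal forward-kinematics sketch of that chain is given below; the link lengths, axis assignments, and the choice of a wrist roll for the fifth joint are assumptions for illustration, not the geometry of any flight arm.

```python
# Forward-kinematics sketch for the "shoulder 2 + elbow 1 + wrist 2" layout
# recurring in Table 2. Link lengths and axis assignments are hypothetical.
import numpy as np

def rot(axis, theta):
    """4x4 homogeneous rotation about the x, y, or z axis."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.eye(4)
    if axis == "z":
        R[:2, :2] = [[c, -s], [s, c]]
    elif axis == "y":
        R[0, 0], R[0, 2], R[2, 0], R[2, 2] = c, s, -s, c
    else:  # "x"
        R[1, 1], R[1, 2], R[2, 1], R[2, 2] = c, -s, s, c
    return R

def trans(x, y, z):
    """4x4 homogeneous translation (link offset)."""
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T

def fk(q, upper_arm=1.1, forearm=1.1):
    """End-effector pose for q = [shoulder_yaw, shoulder_pitch, elbow_pitch,
    wrist_pitch, wrist_roll] in radians (assumed joint order)."""
    return (rot("z", q[0]) @ rot("y", q[1]) @ trans(upper_arm, 0, 0)
            @ rot("y", q[2]) @ trans(forearm, 0, 0)
            @ rot("y", q[3]) @ rot("x", q[4]))

pose = fk(np.deg2rad([30, -20, 45, -25, 0]))
print(np.round(pose[:3, 3], 3))  # end-effector position in the base frame
```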
1 | GE D T, CUI P Y, ZHU S Y. Recent development of autonomous GNC technologies for small celestial body descent and landing[J]. Progress in Aerospace Sciences, 2019, 110: 100551. |
2 | FRANCIS R, ESTLIN T, DORAN G, et al. AEGIS autonomous targeting for ChemCam on Mars science laboratory: Deployment and results of initial science team use[J]. Science Robotics, 2017, 2(7): 4582. |
3 | TREBI-OLLENNU A, KIM W, ALI K, et al. Insight Mars lander robotics instrument deployment system[J]. Space Science Reviews, 2018, 214(5): 93. |
4 | ZHANG H H, LIANG J, HUANG X Y, et al. Autonomous hazard avoidance control for Chang’e-3 soft landing[J]. Scientia Sinica (Technologica), 2014, 44(6): 559-568 (in Chinese).
5 | REN D P, LI Q, ZHANG Z F, et al. Ground-test validation technologies for Chang’e-5 lunar probe[J]. Scientia Sinica (Technologica), 2021, 51(7): 778-787 (in Chinese).
6 | YU D Y, ZHANG Z, PAN B F, et al. Development and trend of artificial intelligence in deep space exploration[J]. Journal of Deep Space Exploration, 2020, 7(1): 11-23 (in Chinese).
7 | ROTHROCK B, KENNEDY R, CUNNINGHAM C, et al. SPOC: Deep learning-based terrain classification for Mars rover missions: AIAA-2016-5539[R]. Reston: AIAA, 2016. |
8 | WU W R, ZHOU J L, WANG B F, et al. Key technologies in the teleoperation of Chang’e-3 “Jade Rabbit” rover[J]. Scientia Sinica (Informationis), 2014, 44(4): 425-440 (in Chinese).
9 | HU H, PEI Z Y, LI C L, et al. Overall design of unmanned lunar sampling and return project: Chang’e-5 mission[J]. Scientia Sinica (Technologica), 2021, 51(11): 1275-1286 (in Chinese).
10 | TREBI-OLLENNU A, BAUMGARTNER E T, LEGER P C, et al. Robotic arm in situ operations for the Mars Exploration Rovers surface mission[C]∥2005 IEEE International Conference on Systems, Man and Cybernetics. Piscataway: IEEE Press, 2005: 1799-1806. |
11 | BAUMGARTNER E T, BONITZ R G, MELKO J P, et al. The Mars Exploration Rover instrument positioning system[C]∥2005 IEEE Aerospace Conference. Piscataway: IEEE Press, 2005: 1-19. |
12 | BONITZ R, SHIRAISHI L, ROBINSON M, et al. The Phoenix Mars lander robotic arm[C]∥2009 IEEE Aerospace Conference. Piscataway: IEEE Press, 2009: 1-12.
13 | BILLING P, FLEISCHNER C. Mars science laboratory robotic arm[C]∥14th European Space Mechanisms & Tribology Symposium (ESMATS 2011), 2011: 363-370.
14 | MOELLER R C, JANDURA L, ROSETTE K, et al. The Sampling and Caching Subsystem (SCS) for the scientific exploration of Jezero crater by the Mars 2020 Perseverance rover[J]. Space Science Reviews, 2021, 217(1): 1-43.
15 | MA R Q, JIANG S Q, LIU B, et al. Design and verification of a lunar sampling manipulator system[J]. Journal of Astronautics, 2018, 39(12): 1315-1322 (in Chinese).
16 | ROBINSON M, COLLINS C, LEGER P, et al. In-situ operations and planning for the Mars science laboratory robotic arm: The first 200 sols[C]∥2013 8th International Conference on System of Systems Engineering. Piscataway: IEEE Press, 2013: 153-158. |
17 | CALLAS J L. Mars exploration rover spirit end of mission report[R]. Pasadena: Jet Propulsion Laboratory, National Aeronautics and Space Administration, 2015. |
18 | DLR. The InSight mission logbook (February 2019-July 2020) [EB/OL]. (2020-07-07) [2021-09-02]. . |
19 | ABBEY W, ANDERSON R, BEEGLE L, et al. A look back, part II: The drilling campaign of the Curiosity rover during the Mars Science Laboratory’s second and third Martian years[J]. Icarus, 2020, 350: 113885.
20 | NASA. Assessing perseverance’s first sample attempt[EB/OL]. (2021-08-11) [2021-09-02]. . |
21 | SUN C Y, MU C X. Important scientific problems of multi-agent deep reinforcement learning[J]. Acta Automatica Sinica, 2020, 46(7): 1301-1312 (in Chinese).
22 | MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533. |
23 | LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[C]∥4th International Conference on Learning Representations (ICLR 2016). OpenReview.net, 2016: 1-14.
24 | FUJIMOTO S, van HOOF H, MEGER D. Addressing function approximation error in actor-critic methods[DB/OL]. arXiv preprint: 1802.09477, 2018. |
25 | POPOV I, HEESS N, LILLICRAP T, et al. Data-efficient deep reinforcement learning for dexterous manipulation[DB/OL]. arXiv preprint: 1704.03073, 2017. |
26 | FUJIMOTO S, MEGER D, PRECUP D. Off-policy deep reinforcement learning without exploration [DB/OL]. arXiv preprint: 1812.02900, 2018. |
27 | HAARNOJA T, ZHOU A, ABBEEL P, et al. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor [DB/OL]. arXiv preprint: 1801.01290, 2018. |
28 | SCHULMAN J. Optimizing expectations: From deep reinforcement learning to stochastic computation graphs[D]. Berkeley: UC Berkeley, 2016.
29 | SCHULMAN J, MORITZ P, LEVINE S, et al. High-dimensional continuous control using generalized advantage estimation [DB/OL]. arXiv preprint: 1506.02438, 2015. |
30 | DUAN Y, CHEN X, HOUTHOOFT R, et al. Benchmarking deep reinforcement learning for continuous control[C]∥Proceedings of the 33rd International Conference on International Conference on Machine Learning-Volume 48. New York: ACM, 2016: 1329-1338. |
31 | ANTONOVA R, CRUCIANI S, SMITH C, et al. Reinforcement learning for pivoting task [DB/OL]. arXiv preprint: 1703.00472, 2017. |
32 | SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms [DB/OL]. arXiv preprint: 1707.06347, 2017. |
33 | HEESS N, TB D, SRIRAM S, et al. Emergence of locomotion behaviours in rich environments [DB/OL]. arXiv preprint: 1707.02286, 2017. |
34 | PENG X B, ABBEEL P, LEVINE S, et al. DeepMimic: Example-guided deep reinforcement learning of physics-based character skills [DB/OL]. arXiv preprint: 1804.02717, 2018. |
35 | TANG Y H, AGRAWAL S. Discretizing continuous action space for on-policy optimization[C]∥ Proceedings of the AAAI Conference on Artificial Intelligence. Washington, D.C.: AAAI, 2020: 5981-5988. |
36 | YIN M Z, YUE Y G, ZHOU M Y. ARSM: Augment-REINFORCE-swap-merge estimator for gradient backpropagation through categorical variables [DB/OL]. arXiv preprint: 1905.01413, 2019. |
37 | YUE Y G, TANG Y H, YIN M Z, et al. Discrete action on-policy learning with action-value critic [DB/OL]. arXiv preprint: 2002.03534, 2020. |
38 | MNIH V, BADIA A P, MIRZA M, et al. Asynchronous methods for deep reinforcement learning[C]∥ Proceedings of the 33rd International Conference on International Conference on Machine Learning-Volume 48. New York: ACM, 2016: 1928-1937. |
39 | DEISENROTH M P, RASMUSSEN C E. PILCO: A model-based and data-efficient approach to policy search[C]∥ Proceedings of the 28th International Conference on International Conference on Machine Learning. New York: ACM, 2011: 465-472. |
40 | OH J, GUO X X, LEE H, et al. Action-conditional video prediction using deep networks in Atari games[C]∥ Proceedings of the 28th International Conference on Neural Information Processing Systems-Volume 2. New York: ACM, 2015: 2863-2871. |
41 | WATTER M, SPRINGENBERG J T, BOEDECKER J, et al. Embed to control: A locally linear latent dynamics model for control from raw images [DB/OL]. arXiv preprint: 1506.07365, 2015. |
42 | HA D, SCHMIDHUBER J. World models [DB/OL]. arXiv preprint: 1803.10122, 2018. |
43 | LEVINE S, FINN C, DARRELL T, et al. End-to-end training of deep visuomotor policies [DB/OL]. arXiv preprint: 1504.00702, 2015. |
44 | LIU N J, LU T, CAI Y H, et al. A review of robot manipulation skills learning methods[J]. Acta Automatica Sinica, 2019, 45(3): 458-470 (in Chinese).
45 | ZHAO D B, SHAO K, ZHU Y H, et al. Review of deep reinforcement learning and discussions on the development of computer Go[J]. Control Theory & Applications, 2016, 33(6): 701-717 (in Chinese).
46 | GAUDET B, FURFARO R. Adaptive pinpoint and fuel efficient Mars landing using reinforcement learning[J]. IEEE/CAA Journal of Automatica Sinica, 2014, 1(4): 397-411. |
47 | CHENG L, WANG Z B, JIANG F H. Real-time control for fuel-optimal Moon landing based on an interactive deep reinforcement learning algorithm[J]. Astrodynamics, 2019, 3(4): 375-386. |
48 | GAUDET B, LINARES R. Integrated guidance and control for pinpoint Mars landing using reinforcement learning: AAS 18-290[R]. Washington, D.C.: AAS, 2018. |
49 | JIANG X Q. Integrated guidance for Mars entry and powered descent using reinforcement learning and pseudospectral method[J]. Acta Astronautica, 2019, 163: 114-129. |
50 | GAUDET B, LINARES R, FURFARO R. Deep reinforcement learning for six degree-of-freedom planetary powered descent and landing [DB/OL]. arXiv preprint: 1810.08719, 2018. |
51 | GAUDET B, LINARES R, FURFARO R. Deep reinforcement learning for six degree-of-freedom planetary landing[J]. Advances in Space Research, 2020, 65(7): 1723-1741. |
52 | SHIROBOKOV M. Survey of machine learning techniques in spacecraft control design[J]. Acta Astronautica, 2021, 186: 87-97. |
53 | HUANG X X, LI S, YANG B, et al. Spacecraft guidance and control based on artificial intelligence: Review[J]. Acta Aeronautica et Astronautica Sinica, 2021, 42(4): 524201 (in Chinese).
54 | FURFARO R, SCORSOGLIO A, LINARES R, et al. Adaptive generalized ZEM-ZEV feedback guidance for planetary landing via a deep reinforcement learning approach [DB/OL]. arXiv preprint: 2003.02182, 2020. |
55 | FURFARO R. Adaptive generalized ZEM-ZEV feedback guidance for planetary landing via a deep reinforcement learning approach[J]. Acta Astronautica, 2020, 171: 156-171. |
56 | FURFARO R, WIBBEN D R, GAUDET B, et al. Terminal multiple surface sliding guidance for planetary landing: Development, tuning and optimization via reinforcement learning[J]. The Journal of the Astronautical Sciences, 2015, 62(1): 73-99. |
57 | IIYAMA K, TOMITA K, JAGATIA B A, et al. Deep reinforcement learning for safe landing site selection with concurrent consideration of divert maneuvers [DB/OL]. arXiv preprint: 2102.12432, 2021. |
58 | GAUDET B. Terminal adaptive guidance via reinforcement meta-learning: Applications to autonomous asteroid close-proximity operations[J]. Acta Astronautica, 2020, 171: 1-13. |
59 | GAUDET B. Adaptive guidance and integrated navigation with reinforcement meta-learning[J]. Acta Astronautica, 2020, 169: 180-190. |
60 | BATTIN R H. An introduction to the mathematics and methods of astrodynamics[M]. Reston: AIAA, 1999: 10-12. |
61 | D'SOUZA C. An optimal guidance law for planetary landing: AIAA-1997-3709[R]. Reston: AIAA, 1997.
62 | ZHOU S Y, BAI C C. Research on planetary rover path planning method based on deep reinforcement learning[J]. Unmanned Systems Technology, 2019, 2(4): 38-45 (in Chinese).
63 | SERNA J G, VANEGAS F, GONZALEZ F, et al. A review of current approaches for UAV autonomous mission planning for Mars biosignatures detection[C]∥2020 IEEE Aerospace Conference. Piscataway: IEEE Press, 2020: 1-15. |
64 | MCEWEN A S, ELIASON E M, BERGSTROM J W, et al. Mars reconnaissance orbiter’s High Resolution Imaging Science Experiment (HiRISE)[J]. Journal of Geophysical Research, 2007, 112(E5): E05S02. |
65 | TAVALLALI P, KARUMANCHI S, BOWKETT J, et al. A reinforcement learning framework for space missions in unknown environments[C]∥2020 IEEE Aerospace Conference. Piscataway: IEEE Press, 2020: 1-8. |
66 | PFLUEGER M, AGHA A, SUKHATME G S. Rover-IRL: Inverse reinforcement learning with soft value iteration networks for planetary rover path planning[J]. IEEE Robotics and Automation Letters, 2019, 4(2): 1387-1394. |
67 | HUANG Y X, WU S F, MU Z C, et al. A multi-agent reinforcement learning method for swarm robots in space collaborative exploration[C]∥2020 6th International Conference on Control, Automation and Robotics (ICCAR). Piscataway: IEEE Press, 2020: 139-144.
68 | WACHI A, SUI Y N. Safe reinforcement learning in constrained Markov decision processes[C]∥ Proceedings of the 37th International Conference on Machine Learning. New York: ACM, 2020: 9797-9806. |
69 | TURCHETTA M, BERKENKAMP F, KRAUSE A. Safe exploration in finite Markov decision processes with Gaussian processes[C]∥ Proceedings of the 30th International Conference on Neural Information Processing Systems. New York: ACM, 2016: 4312-4320. |
70 | WACHI A, SUI Y N, YUE Y S, et al. Safe exploration and optimization of constrained MDPs using Gaussian processes[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2018, 32(1): 52-58. |
71 | BERNSTEIN D S, ZILBERSTEIN S. Reinforcement learning for weakly-coupled MDPs and an application to planetary rover control[C]∥6th European Conference on Planning. Washington, D.C.: AAAI, 2014: 240-243.
72 | SONG X G, GAO H B, DING L, et al. Diagonal recurrent neural networks for parameters identification of terrain based on wheel-soil interaction analysis[J]. Neural Computing and Applications, 2017, 28(4): 797-804. |
73 | ARREGUIN A L, MONTENEGRO S, DILGER E. Towards in-situ characterization of regolith strength by inverse terramechanics and machine learning: A survey and applications to planetary rovers[J]. Planetary and Space Science, 2021, 204: 105271. |
74 | HUANG H, GAO X Z, TANG L, et al. An end-to-end intelligent grasping method for extraterrestrial exploration samples: CN113524173A[P]. 2021-10-22 (in Chinese).
75 | ZHANG H D, WU J H. Research on robotic bin-packing optimization via pushing based on deep reinforcement learning[J]. Aerospace Control and Application, 2021, 47(6): 52-58 (in Chinese).