ACTA AERONAUTICA ET ASTRONAUTICA SINICA ›› 2020, Vol. 41 ›› Issue (10): 123859-123859. doi: 10.7527/S1000-6893.2020.23859

• Fluid Mechanics and Flight Mechanics •

Implementation of hybrid MPI+OpenMP parallelization on unstructured CFD solver and its applications in massive unsteady simulations

WANG Nianhua1, CHANG Xinghua1, ZHAO Zhong2, ZHANG Laiping1,2   

  1. State Key Laboratory of Aerodynamics, China Aerodynamics Research and Development Center, Mianyang 621000, China;
    2. Computational Aerodynamics Institute, China Aerodynamics Research and Development Center, Mianyang 621000, China
  • Received: 2020-02-02 Revised: 2020-03-10 Published: 2020-03-06
  • Supported by:
    National Key Research and Development Program of China (2016YFB0200701); National Natural Science Foundation of China (11532016, 11672324)

Abstract: In conventional engineering applications, the computational cost of unsteady flow simulations such as store separation is massive, and it grows even larger when higher accuracy is sought by refining the grid or adopting higher-order methods. Unsteady flow simulation is therefore both time-consuming and expensive in CFD engineering practice, and improving its scalability and efficiency is necessary. To exploit the potential of multi-core CPU processors with both distributed and shared memory, the Message Passing Interface (MPI) is adopted for inter-node communication and OpenMP for intra-node shared-memory parallelism. This paper first implements hybrid MPI+OpenMP parallelization, in both coarse-grain and fine-grain forms, in our in-house code HyperFLOW. The Common Research Model (CRM) with about 40 million unstructured grid cells is employed to test the implementation on an in-house cluster. The results show that coarse-grain hybrid parallelization is superior at small scales and reaches its highest efficiency with 16 threads, whereas the fine-grain approach is better suited to large-scale parallelization and reaches its highest efficiency with 8 threads. In addition, unstructured overset grids with 0.36 billion and 2.88 billion cells are generated for the wing/store separation standard model. Reading these massive grids and completing the overset grid assembly takes only dozens of seconds when the peer-to-peer (P2P) grid reading mode and the optimized implicit overset assembly method are adopted. The unsteady store separation process is then simulated and the parallel efficiency is evaluated. With 0.36 billion cells, the parallel efficiency on 12 288 cores is 90% (relative to 768 cores) on the in-house cluster and 70% (relative to 384 cores) on the Tianhe 2 supercomputer. The computed 6-DOF (degree of freedom) trajectories agree well with the experimental data. Finally, for the grid with 2.88 billion cells, parallel efficiency tests with 4.9×10⁴ CPU cores on the in-house cluster show that the parallel efficiency reaches 55.3% (relative to 4 096 cores).
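The abstract distinguishes coarse-grain hybrid parallelization (one OpenMP region per iteration, each thread owning a block of cells) from fine-grain parallelization (OpenMP applied loop by loop inside each MPI rank). The paper gives no source listing, so the C sketch below is only an illustration of where the threading sits in each strategy under assumed data structures (a per-rank array of cell residuals and a dummy residual kernel); it is not the HyperFLOW implementation, and the function names are hypothetical.

/*
 * Minimal sketch of the two hybrid MPI+OpenMP strategies (not HyperFLOW code).
 * Assumption: each MPI rank owns a contiguous array of cell unknowns/residuals.
 * Compile (assumption): mpicc -fopenmp hybrid_sketch.c -o hybrid_sketch
 */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define NCELLS 100000   /* cells per MPI rank (illustrative size only) */

/* Fine-grain: an OpenMP "parallel for" wraps each computational loop. */
static void residual_fine_grain(double *res, const double *q, int n)
{
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; ++i)
        res[i] = 0.5 * q[i];          /* stand-in for a flux/residual kernel */
}

/* Coarse-grain: one parallel region; each thread sweeps its own cell block. */
static void residual_coarse_grain(double *res, const double *q, int n)
{
    #pragma omp parallel
    {
        int tid   = omp_get_thread_num();
        int nth   = omp_get_num_threads();
        int chunk = (n + nth - 1) / nth;
        int lo    = tid * chunk;
        int hi    = (lo + chunk < n) ? lo + chunk : n;
        for (int i = lo; i < hi; ++i)
            res[i] = 0.5 * q[i];
    }
}

int main(int argc, char **argv)
{
    int provided, rank;
    /* MPI handles inter-node communication; FUNNELED means only the master
       thread makes MPI calls (e.g. halo exchanges between OpenMP regions). */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *q   = malloc(NCELLS * sizeof *q);
    double *res = malloc(NCELLS * sizeof *res);
    for (int i = 0; i < NCELLS; ++i) q[i] = 1.0;

    residual_fine_grain(res, q, NCELLS);
    residual_coarse_grain(res, q, NCELLS);

    /* A real solver would exchange halo data via MPI here before the next
       iteration; omitted in this sketch. */
    if (rank == 0)
        printf("each MPI rank uses up to %d OpenMP threads\n",
               omp_get_max_threads());

    free(q);
    free(res);
    MPI_Finalize();
    return 0;
}

In this reading, the coarse-grain variant amortizes the cost of spawning threads over a whole iteration but requires explicit work partitioning, which matches the abstract's observation that it performs best at smaller scales, while the fine-grain variant is simpler to load-balance per loop and scales further.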

Key words: MPI+OpenMP hybrid parallelization, parallel efficiency, computational fluid dynamics, overset grids, unsteady simulation
