[1] 张来平, 贺立新, 刘伟, 等. 基于非结构/混合网格的高阶精度格式研究进展[J]. 力学进展, 2013, 43(2):202-236. ZHANG L P, HE L X, LIU W, et al. Reviews of high-order methods on unstructured and hybrid grid[J]. Advances in Mechanics, 2013, 43(2):202-236(in Chinese).
[2] 周铸, 黄江涛, 黄勇, 等. CFD技术在航空工程领域的应用、挑战与发展[J]. 航空学报, 2017, 38(3):1-25. ZHOU Z, HUANG J T, HUANG Y, et al. CFD tech-nology in aeronautic engineering field:Application, challenge and development[J]. Acta Aeronautica et Astronautica Sinica, 2017, 38(3):1-25(in Chinese).
[3] NIEMEYER K E, SUNG C J. Recent progress and challenges in exploiting graphics processors in computational fluid dynamics[J]. Journal of Supercomputing, 2014, 67(2):528-564.
[4] NVIDIA. CUDA C programming guide 8.0[M]. Santa Clara:NVIDIA Corporation, 2017.
[5] FRIEDRICHS M S, EASTMAN P, VAIDYAN-ATHAN V, et al. Accelerating molecular dynamic simulation on graphics processing units[J]. Journal of Computational Chemistry, 2009, 30(6):864-872.
[6] PAULIN M, MAIRAL J, DOUZE M, et al. Convolutional patch representations for image retrieval:An unsupervised approach[J]. International Journal of Computer Vision, 2017, 121(1):149-168.
[7] KHAJEH-SAEED A, PEROT J B. Computational fluid dynamics simulations using many graphics pro-cessors[J]. Computing in Science & Engineering, 2012, 14(3):10-19.
[8] VU V T, CATS G, WOLTERS L. Graphics pro-cessing unit optimizations for the dynamics of the HIRLAM weather forecast model[J]. Concurrency & Computation Practice & Experience, 2013, 25(10):1376-1393.
[9] MIELIKAINEN J, HUANG B, HUANG H L A, et al. Improved GPU/CUDA based parallel weather and re-search forecast (WRF) single moment 5-class (WSM5) cloud microphysics[J]. IEEE Journal of Selected Topics in Applied Earth Observations & Remote Sensing, 2012, 5(4):1256-1265.
[10] BRANDVIK T, PULLAN G. Acceleration of a 3D Euler solver using commodity graphics hardware[C]//46th AIAA Aerospace Sciences Meeting and Exhibit. Reston, VA:AIAA, 2008.
[11] JACOBSEN D A, THIBAULT J C, SENOCAK I. An MPI-CUDA implementation for massively parallel in-compressible flow computations on multi-GPU clus-ters[C]//48th AIAA Aerospace Sciences Meeting Including the New Horizons Forum and Aerospace Exposition. Reston, VA:AIAA, 2010.
[12] CASTONGUAY P, WILLIAMS D M, VINCENT P E, et al. On the development of a high-order, multi-GPU enabled, compressible viscous flow solver for mixed unstructured grids[C]//20th AIAA Computational Fluid Dynamics Conference. Reston, VA:AIAA, 2011.
[13] EMELYANOV V N, KARPENKO A G, KOZELKOV A S, et al. Analysis of impact of general-purpose graphics processor units in supersonic flow modeling[J]. Acta Astronautica, 2017, 135(7):198-207.
[14] WATKINS J, RO MERO J, JAMESON A. Multi-GPU, implicit time stepping for high-order methods on unstructured grids[C]//46th AIAA Fluid Dynamics Conference. Reston, VA:AIAA, 2016.
[15] AISSA M, VERSTRAETE T, VUIK C. Toward a GPU-aware comparison of explicit and implicit CFD simulations on structured meshes[J]. Computers & Mathematics with Applications, 2017, 74(1):201-217.
[16] 宋慎义, 王彦棡, 刘冰, 等. 基于GPU的非结构网格CFD求解器的设计与优化[J]. 科研信息化技术与应用, 2012, 3(1):30-38. SONG S Y, WANG Y G, LIU B, et al. Design and optimization of an unstructured grid CFD solver based on GPU[J]. E-Science Technology & Application, 2012, 3(1):30-38(in Chinese).
[17] XU C F, DENG X G, ZHANG L L, et al. Collaborat-ing CPU and GPU for large-scale high-order CFD simulations with complex grids on the TianHe-1A supercomputer[J]. Journal of Computational Physics, 2014, 278(23):275-297.
[18] CAO W, XU C F, WANG Z H, et al. CPU/GPU com-puting for a multi-block structured grid based high-order flow solver on a large heterogeneous system[J]. Cluster Computing, 2014, 17(2):255-270.
[19] XU C F, ZHANG L L, DENG X G, et al. Balancing CPU-GPU collaborative high-order CFD simulations on the TianHe-1A supercomputer[C]//IEEE 28th International Parallel & Distributed Processing Symposium. Piscataway, NJ:IEEE, 2014:725-734.
[20] Li D L, XU C F, WANG Y, et al. Parallelizing and optimizing large-scale 3D multi-phase flow simulations on the TianHe-2 supercomputer[J]. Concurrency and Computation:Practice and Experience, 2016, 28(5):1678-1692.
[21] MA W P, LU Z H, ZHANG J. GPU parallelization of unstructured/hybrid grid ALE multigrid unsteady solver for moving body problems[J]. Computers & Fluids, 2015, 110(5):122-135.
[22] 刘枫, 李桦, 田正雨, 等. 基于MPI+CUDA的异构并行可压缩流求解器[J]. 国防科技大学学报, 2014, 36(1):6-10. LIU F, LI H, TIAN Z Y, et al. Heterogeneous parallel compressible flow solver based on MPI+CUDA[J]. Journal of National University of Defense Technology, 2014, 36(1):6-10(in Chinese).
[23] 曹文斌, 李桦, 谢文佳, 等. 应用多GPU的可压缩湍流并行计算[J]. 国防科技大学学报, 2015, 37(3):78-83. CAO W B, LI H, XIE W J, et al. Parallel computing of compressible turbulence using multi-GPU clusters[J]. Journal of National University of Defense Technology, 2015, 37(3):78-83(in Chinese).
[24] BLAZEK J. Computational fluid dynamics:Principles and applications[M]. 3rd ed. Amsterdam:Elsevier, 2015:7-25.
[25] LIOU M S. A sequel to AUSM, part Ⅱ:AUSM+up for all speeds[J]. Journal of Computational Physics, 2006, 214(1):137-170.
[26] VAN LEER B. Towards the ultimate conservative difference scheme. V. A second-order sequel to Go-dunov's method[J]. Journal of Computational Physics, 1997, 32(1):101-136.
[27] 阎超. 计算流体力学方法及应用[M]. 北京:北京航空航天大学出版社, 2006:123-131. YAN C. The application of computational fluid dynamics method[M]. Beijing:Beihang University Press, 2006:123-131(in Chinese).
[28] JAMESON A, SCHMIDT W, TURKEL E. Numerical solution of the Euler equations by finite volume methods using Runge-Kutta time stepping schemes[C]//AIAA 14th Fluid and Plasma Dynamics Conference. Reston, VA:AIAA, 1981.
[29] BAGHAPOUR B, MCCALL A, ROY C J. Multilevel parallelism for CFD codes on heterogeneous plat-forms[C]//46th AIAA Fluid Dynamics Conference. Reston, VA:AIAA, 2016.
[30] XIA Y, LOU J, LUO H, et al. OpenACC acceleration of an unstructured CFD solver based on a reconstructed discontinuous Galerkin method for compressible flows[J]. International Journal for Numerical Methods in Fluids, 2015, 78(3):123-139.
[31] NICKOLLS J. Scalable parallel programming with CUDA introduction[J]. Queue, 2008, 6(2):1-9.
[32] 张兵, 韩景龙. 基于GPU和隐式格式的CFD并行计算方法[J]. 航空学报, 2010, 31(2):249-256. ZHANG B, HAN J L. Parallel computing methods for CFD using a GPU and implicit scheme[J]. Acta Aeronautica et Astronautica Sinica, 2010, 31(2):249-256(in Chinese).
[33] GUAN J, YAN S, JIN J M. An OpenMP-CUDA implementation of multilevel fast multipole algorithm for electromagnetic simulation on multi-GPU computing systems[J]. IEEE Transactions on Antennas & Propagation, 2013, 61(7):3607-3616.
[34] YANG C T, HUANG C L, LIN C F. Hybrid CUDA, OpenMP, and MPI parallel programming on multicore GPU clusters[J]. Computer Physics Communications, 2011, 182(1):266-269.
[35] ALONSO P, CORTINA R, MARTINEZ-ZALDIVARF J, et al. Neville elimination on multi and many-core systems:OpenMP, MPI and CUDA[J]. Journal of Supercomputing, 2011, 58(2):215-225.
[36] REINARTZ B U, HERRMANN C D, BALLMANN J, et al. Aerodynamic performance analysis of a hypersonic inlet isolator using computation and experiment[J]. Journal of Propulsion & Power, 2003, 19(5):868-875.