An Efficient Compressible Multicomponent Flow Solver for Heterogeneous CPU/GPU Architectures

Citation:

F. Wermelinger, B. Hejazialhosseini, P. Hadjidoukas, D. Rossinelli, and P. Koumoutsakos, “An Efficient Compressible Multicomponent Flow Solver for Heterogeneous CPU/GPU Architectures,” in Proceedings of the Platform for Advanced Scientific Computing Conference - PASC '16, 2016.

Abstract:

We present a solver for three-dimensional compressible multicomponent flow based on the compressible Euler equations. The solver is based on a finite volume scheme for structured grids and advances the solution using an explicit Runge-Kutta time stepper. The numerical scheme requires the computation of the flux divergence based on an approximate Riemann problem. The computation of the flux divergence is the most expensive task in the algorithm. Our implementation takes advantage of the compute capabilities of heterogeneous CPU/GPU architectures. The computational problem is organized in subdomains small enough to fit into GPU memory. The compute-intensive stencil scheme is offloaded to the GPU accelerator, while the solution is advanced in time on the CPU. Our method to implement the stencil scheme on the GPU is not limited to applications in fluid dynamics. The performance of our solver was assessed on Piz Daint, an XC30 supercomputer at CSCS. The GPU code is memory-bound and achieves a per-node performance of 462 Gflop/s, outperforming by 3.2× the multicore-based Gordon Bell winning cubism-mpcf solver [16] for the offloaded computation on the same platform. The focus of this work is on the per-node performance of the heterogeneous solver. In addition, we examine the performance of the solver across 4096 compute nodes. We present simulations for the shock-induced collapse of an aligned row of air bubbles submerged in water using 4 billion cells. Results show a final pressure amplification that is 100× stronger than the strength of the initial shock.
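
The numerical pipeline named in the abstract (finite volume cells, an approximate Riemann flux at each face, a flux divergence, and an explicit Runge-Kutta update) can be illustrated with a deliberately small sketch. The code below is not taken from the paper. It assumes a one-dimensional, single-component ideal gas with GAMMA = 1.4, first-order reconstruction, an HLL flux with simple wave-speed bounds, and a two-stage Heun time stepper; the paper's solver is three-dimensional, multicomponent, and offloads the flux divergence to the GPU. All function names here are hypothetical.

```python
import math

GAMMA = 1.4  # assumed ideal-gas ratio of specific heats (not from the paper)


def primitive(u):
    """Convert conserved (rho, rho*v, E) to primitive (rho, v, p)."""
    rho, mom, ene = u
    vel = mom / rho
    p = (GAMMA - 1.0) * (ene - 0.5 * rho * vel * vel)
    return rho, vel, p


def physical_flux(u):
    """Exact Euler flux of a single conserved state."""
    rho, vel, p = primitive(u)
    return (rho * vel, rho * vel * vel + p, (u[2] + p) * vel)


def hll_flux(ul, ur):
    """HLL approximate Riemann flux between a left and a right state."""
    rl, vl, pl = primitive(ul)
    rr, vr, pr = primitive(ur)
    cl = math.sqrt(GAMMA * pl / rl)
    cr = math.sqrt(GAMMA * pr / rr)
    # Clamping to zero lets one formula cover the supersonic cases too.
    sl = min(vl - cl, vr - cr, 0.0)
    sr = max(vl + cl, vr + cr, 0.0)
    fl, fr = physical_flux(ul), physical_flux(ur)
    return tuple(
        (sr * a - sl * b + sl * sr * (qr - ql)) / (sr - sl)
        for a, b, ql, qr in zip(fl, fr, ul, ur)
    )


def flux_divergence(cells, dx):
    """Return -dF/dx for every cell; edge cells are copied outward."""
    padded = [cells[0]] + cells + [cells[-1]]
    faces = [hll_flux(padded[i], padded[i + 1]) for i in range(len(padded) - 1)]
    return [
        tuple(-(faces[i + 1][k] - faces[i][k]) / dx for k in range(3))
        for i in range(len(cells))
    ]


def rk2_step(cells, dx, dt):
    """Advance one time step with Heun's two-stage Runge-Kutta method."""
    rhs1 = flux_divergence(cells, dx)
    stage = [tuple(u[k] + dt * r[k] for k in range(3)) for u, r in zip(cells, rhs1)]
    rhs2 = flux_divergence(stage, dx)
    return [
        tuple(0.5 * (u[k] + s[k] + dt * r[k]) for k in range(3))
        for u, s, r in zip(cells, stage, rhs2)
    ]


def sod_tube(n=100, t_end=0.1, cfl=0.4):
    """Run the classic Sod shock tube on [0, 1] and return the final cells."""
    dx = 1.0 / n
    cells = []
    for i in range(n):
        rho, p = (1.0, 1.0) if (i + 0.5) * dx < 0.5 else (0.125, 0.1)
        cells.append((rho, 0.0, p / (GAMMA - 1.0)))
    t = 0.0
    while t < t_end:
        # CFL-limited step from the fastest signal speed on the grid.
        smax = max(
            abs(v) + math.sqrt(GAMMA * p / r) for r, v, p in map(primitive, cells)
        )
        dt = min(cfl * dx / smax, t_end - t)
        cells = rk2_step(cells, dx, dt)
        t += dt
    return cells
```

In this sketch, flux_divergence plays the role of the expensive stencil computation that the abstract describes as offloaded to the GPU, while rk2_step corresponds to the time advancement that stays on the CPU. Calling sod_tube() returns a list of conserved-state tuples (density, momentum, total energy) that can be converted with primitive for inspection.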


Last updated on 08/31/2021