Wavelet Particle Methods

We combine the adaptivity of Lagrangian particle methods with the multiresolution capabilities of wavelets. The goal is to achieve high accuracy while minimizing the number of computational elements. We rely on the remeshing of particle methods to introduce multiresolution capabilities in a systematic framework, and we use grids of variable resolution as dictated by the wavelet analysis of the fields carried by the particles. The key computational challenge for this formulation lies in the processing and communication patterns between the computational elements.
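As a rough illustration of how a wavelet analysis can dictate resolution, the sketch below computes the detail coefficients of one level of a 4th-order interpolating wavelet transform in 1D and flags the locations where they exceed a threshold. The function names, the threshold `eps`, and the restriction to 1D interior points are assumptions made for this example; this is not the MRAG implementation.

```python
import numpy as np

def interpolating_wavelet_details(f):
    """Detail coefficients of one level of a 4th-order interpolating wavelet.

    f holds samples on the fine grid and has odd length 2n + 1 (n >= 3), so
    both end points belong to the coarse grid. Only interior odd points,
    where the 4-point stencil fits, are returned.
    """
    f = np.asarray(f, dtype=float)
    coarse = f[::2]   # even samples are kept on the coarse grid
    odd = f[1::2]     # odd samples are predicted from the coarse grid
    # Cubic (4-point) prediction at the midpoint between two coarse samples,
    # using the two coarse neighbours on each side.
    pred = (-coarse[:-3] + 9.0 * coarse[1:-2]
            + 9.0 * coarse[2:-1] - coarse[3:]) / 16.0
    # The detail is what the coarse grid fails to predict.
    return odd[1:-1] - pred

def refinement_flags(f, eps):
    """True where the detail magnitude exceeds eps (hypothetical criterion)."""
    return np.abs(interpolating_wavelet_details(f)) > eps

if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 65)
    flags = refinement_flags(np.tanh(50.0 * x), eps=1e-3)
    print("points flagged for refinement:", int(flags.sum()), "of", flags.size)
```

Smooth regions produce small details and can be represented on a coarse grid, while sharp features produce large details and are kept at fine resolution. Cubic polynomials are predicted exactly, which is what makes this stencil 4th order.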

Solution (isosurface) of 3D passive advection of a scalar field using MRAG. Every block has the same number of points, in this case 14x14x14. Here we used 4th-order interpolating wavelets.

Multi-Resolution Adaptive Grids (MRAG)

We are developing a framework, MRAG, for particle methods that rely on adaptive grids in order to maintain their regularity and to introduce multiresolution capabilities. The goal of MRAG is to provide an intuitive and powerful set of tools for the extensive use of adapted grids. The MRAG user delegates the multi-resolution machinery to MRAG and concentrates on the simulation itself.
Objectives of this effort include:

  • exploit current hardware technology: MRAG should be able to run in parallel on a cluster, exploit multithreading (multicore CPUs) and take advantage of massively multithreaded hardware (GPUs).
  • provide mappings between multi-resolution grids and multi-resolution particles.
  • support time refinement (TR), i.e. exploit the fact that small scales in space can have small scales in time (multi-level time integration).
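
The idea behind time refinement can be sketched as follows, assuming that the grid spacing halves with each refinement level and that the stable time step scales linearly with the spacing (a CFL-type restriction). The function name and the `ratio` parameter are hypothetical and serve only to illustrate the bookkeeping.

```python
def local_time_steps(levels, dt_coarsest, ratio=2):
    """Map each refinement level to (time step, substeps per coarse step).

    The coarsest level present takes one step of size dt_coarsest. Each finer
    level uses a step smaller by `ratio` and takes `ratio` times more
    substeps, so all levels reach the same time after one coarse step.
    """
    base = min(levels)
    schedule = {}
    for level in sorted(set(levels)):
        substeps = ratio ** (level - base)
        schedule[level] = (dt_coarsest / substeps, substeps)
    return schedule

if __name__ == "__main__":
    for level, (dt, n) in local_time_steps([2, 3, 5], 1.0).items():
        print(f"level {level}: dt = {dt}, substeps = {n}")
```

With a single global time step, every block would have to advance with the smallest step of the finest level. With the schedule above, coarse blocks take fewer, larger steps.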

MRAG, Multicores and GPUs

Currently MRAG can run generic 2D/3D grid-based simulations that exploit multithreading and GPUs, and it supports TR. To validate these features, we wrote 2D/3D client codes that simulate passive scalar advection using an analytical velocity field, level set reinitialization, diffusion and normal growth of a level set.

Two dimensional advection of a level set starting from a circle. Approximation using an upwind scheme (left) and a WENO 3rd order scheme (right).
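
For reference, a first-order upwind scheme of the kind compared above can be sketched in a few lines. This is a simplified 1D version on a uniform periodic grid with constant velocity, not the 2D multiresolution client code, and the function name and parameters are chosen for this example only.

```python
import numpy as np

def upwind_advect(phi, u, dx, dt, steps):
    """Advance d(phi)/dt + u d(phi)/dx = 0 with first-order upwinding.

    The grid is uniform and periodic, and u is a constant velocity.
    """
    phi = np.array(phi, dtype=float)
    c = u * dt / dx  # Courant number
    if abs(c) > 1.0:
        raise ValueError("CFL condition violated: |u| dt / dx must be <= 1")
    for _ in range(steps):
        if u >= 0:
            # Information travels to the right: use the left neighbour.
            phi = phi - c * (phi - np.roll(phi, 1))
        else:
            # Information travels to the left: use the right neighbour.
            phi = phi - c * (np.roll(phi, -1) - phi)
    return phi

if __name__ == "__main__":
    n = 64
    dx = 1.0 / n
    x = np.arange(n) * dx
    phi0 = np.abs(x - 0.5) - 0.25  # 1D signed distance to an interval
    phi = upwind_advect(phi0, u=1.0, dx=dx, dt=0.5 * dx, steps=40)
    print("extrema before:", phi0.min(), phi0.max())
    print("extrema after: ", phi.min(), phi.max())
```

The upwind scheme is monotone but diffusive, so the kinks of the signed-distance profile are smeared over time. Higher-order schemes such as WENO reduce this smearing.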

The current implementation of MRAG enables simulations of advection of passive scalars and level sets, level set reinitialization, diffusion and compressible flows with complex boundaries.

People: Diego Rossinelli, Michael Bergdorf, Babak Hejazialhosseini

Funding: CO-ME, ETH Zurich

Publications

2015

  • D. Rossinelli, B. Hejazialhosseini, W. van Rees, M. Gazzola, M. Bergdorf, and P. Koumoutsakos, “MRAG-i2d: multi-resolution adapted grids for remeshed vortex methods on multicore architectures,” J. Comput. Phys., vol. 288, p. 1–18, 2015.

BibTeX

@article{rossinelli2015a,
author = {Diego Rossinelli and Babak Hejazialhosseini and Wim van Rees and Mattia Gazzola and Michael Bergdorf and Petros Koumoutsakos},
doi = {10.1016/j.jcp.2015.01.035},
journal = {{J. Comput. Phys.}},
month = {may},
pages = {1--18},
publisher = {Elsevier {BV}},
title = {{MRAG}-I2D: Multi-resolution adapted grids for remeshed vortex methods on multicore architectures},
url = {https://cse-lab.seas.harvard.edu/files/cse-lab/files/research_numerics_wpm_2015_rossinelli2015a.pdf},
volume = {288},
year = {2015}
}

Abstract

We present MRAG-I2D, an open source software framework, for multiresolution simulations of two-dimensional, incompressible, viscous flows on multicore architectures. The spatiotemporal scales of the flow field are captured by remeshed vortex methods enhanced by high order average-interpolating wavelets and local time-stepping. The multiresolution solver of the Poisson equation relies on the development of a novel, tree-based multipole method. MRAG-I2D implements a number of HPC strategies to map efficiently the irregular computational workload of wavelet-adapted grids on multicore nodes. The capabilities of the present software are compared to the current state-of-the-art in terms of accuracy, compression rates and time-to-solution. Benchmarks include the inviscid evolution of an elliptical vortex, flow past an impulsively started cylinder at Re = 40 – 40 000 and simulations of self-propelled anguilliform swimmers. The results indicate that the present software has the same or better accuracy than state-of-the-art solvers while it exhibits unprecedented performance in terms of time-to-solution.

2011

  • D. Rossinelli, B. Hejazialhosseini, D. G. Spampinato, and P. Koumoutsakos, “Multicore/multi-GPU accelerated simulations of multiphase compressible flows using wavelet adapted grids,” SIAM J. Sci. Comput., vol. 33, iss. 2, p. 512–540, 2011.

BibTeX

@article{rossinelli2011b,
author = {Diego Rossinelli and Babak Hejazialhosseini and Daniele G. Spampinato and Petros Koumoutsakos},
doi = {10.1137/100795930},
journal = {{SIAM J. Sci. Comput.}},
month = {jan},
number = {2},
pages = {512--540},
publisher = {Society for Industrial {\&} Applied Mathematics ({SIAM})},
title = {Multicore/Multi-{GPU} Accelerated Simulations of Multiphase Compressible Flows Using Wavelet Adapted Grids},
url = {https://cse-lab.seas.harvard.edu/files/cse-lab/files/research_numerics_wpm_2011_rossinelli2011b.pdf},
volume = {33},
year = {2011}
}

Abstract

We present a computational method of coupling average interpolating wavelets with high-order finite volume schemes and its implementation on heterogeneous computer architectures for the simulation of multiphase compressible flows. The method is implemented to take advantage of the parallel computing capabilities of emerging heterogeneous multicore/multi-GPU architectures. A highly efficient parallel implementation is achieved by introducing the concept of wavelet blocks, exploiting the task-based parallelism for CPU cores, and by managing asynchronously an array of GPUs by means of OpenCL. We investigate the comparative accuracy of the GPU and CPU based simulations and analyze their discrepancy for two-dimensional simulations of shock-bubble interaction and Richtmyer–Meshkov instability. The results indicate that the accuracy of the GPU/CPU heterogeneous solver is competitive with the one that uses exclusively the CPU cores. We report the performance improvements by employing up to 12 cores and 6 GPUs compared to the single-core execution. For the simulation of the shock-bubble interaction at Mach 3 with two million grid points, we observe a 100-fold speedup for the heterogeneous part and an overall speedup of 34.

2010

  • B. Hejazialhosseini, D. Rossinelli, M. Bergdorf, and P. Koumoutsakos, “High order finite volume methods on wavelet-adapted grids with local time-stepping on multicore architectures for the simulation of shock-bubble interactions,” J. Comput. Phys., vol. 229, iss. 22, p. 8364–8383, 2010.

BibTeX

@article{hejazialhosseini2010a,
author = {Babak Hejazialhosseini and Diego Rossinelli and Michael Bergdorf and Petros Koumoutsakos},
doi = {10.1016/j.jcp.2010.07.021},
journal = {{J. Comput. Phys.}},
month = {nov},
number = {22},
pages = {8364--8383},
publisher = {Elsevier {BV}},
title = {High order finite volume methods on wavelet-adapted grids with local time-stepping on multicore architectures for the simulation of shock-bubble interactions},
url = {https://cse-lab.seas.harvard.edu/files/cse-lab/files/research_numerics_wpm_2010_1_hejazialhosseini2010a.pdf},
volume = {229},
year = {2010}
}

Abstract

We present a space–time adaptive solver for single- and multi-phase compressible flows that couples average interpolating wavelets with high-order finite volume schemes. The solver introduces the concept of wavelet blocks, handles large jumps in resolution and employs local time-stepping for efficient time integration. We demonstrate that the inherently sequential wavelet-based adaptivity can be implemented efficiently in multicore computer architectures using task-based parallelism and introducing the concept of wavelet blocks. We validate our computational method on a number of benchmark problems and we present simulations of shock-bubble interaction at different Mach numbers, demonstrating the accuracy and computational performance of the method.

  • D. Rossinelli, B. Hejazialhosseini, M. Bergdorf, and P. Koumoutsakos, “Wavelet-adaptive solvers on multi-core architectures for the simulation of complex systems,” Concurr. Comp.-Pract. E., vol. 23, iss. 2, p. 172–186, 2010.

BibTeX

@article{rossinelli2010b,
author = {Diego Rossinelli and Babak Hejazialhosseini and Michael Bergdorf and Petros Koumoutsakos},
doi = {10.1002/cpe.1639},
journal = {{Concurr. Comp.-Pract. E.}},
month = {aug},
number = {2},
pages = {172--186},
publisher = {Wiley-Blackwell},
title = {Wavelet-adaptive solvers on multi-core architectures for the simulation of complex systems},
url = {https://cse-lab.seas.harvard.edu/files/cse-lab/files/research_numerics_wpm_2010_2_rossinelli2010b.pdf},
volume = {23},
year = {2010}
}

Abstract

We build wavelet-based adaptive numerical methods for the simulation of advection-dominated flows that develop multiple spatial scales, with an emphasis on fluid mechanics problems. Wavelet-based adaptivity is inherently sequential and in this work we demonstrate that these numerical methods can be implemented in software that is capable of harnessing the capabilities of multi-core architectures while maintaining their computational efficiency. Recent designs in frameworks for multi-core software development allow us to rethink parallelism as task-based, where parallel tasks are specified and automatically mapped onto physical threads. This way of exposing parallelism enables the parallelization of algorithms that were considered inherently sequential, such as wavelet-based adaptive simulations. In this paper we present a framework that combines wavelet-based adaptivity with the task-based parallelism. We demonstrate the promising performance obtained by simulating various physical systems on different multi-core architectures using up to 16 cores.
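
The task-based view of parallelism described in this abstract can be illustrated with a minimal sketch in which each grid block becomes one task and a pool maps the tasks onto threads. Python's standard `concurrent.futures` thread pool is used here purely as a stand-in for the task-based frameworks the abstract refers to, and `process_block` is a hypothetical placeholder kernel.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def process_block(block):
    """Placeholder per-block kernel: one explicit smoothing step.

    End points are left unchanged, and each block is independent of the
    others, so blocks can be processed in any order.
    """
    out = block.copy()
    out[1:-1] = block[1:-1] + 0.25 * (block[:-2] - 2.0 * block[1:-1] + block[2:])
    return out

def run_tasks(blocks, workers=4):
    """Submit one task per block; the pool maps tasks onto threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the order of the input blocks.
        return list(pool.map(process_block, blocks))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Blocks of different sizes mimic an irregular, adapted workload.
    blocks = [rng.random(size) for size in (16, 64, 32, 128, 8)]
    results = run_tasks(blocks)
    print([r.shape for r in results])
```

The program only states what the tasks are. How they are distributed over threads is left to the pool, which helps when the amount of work per block is uneven.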