Publications

2013
P. Koumoutsakos, I. Pivkin, and F. Milde, “The fluid mechanics of cancer and its therapy,” Annual Review of Fluid Mechanics, vol. 45, no. 1, pp. 325–355, 2013. Publisher's VersionAbstract
Fluid mechanics is involved in the growth, progression, metastasis, and therapy of cancer. Blood vessels transport oxygen and nutrients to cancerous tissues, provide a route for metastasizing cancer cells to distant organs, and deliver drugs to tumors. The irregular and leaky tumor vasculature is responsible for increased interstitial pressure in the tumor microenvironment, whereas multiscale flow-structure interaction processes control tumor growth, metastasis, and nanoparticle-mediated drug delivery. We outline these flow-mediated processes, along with related experimental and computational methods for the diagnosis, predictive modeling, and therapy of cancer.
The fluid mechanics of cancer and its therapy
2012
B. Hejazialhosseini, D. Rossinelli, C. Conti, and P. Koumoutsakos, “High throughput software for direct numerical simulations of compressible two-phase flows,” in International Conference for High Performance Computing, Networking, Storage and Analysis – SC '12, 2012. Publisher's VersionAbstract
We present an open source, object-oriented software for high throughput Direct Numerical Simulations of compressible, two-phase flows. The Navier-Stokes equations are discretized on uniform grids using high order finite volume methods. The software exploits recent CPU micro-architectures by explicit vectorization and adopts NUMA-aware techniques as well as data and computation reordering. We report a compressible flow solver with unprecedented fractions of peak performance: 45% of the peak for a single node (nominal performance of 840 GFLOP/s) and 30% for a cluster of 47'000 cores (nominal performance of 0.8 PFLOP/s). We suggest that the present work may serve as a performance upper bound, regarding achievable GFLOP/s, for two-phase flow solvers using adaptive mesh refinement. The software enables 3D simulations of shock-bubble interaction including, for the first time, effects of diffusion and surface tension, by efficiently employing two hundred billion computational elements.
F. Milde, D. Franco, A. Ferrari, V. Kurtcuoglu, D. Poulikakos, and P. Koumoutsakos, “Cell Image Velocimetry (CIV): boosting the automated quantification of cell migration in wound healing assays,” Integr. Biol. vol. 4, no. 11, pp. 1437, 2012. Publisher's VersionAbstract
Cell migration is commonly quantified by tracking the speed of the cell layer interface in wound healing assays. This quantification is often hampered by low signal to noise ratio, in particular when complex substrates are employed to emulate in vivo cell migration in geometrically complex environments. Moreover, information about the cell motion, readily available inside the migrating cell layers, is not usually harvested. We introduce Cell Image Velocimetry (CIV), a combination of cell layer segmentation and image velocimetry algorithms, to drastically enhance the quantification of cell migration by wound healing assays. The resulting software analyses the speed of the interface as well as the detailed velocity field inside the cell layers in an automated fashion. CIV is shown to be highly robust for images with low signal to noise ratio, low contrast and frame shifting and it is portable across various experimental settings. The modular design and parametrization of CIV is not restricted to wound healing assays and allows for the exploration and quantification of flow phenomena in any optical microscopy dataset. Here, we demonstrate the capabilities of CIV in wound healing assays over topographically engineered surfaces and quantify the relative merits of differently aligned gratings on cell migration.
J. H. Walther, M. Praprotnik, E. M. Kotsalis, and P. Koumoutsakos, “Multiscale simulation of water flow past a C540 fullerene,” J. Comput. Phys. vol. 231, no. 7, pp. 2677–2681, 2012. Publisher's VersionAbstract
We present a novel, three-dimensional, multiscale algorithm for simulations of water flow past a fullerene. We employ the Schwarz alternating overlapping domain method to couple molecular dynamics (MD) of liquid water around the C540 buckyball with a Lattice–Boltzmann (LB) description for the Navier–Stokes equations. The proposed method links the MD and LB domains using a fully three-dimensional interface and coupling of velocity gradients. The present overlapping domain method implicitly preserves the flux of mass and momentum and bridges flux-based and Schwarz domain decomposition algorithms. We use this method to determine the slip length and hydrodynamic radius for water flow past a buckyball.
M. Paolucci, et al., “Towards a living earth simulator,” Eur. Phys. J.-Spec. Top. vol. 214, no. 1, pp. 77–108, 2012. Publisher's VersionAbstract
The Living Earth Simulator (LES) is one of the core components of the FuturICT architecture. It will work as a federation of methods, tools, techniques and facilities supporting all of the FuturICT simulation-related activities to allow and encourage interactive exploration and understanding of societal issues. Society-relevant problems will be targeted by leaning on approaches based on complex systems theories and data science in tight interaction with the other components of FuturICT. The LES will evaluate and provide answers to real-world questions by taking into account multiple scenarios. It will build on present approaches such as agent-based simulation and modeling, multiscale modelling, statistical inference, and data mining, moving beyond disciplinary borders to achieve a new perspective on complex social systems.
W. M. van Rees, F. Hussain, and P. Koumoutsakos, “Vortex tube reconnection at Re = 10^4,” Phys. Fluids, vol. 24, no. 7, pp. 075105, 2012. Publisher's VersionAbstract
We present simulations of the long-time dynamics of two anti-parallel vortex tubes with and without initial axial flow, at Reynolds number Re = \Gamma/ν = 104. Simulations were performed in a periodic domain with a remeshed vortex method using 785 × 106 particles. We quantify the vortex dynamics of the primary vortex reconnection that leads to the formation of elliptical rings with axial flow and report for the first time a subsequent collision of these rings. In the absence of initial axial flow, a -5/3 slope of the energy spectrum is observed during the first reconnection of the tubes. The resulting elliptical vortex rings experience a coiling of their vortex lines imparting an axial flow inside their cores. These rings eventually collide, exhibiting a -7/3 slope of the energy spectrum. Studies of vortex reconnection with an initial axial flow exhibit also the -7/3 slope during the initial collision as well as in the subsequent collision of the ensuing elliptical vortex rings. We quantify the detailed vortex dynamics of these collisions and examine the role of axial flow in the breakup of vortex structures.
P. Angelikopoulos, C. Papadimitriou, and P. Koumoutsakos, “Bayesian uncertainty quantification and propagation in molecular dynamics simulations: A high performance computing framework,” J. Chem. Phys. vol. 137, no. 14, pp. 144103, 2012. Publisher's VersionAbstract
ent a Bayesian probabilistic framework for quantifying and propagating the uncertainties in the parameters of force fields employed in molecular dynamics (MD) simulations. We propose a highly parallel implementation of the transitional Markov chain Monte Carlo for populating the posterior probability distribution of the MD force-field parameters. Efficient scheduling algorithms are proposed to handle the MD model runs and to distribute the computations in clusters with heterogeneous architectures. Furthermore, adaptive surrogate models are proposed in order to reduce the computational cost associated with the large number of MD model runs. The effectiveness and computational efficiency of the proposed Bayesian framework is demonstrated in MD simulations of liquid and gaseous argon.
M. Gazzola, W. M. van Rees, and P. Koumoutsakos, “C-start: optimal start of larval fish,” J. Fluid Mech. vol. 698, pp. 5–18, 2012. Publisher's VersionAbstract
We investigate the C-start escape response of larval fish by combining flow simulations using remeshed vortex methods with an evolutionary optimization. We test the hypothesis of the optimality of C-start of larval fish by simulations of larval-shaped, two- and three-dimensional self-propelled swimmers. We optimize for the distance travelled by the swimmer during its initial bout, bounding the shape deformation based on the larval mid-line curvature values observed experimentally. The best motions identified within these bounds are in good agreement with in vivo experiments and show that C-starts do indeed maximize escape distances. Furthermore we found that motions with curvatures beyond the ones experimentally observed for larval fish may result in even larger escape distances. We analyse the flow field and find that the effectiveness of the C-start escape relies on the ability of pronounced C-bent body configurations to trap and accelerate large volumes of fluid, which in turn correlates with large accelerations of the swimmer.
M. Gazzola, C. Mimeau, A. A. Tchieu, and P. Koumoutsakos, “Flow mediated interactions between two cylinders at finite Re numbers,” Phys. Fluids, vol. 24, no. 4, pp. 043103, 2012. Publisher's VersionAbstract
We present simulations of two interacting moving cylinders immersed in a twodimensional incompressible, viscous flow. Simulations are performed by coupling a wavelet-adapted, remeshed vortex method with the Brinkman penalization and projection approach. This method is validated on benchmark problems and applied to simulations of a master-slave pair of cylinders. The master cylinder's motion is imposed and the slave cylinder is let free to respond to the flow.We study the relative role of viscous and inertia effects in the cylinders interactions and identify related sharp transitions in the response of the slave. The observed differences in the behavior of cylinders with respect to corresponding potential flow simulations are discussed. In addition, it is observed that in certain situations the finite size of the slave cylinders enhances the transport so that the cylinders are advected more effectively than passive tracers placed, respectively, at the same starting position.
C. Conti, D. Rossinelli, and P. Koumoutsakos, “GPU and APU computations of Finite Time Lyapunov Exponent fields,” J. Comput. Phys. vol. 231, no. 5, pp. 2229–2244, 2012. Publisher's VersionAbstract
We present GPU and APU accelerated computations of Finite-Time Lyapunov Exponent (FTLE) fields. The calculation of FTLEs is a computationally intensive process, as in order to obtain the sharp ridges associated with the Lagrangian Coherent Structures an extensive resampling of the flow field is required. The computational performance of this resampling is limited by the memory bandwidth of the underlying computer architecture. The present technique harnesses data-parallel execution of many-core architectures and relies on fast and accurate evaluations of moment conserving functions for the mesh to particle interpolations. We demonstrate how the computation of FTLEs can be efficiently performed on a GPU and on an APU through OpenCL and we report over one order of magnitude improvements over multi-threaded executions in FTLE computations of bluff body flows.
S. Haider, et al., “PIV study of the effect of piston position on the in-cylinder swirling flow during the scavenging process in large two-stroke marine diesel engines,” Journal of Marine Science and Technology, vol. 18, no. 1, pp. 133–143, 2012. Publisher's VersionAbstract
A simplified model of a low speed large two- stroke marine diesel engine cylinder is developed. The effect of piston position on the in-cylinder swirling flow during the scavenging process is studied using the stereo- scopic particle image velocimetry technique. The mea- surements are conducted at different cross-sectional planes along the cylinder length and at piston positions covering the air intake port by 0, 25, 50 and 75 %. When the intake port is fully open, the tangential velocity profile is similar to a Burgers vortex, whereas the axial velocity has a wake- like profile. Due to internal wall friction, the swirl decays downstream, and the size of the vortex core increases. For increasing port closures, the tangential velocity profile changes from a Burgers vortex to a forced vortex, and the axial velocity changes correspondingly from a wake-like profile to a jet-like profile. For piston position with 75 % intake port closure, the jet-like axial velocity profile at a cross-sectional plane close to the intake port changes back to a wake-like profile at the adjacent downstream cross-sectional plane. This is characteristic of a vortex breakdown. The non-dimensional velocity profiles show no significant variation with the variation in Reynolds number.
PIV study of the effect of piston position on the in-cylinder swirling flow during the scavenging process in large two-stroke marine diesel engines
2011
D. Rossinelli, L. Chatagny, and P. Koumoutsakos, “Evolutionary optimization of scalar transport in cylinder arrays on multiGPU/multicore architectures,” in Proceedings of EUROGEN 2011, 2011, pp. 773-784. Publisher's VersionAbstract
We optimize a scalar transported by the flow field past an array of cylinders as a model of feeding behavior in swimming schools. We use an evolutionary optimization approach to identify cylinder arrangements that maximize the scalar field in the vicinity of twelve cylinders at ReD = 100. The large number of evaluations required by the Evolutionary Algorithms are compensated by GPU/multicore implementation of particle based flow solvers. We use an FTLE analysis of the flow field to quantify optimal transport and discuss the physical mechanisms associated with optimal configurations.
B. Bayati, P. Chatelain, and P. Koumoutsakos, “Adaptive mesh refinement for stochastic reaction–diffusion processes,” J. Comput. Phys. vol. 230, no. 1, pp. 13–26, 2011. Publisher's VersionAbstract
We present an algorithm for Adaptive Mesh Refinement applied to mesoscopic stochastic simulations of spatially evolving reaction-diffusion processes. The transition rates for the diffusion process are derived on adaptive, locally refined structured meshes. Convergence of the diffusion process is presented and the fluctuations of the stochastic process are verified. Furthermore, a refinement criterion is proposed for the evolution of the adaptive mesh. The method is validated in simulations of reaction-diffusion processes as described by the Fisher-Kolmogorov and Gray-Scott equations.
Adaptive mesh refinement for stochastic reaction–diffusion processes
G. Schwank, G. Tauriello, R. Yagi, E. Kranz, P. Koumoutsakos, and K. Basler, “Antagonistic Growth Regulation by Dpp and Fat Drives Uniform Cell Proliferation,” Dev. Cell, vol. 20, no. 1, pp. 123–130, 2011. Publisher's VersionAbstract
We use the Dpp morphogen gradient in the Drosophila wing disc as a model to address the fundamental question of how a gradient of a growth factor can produce uniform growth. We first show that proper expression and subcellular localization of components in the Fat tumor-suppressor pathway, which have been argued to depend on Dpp activity differences, are not reliant on the Dpp gradient. We next analyzed cell proliferation in discs with uniformly high Dpp or uniformly low Fat signaling activity and found that these pathways regulate growth in a complementary manner. While the Dpp mediator Brinker inhibits growth in the primordium primarily in the lateral regions, Fat represses growth mostly in the medial region. Together, our results indicate that the activities of both signaling pathways are regulated in a parallel rather than sequential manner and that uniform proliferation is achieved by their complementary action on growth.
Antagonistic Growth Regulation by Dpp and Fat Drives Uniform Cell Proliferation
W. M. van Rees, A. Leonard, D. I. Pullin, and P. Koumoutsakos, “A comparison of vortex and pseudo-spectral methods for the simulation of periodic vortical flows at high Reynolds numbers,” J. Comput. Phys. vol. 230, no. 8, pp. 2794–2805, 2011. Publisher's VersionAbstract
We present a validation study for the hybrid particle-mesh vortex method against a pseudo-spectral method for the Taylor–Green vortex at Re=1600 as well as in the collision of two antiparallel vortex tubes at Re=10,000. In this study we present diagnostics such as energy spectra and enstrophy as computed by both methods as well as point-wise comparisons of the vorticity field. Using a fourth order accurate kernel for interpolation between the particles and the mesh, the results of the hybrid vortex method and of the pseudo-spectral method agree well in both flow cases. For the Taylor–Green vortex, the vorticity contours computed by both methods around the time of the energy dissipation peak overlap. The energy spectrum shows that only the smallest length scales in the flow are not captured by the vortex method. In the second flow case, where we compute the collision of two anti-parallel vortex tubes at Reynolds number 10,000, the vortex method results and the pseudo-spectral method results are in very good agreement up to and including the first reconnection of the tubes. The maximum error in the effective viscosity is about 2.5% for the vortex method and about 1% for the pseudo-spectral method. At later times the flows computed with the different methods show the same qualitative features, but the quantitative agreement on vortical structures is lost.
D. Rossinelli, C. Conti, and P. Koumoutsakos, “Mesh-particle interpolations on graphics processing units and multicore central processing units,” Philos. T. Roy. Soc. A, vol. 369, no. 1944, pp. 2164–2175, 2011. Publisher's VersionAbstract
Particle–mesh interpolations are fundamental operations for particle-in-cell codes, as implemented in vortex methods, plasma dynamics and electrostatics simulations. In these simulations, the mesh is used to solve the field equations and the gradients of the fields are used in order to advance the particles. The time integration of particle trajectories is performed through an extensive resampling of the flow field at the particle locations. The computational performance of this resampling turns out to be limited by the memory bandwidth of the underlying computer architecture. We investigate how mesh–particle interpolation can be efficiently performed on graphics processing units (GPUs) and multicore central processing units (CPUs), and we present two implementation techniques. The single-precision results for the multicore CPU implementation show an acceleration of 45–70×, depending on system size, and an acceleration of 85–155× for the GPU implementation over an efficient single-threaded C++ implementation. In double precision, we observe a performance improvement of 30–40× for the multicore CPU implementation and 20–45× for the GPU implementation. With respect to the 16-threaded standard C++ implementation, the present CPU technique leads to a performance increase of roughly 2.8–3.7× in single precision and 1.7–2.4× in double precision, whereas the GPU technique leads to an improvement of 9× in single precision and 2.2–2.8× in double precision.
Multicore/multi-GPU accelerated simulations of multiphase compressible flows using wavelet adapted grids
D. Rossinelli, B. Hejazialhosseini, D. G. Spampinato, and P. Koumoutsakos, “Multicore/Multi-GPU Accelerated Simulations of Multiphase Compressible Flows Using Wavelet Adapted Grids,” SIAM J. Sci. Comput. vol. 33, no. 2, pp. 512–540, 2011. Publisher's VersionAbstract
We present a computational method of coupling average interpolating wavelets with high-order finite volume schemes and its implementation on heterogeneous computer architectures for the simulation of multiphase compressible flows. The method is implemented to take advantage of the parallel computing capabilities of emerging heterogeneous multicore/multi-GPU architectures. A highly efficient parallel implementation is achieved by introducing the concept of wavelet blocks, exploiting the task-based parallelism for CPU cores, and by managing asynchronously an array of GPUs by means of OpenCL. We investigate the comparative accuracy of the GPU and CPU based simulations and analyze their discrepancy for two-dimensional simulations of shock-bubble interaction and Richtmeyer–Meshkov instability. The results indicate that the accuracy of the GPU/CPU heterogeneous solver is competitive with the one that uses exclusively the CPU cores. We report the performance improvements by employing up to 12 cores and 6 GPUs compared to the single-core execution. For the simulation of the shock-bubble interaction at Mach 3 with two million grid points, we observe a 100-fold speedup for the heterogeneous part and an overall speedup of 34.
P. D. Koumoutsakos, B. Bayati, F. Milde, and G. Tauriello, “Particle Simulations of Morphogenesis,” Math. Mod. Meth. Appl. S. vol. 21, no. supp01, pp. 955–1006, 2011. Publisher's VersionAbstract
The simulation of the creation and evolution of biological forms requires the development of computational methods that are capable of resolving their hierarchical, spatial and temporal complexity. Computations based on interacting particles, provide a unique computational tool for discrete and continuous descriptions of morphogenesis of systems ranging from the molecular to the organismal level. The capabilities of particle methods hinge on the simplicity of their formulation which enables the formulation of a unifying computational framework encom- passing deterministic and stochastic models. In this paper, we discuss recent advances in particle methods for the simulation of biological systems at the mesoscopic and the macroscale level. We present results from applications of particle methods including reaction-diffusion on deforming surfaces, deterministic and stochastic descriptions of tumor growth and angiogenesis and discuss successes and challenges of this approach.
M. Gazzola, O. V. Vasilyev, and P. Koumoutsakos, “Shape optimization for drag reduction in linked bodies using evolution strategies,” Comput. Struct. vol. 89, no. 11-12, pp. 1224–1231, 2011. Publisher's VersionAbstract
We present results from the shape optimization of linked bodies for drag reduction in simulations of incompressible flow at moderate Reynolds numbers. The optimization relies on the covariance matrix adaptation evolution strategy (CMA-ES) and the flow simulations use vortex methods with the Brinkman penalization to enforce boundary conditions in complex bodies. We exploit the inherent parallelism of CMA-ES, by implementing a multi-host framework which allows for the distribution of the expensive cost function evaluations across parallel architectures, without being limited to one computing facility. This study repeats in silico for the first time Ingo Rechenberg's pioneering wind tunnel experiments for drag reduction that led to the inception of evolution strategies. The simulations confirm that the results of these experimental studies indicate a flat plate is not the optimal solution for drag reduction in linked bodies. We present the vorticity field of the flow and identify the governing mechanisms for this drag reduction by the slightly corrugated linked plate configuration.
Shape optimization for drag reduction in linked bodies using evolution strategies
M. Gazzola, P. Chatelain, W. M. van Rees, and P. Koumoutsakos, “Simulations of single and multiple swimmers with non-divergence free deforming geometries,” J. Comput. Phys. vol. 230, no. 19, pp. 7093–7114, 2011. Publisher's VersionAbstract
We present a vortex particle method coupled with a penalization technique to simulate single and multiple swimmers in an incompressible, viscous flow in two and three dimensions. The proposed algorithm can handle arbitrarily deforming bodies and their corresponding non-divergence free deformation velocity fields. The method is validated on a number of benchmark problems with stationary and moving boundaries. Results include flows of tumbling objects and single and multiple self-propelled swimmers.

Pages