This section presents the results of UEABS on both GPU and KNL systems. The benchmark suite is made of two sets of codes that overlap. The first set is meant to run on standard CPUs and the second has been ported to accelerators. The accelerated suite is described in the PRACE 4IP Deliverable 7.5, and the standard suite is described on the official PRACE UEABS webpage.
Two metrics are reported systematically: time to solution and energy to solution. This choice ensures that exactly the same computation is measured on every system. In addition, some codes feature specific performance metrics that, for example, leave out the warm-up and teardown phases. Such metrics are not biased by these phases, so small benchmark test cases can give more information about hypothetical production runs. Unfortunately, no equivalent is available yet for energy, so these code-specific metrics are shown as *side metrics*.
To keep results comparable between machines, the :code:`Cumulative (all nodes) Total energy (J)` counter has been selected for the GPU machine and the :code:`nodes.energy` counter for the KNL prototype. Both measure the consumption of full nodes in Joules.
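
The difference between time to solution and a code-specific metric that leaves out warm-up and teardown can be illustrated with a minimal sketch. The class ``PhaseTimings``, the function names and all timing values below are hypothetical and chosen only for illustration; they are not measurements and do not come from any UEABS code.

```python
from dataclasses import dataclass


@dataclass
class PhaseTimings:
    """Hypothetical wall-clock timings (seconds) of one benchmark run."""
    warm_up: float
    compute: float
    teardown: float


def time_to_solution(t: PhaseTimings) -> float:
    # Full run, as seen by the batch system: every phase counts.
    return t.warm_up + t.compute + t.teardown


def code_specific_time(t: PhaseTimings) -> float:
    # Assumed code-specific metric: only the main compute phase.
    return t.compute


def fixed_cost_share(t: PhaseTimings) -> float:
    # Fraction of the time to solution spent in warm-up and teardown.
    return 1.0 - code_specific_time(t) / time_to_solution(t)


# Illustrative values only: same fixed costs, different compute length.
small_case = PhaseTimings(warm_up=20.0, compute=60.0, teardown=20.0)
production_like = PhaseTimings(warm_up=20.0, compute=960.0, teardown=20.0)

for name, run in [("small", small_case), ("production-like", production_like)]:
    print(name, time_to_solution(run), round(fixed_cost_share(run), 2))
```

With these made-up numbers, warm-up and teardown account for a much larger share of the small test case than of the longer run, which is why a metric restricted to the compute phase can be more representative of production behaviour.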
Each code is presented along with a short description and the full set of metrics. The section ends with a recap chart showing a line of metrics picked for their relevance.
ALYA
^^^^
Alya is a high-performance computational mechanics code that can solve different coupled mechanics problems.
The code is parallelised with MPI and OpenMP. Two OpenMP strategies are available: one without colouring, and one with a colouring strategy that avoids ATOMICs during the assembly step. A CUDA version is also available for the different solvers.
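
The idea behind a colouring strategy can be sketched as follows. This is a generic greedy colouring written in Python for readability, not Alya's actual implementation; the function names and the small 1D mesh are hypothetical. Elements that share a mesh node receive different colours, so all elements of one colour can be assembled concurrently without atomic updates.

```python
from collections import defaultdict


def colour_elements(elements):
    """Greedy colouring: two elements sharing a node never get the same colour."""
    node_to_elems = defaultdict(list)
    for e, nodes in enumerate(elements):
        for n in nodes:
            node_to_elems[n].append(e)

    colours = [-1] * len(elements)
    for e, nodes in enumerate(elements):
        # Colours already taken by neighbouring elements (those sharing a node).
        used = {colours[o] for n in nodes for o in node_to_elems[n]
                if colours[o] >= 0}
        c = 0
        while c in used:
            c += 1
        colours[e] = c
    return colours


def assemble(elements, colours, n_nodes):
    """Accumulate a unit contribution of every element into its nodes."""
    rhs = [0.0] * n_nodes
    for c in range(max(colours) + 1):
        # Within one colour no two elements touch the same node, so this
        # inner loop is the one that could run in parallel without atomics.
        for e, nodes in enumerate(elements):
            if colours[e] == c:
                for n in nodes:
                    rhs[n] += 1.0
    return rhs


# Hypothetical 1D mesh: four two-node elements in a chain.
elements = [(0, 1), (1, 2), (2, 3), (3, 4)]
colours = colour_elements(elements)
rhs = assemble(elements, colours, n_nodes=5)
print(colours, rhs)
```

The trade-off is that the loop over colours is sequential, so colouring exchanges atomic updates for a small number of synchronisation points.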
Code_Saturne
^^^^^^^^^^^^
Code_Saturne is a CFD software package developed by EDF R&D since 1997 and open-source since 2007.
Parallelism is handled by distributing the domain over the processors. Communications between subdomains are handled by MPI. Hybrid parallelism using MPI/OpenMP has recently been optimised for improved multicore performance. PETSc, which supports CUDA, has recently been linked to the code to offer alternatives to the internal solvers for computing the pressure.
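
Distributing a domain over processes can be illustrated with a deliberately simplified 1D sketch. It is not Code_Saturne's partitioner, which works on unstructured meshes, and it does not call MPI; the function names are hypothetical and the sketch assumes at least as many cells as ranks. It only shows which cells each rank would own and which neighbouring cells it would have to receive.

```python
def partition(n_cells, n_ranks):
    """Split cells 0..n_cells-1 into contiguous blocks, one per rank."""
    base, extra = divmod(n_cells, n_ranks)
    blocks = []
    start = 0
    for rank in range(n_ranks):
        # The first `extra` ranks take one additional cell.
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def halo_cells(blocks, rank):
    """Cells owned by neighbouring ranks that `rank` needs (one-cell stencil)."""
    halo = []
    if rank > 0:
        halo.append(blocks[rank - 1][-1])   # last cell of the left neighbour
    if rank < len(blocks) - 1:
        halo.append(blocks[rank + 1][0])    # first cell of the right neighbour
    return halo


blocks = partition(n_cells=10, n_ranks=3)
for rank, block in enumerate(blocks):
    print(rank, list(block), halo_cells(blocks, rank))
```

In a real run the halo values would be exchanged between subdomains through MPI messages at each step; here they are only listed.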
CP2K
^^^^
CP2K is a quantum chemistry and solid state physics software package.
Parallelisation is achieved using a combination of OpenMP-based multi-threading and MPI. Offloading for accelerators is implemented through CUDA.
GADGET
^^^^^^
GENE
^^^^
GPAW
^^^^
GPAW is a DFT program for ab-initio electronic structure calculations using the projector augmented wave method.
GPAW is written mostly in Python, but also includes computational kernels written in C and leverages external libraries such as NumPy, BLAS and ScaLAPACK. Offloading to accelerators is supported through CUDA for GPU and through pyMIC for MIC (Intel Xeon Phi).
GROMACS
^^^^^^^
GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.
Parallelisation is achieved using combined OpenMP and MPI. Offloading for accelerators is implemented through CUDA for GPU and through OpenMP for MIC (Intel Xeon Phi).
NAMD
^^^^
NAMD is a widely used molecular dynamics application designed to simulate bio-molecular systems on a wide variety of compute platforms.
It is written in C++ and parallelised using Charm++ parallel objects, which are implemented on top of MPI.
NEMO
^^^^
PFARM
^^^^^
PFARM is part of a suite of programs based on the ‘R-matrix’ ab-initio approach to the variational solution of the many-electron Schrödinger equation for electron-atom and electron-ion scattering.
It is parallelised using hybrid MPI / OpenMP and CUDA offloading to GPU.
QCD
^^^
Quantum Espresso
^^^^^^^^^^^^^^^^
QUANTUM ESPRESSO is an integrated suite of computer codes for electronic-structure calculations and materials modelling, based on density-functional theory, plane waves, and pseudopotentials.
It is parallelised using MPI, with CUDA offloading to GPU.
SHOC
^^^^
The Accelerator Benchmark Suite will also include a series of synthetic benchmarks.
SHOC is written in C++ and is MPI-based. Offloading for accelerators is implemented through CUDA and OpenCL for GPU.
Specfem3D_Globe
^^^^^^^^^^^^^^^
The software package SPECFEM3D_Globe simulates three-dimensional global and regional seismic wave propagation based upon the spectral-element method.
It is written in Fortran and uses MPI combined with OpenMP to achieve parallelisation.