Commit 08522bc9 authored by Valeriu Codreanu

Merge branch 'master' into 'v1.3'

parents ada719e0 5ed6121e
iwoph17_paper:
  image: aergus/latex
  script:
    - mkdir -vp doc/build/tex
    - latexmk -help
    - cd doc/iwoph17/ && latexmk -output-directory=../build/tex -pdf t72b.tex
  artifacts:
    paths:
      - doc/build/tex/t72b.pdf

pages:
  image: alpine
  script:
    - apk --no-cache add py2-pip python-dev
    - pip install sphinx
    - apk --no-cache add make
    - ls -al
    - pwd
    - cd doc/sphinx && make html && cd -
    - mv doc/build/sphinx/html public
  only:
    - master
  artifacts:
    paths:
      - public
.PHONY: doc clean

doc:
	$(MAKE) html -C doc/sphinx/

clean:
	$(MAKE) clean -C doc/sphinx/
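The Makefile above simply delegates to the Sphinx build; a minimal local usage sketch (it assumes Sphinx is installed, as in the `pages` CI job above):
```shell
# build the HTML documentation (output lands under doc/build/sphinx/html, as used by the CI "pages" job)
make doc

# remove the generated documentation
make clean
```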
# Unified European Applications Benchmark Suite, version 1.3
# Unified (Accelerated) European Applications Benchmark Suite, version 2.0
The Unified European Application Benchmark Suite (UEABS) is a set of 12 application codes taken from the pre-existing PRACE and DEISA application benchmark suites to form a single suite, with the objective of providing a set of scalable, currently relevant and publicly available codes and datasets, of a size which can realistically be run on large systems, and maintained into the future. This work has been undertaken by Task 7.4 "Unified European Applications Benchmark Suite for Tier-0 and Tier-1" in the PRACE Second Implementation Phase (PRACE-2IP) project and will be updated and maintained by subsequent PRACE Implementation Phase projects.
Additionally, this collection includes the Accelerated European Application Benchmark Suite (AEABS), described below.
Application performance on accelerators
=======================================
This is an extension of the UEABS (Unified European Application Benchmark Suite) for accelerators. It is composed of a set of 11 codes: 1 synthetic benchmark and 10 commonly used applications. The key focus of this task has been exploiting accelerators or co-processors to improve the performance of real applications. It aims at providing a set of scalable, currently relevant and publicly available codes and datasets.
This work has been undertaken by Task 7.2B "Accelerator Benchmarks" in the PRACE Fourth Implementation Phase (PRACE-4IP) project.
Most of the selected applications are a subset of the UEABS. The exceptions are PFARM, which comes from PRACE-2IP, and SHOC, the synthetic benchmark suite.
For each code, namely Alya, Code_Saturne, CP2K, GROMACS, GPAW, NAMD, PFARM, QCD, Quantum Espresso, SHOC and SPECFEM3D, two or more test case datasets have been selected. They are described in detail in the PRACE deliverable D7.5 of the fourth implementation phase.
Running the suite
-----------------
Instructions to run each test case of each code can be found in the subdirectories of this repository.
For more details of the codes and datasets, and sample results, please see http://www.prace-ri.eu/IMG/pdf/d7.4_3ip.pdf
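To get started locally, the repository can be cloned and the per-code instructions followed from each subdirectory; a sketch (the clone URL is inferred from the repository links below, and the chosen subdirectory is only an example):
```shell
# fetch the benchmark suite and enter one application's subdirectory
git clone https://repository.prace-ri.eu/git/UEABS/ueabs.git
cd ueabs/specfem3d   # each code ships its own run README
```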
SPECFEM3D (seismic wave propagation) test cases and run instructions:
- Test Case A: http://www.prace-ri.eu/UEABS/SPECFEM3D/SPECFEM3D_TestCaseA.tar.gz
- Test Case B: http://www.prace-ri.eu/UEABS/SPECFEM3D/SPECFEM3D_TestCaseA.tar.gz
- Run instructions: https://repository.prace-ri.eu/git/UEABS/ueabs/blob/v1.3/specfem3d/SPECFEM3D_Run_README.txt
# Alya - Large Scale Computational Mechanics
Alya is a simulation code for high performance computational mechanics. Alya solves coupled multiphysics problems using high performance computing techniques for distributed and shared memory supercomputers, together with vectorization and optimization at the node level.
Homepage: https://www.bsc.es/research-development/research-areas/engineering-simulations/alya-high-performance-computational
Alya is available to collaborating projects, and a specific version is being distributed as part of the PRACE Unified European Applications Benchmark Suite (http://www.prace-ri.eu/ueabs/#ALYA).
## Building Alya for GPU accelerators
The GPU solver library (NINJA) currently supports four solvers: GMRES, Deflated Conjugate Gradient, Conjugate Gradient, and Pipelined Conjugate Gradient.
The only pre-conditioner supported at the moment is 'diagonal'.
Keywords to use the solvers:
```shell
NINJA GMRES : GGMR
NINJA Deflated CG : GDECG
NINJA CG : GCG
NINJA Pipelined CG : GPCG
PRECONDITIONER : DIAGONAL
```
Other options are the same as for the CPU-based solvers.
### GPGPU Building
This version was tested with the Intel Compilers 2017.1, bullxmpi-1.2.9.1 and NVIDIA CUDA 7.5. Ensure that the wrappers `mpif90` and `mpicc` point to the correct binaries and that `$CUDA_HOME` is set.
Alya can be used with just MPI or hybrid MPI-OpenMP parallelism. Standard execution mode is to rely on MPI only.
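Before configuring, it can be worth checking that the expected toolchain is active; a small sketch (module setup and the CUDA installation path are site-specific assumptions):
```shell
# confirm the MPI wrappers resolve to the intended compilers
mpif90 --version
mpicc --version

# CUDA 7.5 must be visible to the build; this path is only an assumption for your site
export CUDA_HOME=/usr/local/cuda-7.5
echo "$CUDA_HOME"
```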
- Uncompress the source and configure the Metis dependency and the Alya build options:
```shell
tar xvf alya-prace-acc.tar.bz2
```
- Edit the file `Alya/Thirdparties/metis-4.0/Makefile.in` to select the compiler and target platform. Uncomment the lines for your compiler and add optimization parameters, e.g.
```shell
OPTFLAGS = -O3 -xCORE-AVX2
```
- Then build Metis4
```shell
$ cd Alya/Executables/unix
$ make metis4
```
- For Alya there are several example configurations, copy one, e.g. for Intel Compilers:
```shell
$ cp configure.in/config_ifort.in config.in
```
- Edit the config.in:
Add the corresponding platform optimization flags to `FCFLAGS`, e.g.
```shell
FCFLAGS = -module $O -c -xCORE-AVX2
```
- MPI: no changes in the configure file are necessary. By default, Metis 4 and 4-byte integers are used.
- MPI-hybrid (with OpenMP): uncomment the following lines for the OpenMP version:
```shell
CSALYA := $(CSALYA) -qopenmp      # use -fopenmp for GCC compilers
EXTRALIB := $(EXTRALIB) -qopenmp  # use -fopenmp for GCC compilers
```
- Configure and build Alya (`-x` builds the release version, `-g` the debug version; for a debug build also uncomment the debug and checking flags in `config.in`):
```shell
./configure -x nastin parall
make NINJA=1 -j num_processors
```
### GPGPU Usage
Each problem needs a `GPUconfig.dat`. A sample is available at `Alya/Thirdparties/ninja` and needs to be copied to the work directory. A README file in the same location provides further information.
- Extract the small one-node test case and configure it to use the GPU solvers:
```shell
$ tar xvf cavity1_hexa_med.tar.bz2 && cd cavity1_hexa_med
$ cp ../Alya/Thirdparties/ninja/GPUconfig.dat .
```
- To use the GPU, you have to replace `GMRES` with `GGMR` and `DEFLATED_CG` with `GDECG`, both in `cavity1_hexa.nsi.dat`
- Edit the job script to submit the calculation to the batch system (a consolidated batch-script sketch is given at the end of this section):
```shell
# job.sh: adjust the path to your Alya.x binary (built with the MPI options)
sbatch job.sh
```
Alternatively execute directly:
```shell
OMP_NUM_THREADS=4 mpirun -np 16 Alya.x cavity1_hexa
```
<!-- Runtime on 16-core Xeon E5-2630 v3 @ 2.40GHz with 2 NVIDIA K80: ~1:30 min -->
<!-- Runtime on 16-core Xeon E5-2630 v3 @ 2.40GHz no GPU: ~2:00 min -->
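The solver switch and the batch submission described above can also be scripted; a minimal sketch (the `#SBATCH` directives, GPU request and `Alya.x` path are placeholders to adapt to your site):
```shell
# switch the solvers in the .nsi.dat file to their GPU (NINJA) counterparts
sed -i 's/\bGMRES\b/GGMR/; s/\bDEFLATED_CG\b/GDECG/' cavity1_hexa.nsi.dat

# write a minimal batch script; partition, GPU request and binary path are placeholders
cat > job.sh << 'EOF'
#!/bin/bash
#SBATCH --job-name=alya-cavity
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --gres=gpu:2
#SBATCH --time=00:30:00

export OMP_NUM_THREADS=4
mpirun -np 16 /path/to/Alya/Executables/unix/Alya.x cavity1_hexa
EOF

sbatch job.sh
```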
## Building Alya for Intel Xeon Phi Knights Landing (KNL)
The Xeon Phi processor version of Alya currently relies on compiler-assisted optimization for AVX-512. Porting of performance-critical kernels to the new assembly instructions is underway. There will not be a version for the first-generation Xeon Phi Knights Corner coprocessors.
### KNL Building
This version was tested with the Intel Compilers 2017.1, Intel MPI 2017.1. Ensure that the wrappers `mpif90` and `mpicc` point to the correct binaries.
Alya can be used with just MPI or hybrid MPI-OpenMP parallelism. Standard execution mode is to rely on MPI only.
- Uncompress the source and configure the Metis dependency and the Alya build options:
```shell
tar xvf alya-prace-acc.tar.bz2
```
- Edit the file `Alya/Thirdparties/metis-4.0/Makefile.in` to select the compiler and target platform. Uncomment the lines for your compiler and add optimization parameters, e.g.
```shell
OPTFLAGS = -O3 -xMIC-AVX512
```
- Then build Metis4
```shell
$ cd Alya/Executables/unix
$ make metis4
```
- For Alya there are several example configurations, copy one, e.g. for Intel Compilers:
```shell
$ cp configure.in/config_ifort.in config.in
```
- Edit the config.in:
Add the corresponding platform optimization flags to `FCFLAGS`, e.g.
```shell
FCFLAGS = -module $O -c -xMIC-AVX512
```
- MPI: no changes in the configure file are necessary. By default, Metis 4 and 4-byte integers are used.
- MPI-hybrid (with OpenMP): uncomment the following lines for the OpenMP version:
```shell
CSALYA := $(CSALYA) -qopenmp      # use -fopenmp for GCC compilers
EXTRALIB := $(EXTRALIB) -qopenmp  # use -fopenmp for GCC compilers
```
- Configure and build Alya (`-x` builds the release version, `-g` the debug version; for a debug build also uncomment the debug and checking flags in `config.in`):
```shell
./configure -x nastin parall
make -j num_processors
```
### KNL Usage
- Extract the small one-node test case:
```shell
$ tar xvf cavity1_hexa_med.tar.bz2 && cd cavity1_hexa_med
$ cp ../Alya/Thirdparties/ninja/GPUconfig.dat .
```
- Edit the job script to submit the calculation to the batch system (a batch-script sketch is given at the end of this section):
```shell
# job.sh: adjust the path to your Alya.x binary (built with the MPI options)
sbatch job.sh
```
Alternatively execute directly:
```shell
OMP_NUM_THREADS=4 mpirun -np 16 Alya.x cavity1_hexa
```
<!-- Runtime on 68-core Xeon Phi(TM) CPU 7250 1.40GHz: ~3:00 min -->
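As above, submission on a KNL node can be scripted; a minimal sketch (the `#SBATCH` directives and the KNL constraint are placeholders to adapt to your site):
```shell
cat > job.sh << 'EOF'
#!/bin/bash
#SBATCH --job-name=alya-cavity-knl
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --constraint=knl      # placeholder; use your site's KNL partition or constraint
#SBATCH --time=00:30:00

export OMP_NUM_THREADS=4
mpirun -np 16 /path/to/Alya/Executables/unix/Alya.x cavity1_hexa
EOF

sbatch job.sh
```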
## Remarks
If the number of elements is too low for a scalability analysis, Alya includes a mesh multiplication technique. It is activated by an input option in the ker.dat file, which gives the number of mesh multiplication levels to apply (0 means no mesh multiplication). At each level the number of elements is multiplied by 8, so a very large mesh can be generated automatically in order to study the scalability of the code on different architectures. Note that the mesh multiplication itself is carried out in parallel and adds little to the overall runtime of the simulation.
************************************************************************************
Code_Saturne 4.2.2 is linked to the PETSC development version in order to benefit from
its GPU implementation. Note that the regular PETSC release does not support GPUs.
************************************************************************************
Installation
************************************************************************************
The version has been tested on K80s with the following settings:
- OPENMPI 2.0.2
- GCC 4.8.5
- CUDA 7.5
To install Code_Saturne 4.2.2, 4 libraries are required: BLAS, LAPACK, SOWING and CUSP.
The tests have been carried out with lapack-3.6.1 for BLAS and LAPACK, sowing-1.1.23-p1
for SOWING and cusplibrary-0.5.1 for CUSP.
PETSC is installed first: PATH_TO_PETSC, PATH_TO_CUSP, PATH_TO_SOWING and PATH_TO_LAPACK
have to be updated in INSTALL_PETSC_GPU_sm37 under petsc-petsc-a31f61e8abd0.
PETSC is configured for K80s by running ./INSTALL_PETSC_GPU_sm37 from petsc-petsc-a31f61e8abd0,
and is then compiled and installed by typing make and make install.
Before installing Code_Saturne, adapt PATH_TO_PETSC in InstallHPC.sh under SATURNE_4.2.2
and type ./InstallHPC.sh.
The code should now be installed and code_saturne found under
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne, which should return:
Usage: ./code_saturne <topic>
Topics:
help
autovnv
bdiff
bdump
compile
config
create
gui
info
run
salome
submit
Options:
-h, --help show this help message and exit
************************************************************************************
Test case - Cavity 13M
************************************************************************************
In CAVITY_13M.tar.gz are found the mesh+partitions and 2 sets of subroutines, one for the CPU and
one for the GPU, i.e.:
CAVITY_13M/PETSC_CPU/SRC/*
CAVITY_13M/PETSC_GPU/SRC/*
CAVITY_13M/MESH/mesh_input_13M
CAVITY_13M/MESH/partition_METIS_5.1.0/*
To prepare a run, a "study" with two directories (cases) has to be set up, one for the CPU and the
other for the GPU, for instance:
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne create --study NEW_CAVITY_13M PETSC_CPU
cd NEW_CAVITY_13M
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne create --case PETSC_GPU
The mesh has to be copied from CAVITY_13M/MESH/mesh_input_13M into NEW_CAVITY_13M/MESH/,
and the same has to be done for partition_METIS_5.1.0.
The subroutines contained in CAVITY_13M/PETSC_CPU/SRC should be copied into NEW_CAVITY_13M/PETSC_CPU/SRC and
the subroutines contained in CAVITY_13M/PETSC_GPU/SRC should be copied into NEW_CAVITY_13M/PETSC_GPU/SRC.
In each DATA subdirectory of NEW_CAVITY_13M/PETSC_CPU and NEW_CAVITY_13M/PETSC_GPU, the path
to the mesh+partition has to be set as:
cd DATA
cp REFERENCE/cs_user_scripts.py .
edit cs_user_scripts.py
At line 138, change None to "../MESH/mesh_input_13M"
At line 139, change None to "../MESH/partition_METIS_5.1.0"
At this stage, everything is set to run both simulations, one for the CPU and the other one for the GPU.
cd NEW_CAVITY_13M/PETSC_CPU
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne run --initialize
cd RESU/YYYYMMDD-HHMM
submit the job
cd NEW_CAVITY_13M/PETSC_GPU
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne run --initialize
cd RESU/YYYYMMDD-HHMM
submit the job
************************************************************************************
Code_Saturne 4.2.2 is installed for KNLs. It is also linked to PETSC, but the
default linear solvers are the native ones.
************************************************************************************
Installation
************************************************************************************
The installation script is:
SATURNE_4.2.2/InstallHPC_with_PETSc.sh
The path to PETSC (official released version, and therefore assumed to be installed
on the machine) should be added to the aforementioned script.
After typing ./InstallHPC_with_PETSc.sh, the code should be installed and code_saturne found under:
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne, which should return:
Usage: ./code_saturne <topic>
Topics:
help
autovnv
bdiff
bdump
compile
config
create
gui
info
run
salome
submit
Options:
-h, --help show this help message and exit
************************************************************************************
Two cases are provided: TGV_256_CS_OPENMP.tar.gz to test the native solvers, and
CAVITY_13M_FOR_KNLs_WITH_PETSC.tar.gz to test Code_Saturne with PETSC on KNLs.
************************************************************************************
First test case: TGV_256_CS_OPENMP.tar.gz
************************************************************************************
In TGV_256_CS_OPENMP.tar.gz are found the mesh and the set of subroutines for Code_Saturne on KNLs, i.e.:
TGV_256_CS_OPENMP/MESH/mesh_input_256by256by256
TGV_256_CS_OPENMP/ARCHER_KNL/SRC/*
To prepare a run, it is required to set up a "study" as, for instance:
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne create --study NEW_TGV_256_CS_OPENMP KNL
The mesh has to be copied from TGV_256_CS_OPENMP/MESH/mesh_input_256by256by256 into NEW_TGV_256_CS_OPENMP/MESH/.
The subroutines contained in TGV_256_CS_OPENMP/ARCHER_KNL/SRC should be copied into NEW_TGV_256_CS_OPENMP/KNL/SRC.
In the DATA subdirectory of NEW_TGV_256_CS_OPENMP/KNL the path to the mesh has to be set as:
cd DATA
cp REFERENCE/cs_user_scripts.py .
edit cs_user_scripts.py
At line 138, change None to "../MESH/mesh_input_256by256by256"
At this stage, everything is set to run the simulation:
cd NEW_TGV_256_CS_OPENMP/KNL
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne run --initialize
cd RESU/YYYYMMDD-HHMM
submit the job
************************************************************************************
Second test case: CAVITY_13M_FOR_KNLs_WITH_PETSC.tar.gz
************************************************************************************
In CAVITY_13M_FOR_KNLs_WITH_PETSC.tar.gz are found the mesh and the set of subroutines for PETSC on KNLs, i.e.:
CAVITY_13M/MESH/mesh_input
CAVITY_13M/KNL/SRC/*
To prepare a run, it is required to set up a "study" as, for instance:
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne create --study NEW_CAVITY_13M PETSC_KNL
The mesh has to be copied from CAVITY_13M/MESH/mesh_input into NEW_CAVITY_13M/MESH/.
The subroutines contained in CAVITY_13M/KNL/SRC should be copied into NEW_CAVITY_13M/PETSC_KNL/SRC
In the DATA subdirectory of NEW_CAVITY_13M/PETSC_KNL the path to the mesh has to be set as:
cd DATA
cp REFERENCE/cs_user_scripts.py .
edit cs_user_scripts.py
At line 138, change None to "../MESH/mesh_input"
At this stage, everything is set to run the simulation:
cd NEW_CAVITY_13M/PETSC_KNL
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne run --initialize
cd RESU/YYYYMMDD-HHMM
submit the job
# Code_Saturne
## GPU Version
Code_Saturne 4.2.2 is linked to the PETSC development version in order to benefit from
its GPU implementation. Note that the regular PETSC release does not support GPUs.
### Installation
The version has been tested on K80s with the following settings:
* OPENMPI 2.0.2
* GCC 4.8.5
* CUDA 7.5
To install Code_Saturne 4.2.2, 4 libraries are required: BLAS, LAPACK, SOWING and CUSP.
The tests have been carried out with lapack-3.6.1 for BLAS and LAPACK, sowing-1.1.23-p1
for SOWING and cusplibrary-0.5.1 for CUSP.
PETSC is installed first: `PATH_TO_PETSC`, `PATH_TO_CUSP`, `PATH_TO_SOWING` and `PATH_TO_LAPACK`
have to be updated in `INSTALL_PETSC_GPU_sm37` under `petsc-petsc-a31f61e8abd0`.
PETSC is configured for K80s by running `./INSTALL_PETSC_GPU_sm37` from `petsc-petsc-a31f61e8abd0`,
and is then compiled and installed by typing `make` and `make install`.
Before installing Code\_Saturne, adapt `PATH_TO_PETSC` in `InstallHPC.sh` under `SATURNE_4.2.2`
and type `./InstallHPC.sh`.
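The sequence can be summarised as follows; a sketch only, using the placeholder paths from above (the exact PETSC configure options live inside `INSTALL_PETSC_GPU_sm37`):
```shell
# configure PETSC for K80s (sm_37) after editing the PATH_TO_* variables in the script
cd petsc-petsc-a31f61e8abd0
./INSTALL_PETSC_GPU_sm37
make
make install

# then build Code_Saturne against this PETSC installation
cd PATH_TO_CODE_SATURNE/SATURNE_4.2.2
# edit InstallHPC.sh so that PATH_TO_PETSC points at the installation above
./InstallHPC.sh
```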
The code should then be installed and `code_saturne` found under:
```
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne
```
Running it should return:
```
Usage: ./code_saturne <topic>
Topics:
help
autovnv
bdiff
bdump
compile
config
create
gui
info
run
salome
submit
Options:
-h, --help show this help message and exit
```
### Test case - Cavity 13M
In `CAVITY_13M.tar.gz` are found the mesh+partitions and 2 sets of subroutines, one for the CPU and
one for the GPU, i.e.:
```
CAVITY_13M/PETSC_CPU/SRC/*
CAVITY_13M/PETSC_GPU/SRC/*
CAVITY_13M/MESH/mesh_input_13M
CAVITY_13M/MESH/partition_METIS_5.1.0/*
```
To prepare a run, a "study" with two directories (cases) has to be set up, one for the CPU and the
other for the GPU, for instance:
```
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne create --study NEW_CAVITY_13M PETSC_CPU
cd NEW_CAVITY_13M
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne create --case PETSC_GPU
```
The mesh has to be copied from `CAVITY_13M/MESH/mesh_input_13M` into `NEW_CAVITY_13M/MESH/`,
and the same has to be done for `partition_METIS_5.1.0`.
The subroutines contained in `CAVITY_13M/PETSC_CPU/SRC` should be copied into `NEW_CAVITY_13M/PETSC_CPU/SRC` and
the subroutines contained in `CAVITY_13M/PETSC_GPU/SRC` should be copied into `NEW_CAVITY_13M/PETSC_GPU/SRC`.
In each DATA subdirectory of `NEW_CAVITY_13M/PETSC_CPU` and `NEW_CAVITY_13M/PETSC_GPU`, the path
to the mesh+partition has to be set as:
```
cd DATA
cp REFERENCE/cs_user_scripts.py .
```
Then edit `cs_user_scripts.py` (a `sed` alternative is sketched after this list):
* At line 138, change `None` to `"../MESH/mesh_input_13M"`
* At line 139, change `None` to `"../MESH/partition_METIS_5.1.0"`
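Equivalently, the two line edits can be applied with `sed` by line number; a sketch that mirrors the instructions above (it assumes the stock `cs_user_scripts.py` still has `None` on those lines):
```shell
# point the study at the mesh and the METIS partition (lines 138 and 139 of cs_user_scripts.py)
sed -i '138s|None|"../MESH/mesh_input_13M"|' cs_user_scripts.py
sed -i '139s|None|"../MESH/partition_METIS_5.1.0"|' cs_user_scripts.py
```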
At this stage, everything is set to run both simulations, one for the CPU and the other one for the GPU.
```
cd NEW_CAVITY_13M/PETSC_CPU
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne run --initialize
cd RESU/YYYYMMDD-HHMM
```
Then submit the job.
```
cd NEW_CAVITY_13M/PETSC_GPU
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne run --initialize
cd RESU/YYYYMMDD-HHMM
```
Then submit the job (a batch-script sketch follows).
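How the job is submitted is site-specific; a minimal SLURM sketch, assuming the run directory prepared by `code_saturne run --initialize` contains the solver executable `cs_solver` (directives, task counts and modules are placeholders):
```shell
cat > job.sh << 'EOF'
#!/bin/bash
#SBATCH --job-name=cs-cavity13m
#SBATCH --nodes=1
#SBATCH --ntasks=24           # placeholder; match your node/core layout
#SBATCH --time=01:00:00

# launch the Code_Saturne solver prepared in this RESU/YYYYMMDD-HHMM directory
mpirun -np 24 ./cs_solver
EOF

sbatch job.sh
```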
## KNL Version
Code_Saturne 4.2.2 is installed for KNLs. It is also linked to PETSC, but the
default linear solvers are the native ones.
### Installation
The installation script is:
```
SATURNE_4.2.2/InstallHPC_with_PETSc.sh
```
The path to PETSC (official released version, and therefore assumed to be installed
on the machine) should be added to the aforementioned script.
After typing `./InstallHPC_with_PETSc.sh`, the code should be installed and `code_saturne` found under:
```
PATH_TO_CODE_SATURNE/SATURNE_4.2.2/code_saturne-4.2.2/arch/Linux/bin/code_saturne
```
Running it should return:
```
Usage: ./code_saturne <topic>
Topics:
help
autovnv
bdiff
bdump
compile
config
create
gui
info
run
salome
submit
Options:
-h, --help show this help message and exit
```
### Running the code
Two cases are provided: `TGV_256_CS_OPENMP.tar.gz` to test the native solvers, and
`CAVITY_13M_FOR_KNLs_WITH_PETSC.tar.gz` to test Code_Saturne with PETSC on KNLs.
#### First test case: TGV_256_CS_OPENMP
In `TGV_256_CS_OPENMP.tar.gz` are found the mesh and the set of subroutines for Code_Saturne on KNLs, i.e.:
```
TGV_256_CS_OPENMP/MESH/mesh_input_256by256by256
TGV_256_CS_OPENMP/ARCHER_KNL/SRC/*
```