Commit a2bf08ee authored by Holly Judge

Merge branch 'r2.2-dev' of https://repository.prace-ri.eu/git/UEABS/ueabs into r2.2-dev

parents 127e20c1 7940f5c2
@@ -8,12 +8,12 @@ The UEABS has been and will be actively updated and maintained by the subsequent
Each application code has either one or two input datasets. If there are two datasets, Test Case A is designed to run on Tier-1 sized systems (up to around 1,000 x86 cores, or equivalent) and Test Case B is designed to run on Tier-0 sized systems (up to around 10,000 x86 cores, or equivalent). If there is only one dataset (Test Case A), it is suitable for both sizes of system.
Contacts: Valeriu Codreanu <mailto:valeriu.codreanu@surfsara.nl> or Walter Lioen <mailto:walter.lioen@surfsara.nl>
Contacts: (OK to mention all BCOs here? Ask PMO for a UEABS contact mailing list address?), Walter Lioen <mailto:walter.lioen@surf.nl>
Current Release
---------------
The current release is Version 2.1 (April 30, 2019).
The current release is Version 2.2 (December 31, 2021).
See also the [release notes and history](RELEASES.md).
Running the suite
@@ -21,10 +21,196 @@ Running the suite
Instructions to run each test case of each code can be found in the subdirectories of this repository.
For more details of the codes and datasets, and sample results, please see the PRACE-5IP benchmarking deliverable D7.5 "Evaluation of Accelerated and Non-accelerated Benchmarks" (April 18, 2019) at http://www.prace-ri.eu/public-deliverables/ .
For more details of the codes and datasets, and sample results, please see the PRACE-6IP benchmarking deliverable D7.5 "Evaluation of Benchmark Performance" (November 30, 2021) at http://www.prace-ri.eu/public-deliverables/ .
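To get started, cloning this repository is enough; a minimal sketch (the repository URL and the r2.2-dev branch name are taken from this merge):

```
# Fetch the UEABS suite and switch to the release branch
git clone https://repository.prace-ri.eu/git/UEABS/ueabs.git
cd ueabs
git checkout r2.2-dev
# each code's instructions live in its own subdirectory, e.g. alya/ or gpaw/
```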
The application codes that constitute the UEABS are:
---------------------------------------------------
<table>
<thead>
<tr>
<th rowspan="2">Application</th>
<th rowspan="2">Lines of<br/>Code</th>
<th colspan="3">Parallelism</th>
<th colspan="4">Language</th>
<th rowspan="2">Code Description/Notes</th>
</tr>
<tr>
<th>MPI</th>
<th>OpenMP/<br/>Pthreads</th>
<th>GPU</th>
<th>Fortran</th>
<th>Python</th>
<th>C</th>
<th>C++</th>
</tr>
</thead>
<tbody>
<tr>
<td>Alya
<ul>
<li><a href="https://www.bsc.es/computer-applications/alya-system">website</a></li>
<li><a href="https://gitlab.com/bsc-alya/open-alya">source</a></li>
<li><a href="alya/README.md">instructions</a></li>
<li><a href="https://gitlab.com/bsc-alya/benchmarks/sphere-16M">Test Case A</a></li>
<li><a href="https://gitlab.com/bsc-alya/benchmarks/sphere-132M">Test Case B</a></li>
</ul>
</td>
<td>600,000</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td></td>
<td></td>
<td></td>
<td>The Alya System is a Computational Mechanics code capable of solving different physics, each one with its own modelization characteristics, in a coupled way. Among the problems it solves are: convection-diffusion reactions, incompressible flows, compressible flows, turbulence, bi-phasic flows and free surface, excitable media, acoustics, thermal flow, quantum mechanics (DFT) and solid mechanics (large strain).</td>
</tr>
<tr>
<td>Code_Saturne</td>
<td>~350,000</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>The code solves the Navier-Stokes equations for incompressible/compressible flows using a predictor-corrector technique. The Poisson pressure equation is solved by a Conjugate Gradient preconditioned by a multi-grid algorithm, and the transport equations by Conjugate Gradient-like methods. Advanced gradient reconstruction is also available to account for distorted meshes.</td>
</tr>
<tr>
<td>CP2K</td>
<td>~1,150,000</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td></td>
<td></td>
<td></td>
<td>CP2K is a freely available quantum chemistry and solid-state physics software package for performing atomistic simulations. It can be run with MPI, OpenMP and CUDA. All of CP2K is MPI parallelised, with some routines making use of OpenMP, which can be used to reduce the memory footprint. In addition some linear algebra operations may be offloaded to GPUs using CUDA.</td>
</tr>
<tr>
<td>GADGET</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>GPAW</td>
<td>132,000</td>
<td>X</td>
<td></td>
<td>X</td>
<td></td>
<td>X</td>
<td>X</td>
<td></td>
<td></td>
</tr>
<tr>
<td>GROMACS</td>
<td>3,227,337</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td></td>
<td></td>
<td>X</td>
<td>X</td>
<td></td>
</tr>
<tr>
<td>NAMD</td>
<td>1,992,651</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td></td>
<td></td>
<td></td>
<td>X</td>
<td></td>
</tr>
<tr>
<td>NEMO</td>
<td>154,240</td>
<td>X</td>
<td></td>
<td></td>
<td>X</td>
<td></td>
<td></td>
<td>X</td>
<td>NEMO (Nucleus for European Modelling of the Ocean) is a mathematical modelling framework for research activities and prediction services in ocean and climate sciences developed by a European consortium. It is intended to be a tool for studying the ocean and its interaction with the other components of the earth climate system over a large number of space and time scales. It comprises the core engines OPA (ocean dynamics and thermodynamics), SI3 (sea ice dynamics and thermodynamics), TOP (oceanic tracers) and PISCES (biogeochemical processes).</td>
</tr>
<tr>
<td>PFARM</td>
<td>21,434</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td></td>
<td></td>
<td></td>
<td>PFARM uses an R-matrix ab-initio approach to calculate electron-atom and electron-molecule collision data for a wide range of applications including astrophysics and nuclear fusion. It is written in modern Fortran/MPI/OpenMP and exploits highly-optimised dense linear algebra numerical library routines.</td>
</tr>
<tr>
<td>QCD</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Quantum&nbsp;ESPRESSO</td>
<td>92,996</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>SPECFEM3D</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>TensorFlow</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
- [ALYA](#alya)
- [Code_Saturne](#saturne)
@@ -46,11 +232,10 @@ The application codes that constitute the UEABS are:
The Alya System is a Computational Mechanics code capable of solving different physics, each one with its own modelization characteristics, in a coupled way. Among the problems it solves are: convection-diffusion reactions, incompressible flows, compressible flows, turbulence, bi-phasic flows and free surface, excitable media, acoustics, thermal flow, quantum mechanics (DFT) and solid mechanics (large strain). ALYA is written in Fortran 90/95 and parallelized using MPI and OpenMP.
- Web site: https://www.bsc.es/computer-applications/alya-system
- Code download: https://repository.prace-ri.eu/ueabs/ALYA/2.1/Alya.tar.gz
- Build instructions: https://repository.prace-ri.eu/git/UEABS/ueabs/blob/r2.1/alya/ALYA_Build_README.txt
- Test Case A: https://repository.prace-ri.eu/ueabs/ALYA/2.1/TestCaseA.tar.gz
- Test Case B: https://repository.prace-ri.eu/ueabs/ALYA/2.1/TestCaseB.tar.gz
- Run instructions: https://repository.prace-ri.eu/git/UEABS/ueabs/blob/r2.1/alya/ALYA_Run_README.txt
- Code download: https://gitlab.com/bsc-alya/open-alya
- Build and run instructions: https://repository.prace-ri.eu/git/UEABS/ueabs/-/blob/r2.2-dev/alya/README.md
- Test Case A: https://gitlab.com/bsc-alya/benchmarks/sphere-16M
- Test Case B: https://gitlab.com/bsc-alya/benchmarks/sphere-132M
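For orientation, fetching the sources and Test Case A might look like the sketch below; the exact repository layout is not verified here:

```
# Clone the Alya sources and the 16.7M-element sphere benchmark (sketch)
git clone https://gitlab.com/bsc-alya/open-alya.git
git clone https://gitlab.com/bsc-alya/benchmarks/sphere-16M.git
```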
# Code_Saturne <a name="saturne"></a>
@@ -83,14 +268,20 @@ CP2K is written in Fortran 2008 and can be run in parallel using a combination o
# GADGET <a name="gadget"></a>
GADGET is a freely available code for cosmological N-body/SPH simulations on massively parallel computers with distributed memory, written by Volker Springel, Max Planck Institute for Astrophysics, Garching, Germany. GADGET is written in C and uses an explicit communication model that is implemented with the standardized MPI communication interface. The code can be run on essentially all supercomputer systems presently in use, including clusters of workstations or individual PCs. GADGET computes gravitational forces with a hierarchical tree algorithm (optionally in combination with a particle-mesh scheme for long-range gravitational forces) and represents fluids by means of smoothed particle hydrodynamics (SPH). The code can be used for studies of isolated systems, or for simulations that include the cosmological expansion of space, either with or without periodic boundary conditions. In all these types of simulations, GADGET follows the evolution of a self-gravitating collisionless N-body system, and allows gas dynamics to be optionally included. Both the force computation and the time stepping of GADGET are fully adaptive, with a dynamic range that is, in principle, unlimited. GADGET can therefore be used to address a wide array of interesting astrophysical problems, ranging from colliding and merging galaxies to the formation of large-scale structure in the Universe. With the inclusion of additional physical processes such as radiative cooling and heating, GADGET can also be used to study the dynamics of the gaseous intergalactic medium, or to address star formation and its regulation by feedback processes.
GADGET-4 (GAlaxies with Dark matter and Gas intEracT), an evolved and improved version of GADGET-3, is a freely available code for cosmological N-body/SPH simulations on massively parallel computers with distributed memory, written mainly by Volker Springel, Max Planck Institute for Astrophysics, Garching, Germany, and benefiting from numerous contributions, including Ruediger Pakmor, Oliver Zier, and Martin Reinecke. GADGET-4 supports collisionless simulations and smoothed particle hydrodynamics on massively parallel computers. All communication between concurrent execution processes is done either explicitly by means of the message passing interface (MPI), or implicitly through shared-memory accesses on processes on multi-core nodes. The code is mostly written in ISO C++ (assuming the C++11 standard), and should run on all parallel platforms that support at least MPI-3. So far, the compatibility of the code with current Linux/UNIX-based platforms has been confirmed on a large number of systems.
The code can be used for plain Newtonian dynamics, or for cosmological integrations in arbitrary cosmologies, with or without periodic boundary conditions. Stretched periodic boxes, and special cases such as simulations with two periodic dimensions and one non-periodic dimension, are supported as well. The modeling of hydrodynamics is optional. The code is adaptive both in space and in time, and its Lagrangian character makes it particularly suitable for simulations of cosmic structure formation. Several post-processing options such as group- and substructure finding, or power spectrum estimation, are built in and can be carried out on the fly or applied to existing snapshots. Through a built-in cosmological initial conditions generator, it is also particularly easy to carry out cosmological simulations. In addition, merger trees can be determined directly by the code.
- Web site: https://wwwmpa.mpa-garching.mpg.de/gadget4
- Code download: https://gitlab.mpcdf.mpg.de/vrs/gadget4
- Build and run instructions: https://wwwmpa.mpa-garching.mpg.de/gadget4/02_running.html
- Benchmarks:
- [Case A: Colliding galaxies with star formation](./gadget/4.0/gadget4-case-A.tar.gz)
- [Case B: Cosmological DM-only simulation with IC creation](./gadget/4.0/gadget4-case-B.tar.gz)
- [Case C: Adiabatic collapse of a gas sphere](./gadget/4.0/gadget4-case-C.tar.gz)
- [Code used in the benchmarks](./gadget/4.0/gadget4.tar.gz)
- [Build & run instructions, details about the benchmarks](./gadget/4.0/README.md)
- Web site: http://www.mpa-garching.mpg.de/gadget/
- Code download: https://repository.prace-ri.eu/ueabs/GADGET/gadget3_Source.tar.gz
- Disclaimer: please note that by downloading the code from this website, you agree to be bound by the terms of the GPL license.
- Build instructions: https://repository.prace-ri.eu/git/UEABS/ueabs/blob/r1.3/gadget/gadget3_Build_README.txt
- Test Case A: https://repository.prace-ri.eu/ueabs/GADGET/gadget3_TestCaseA.tar.gz
- Run instructions: https://repository.prace-ri.eu/git/UEABS/ueabs/blob/r1.3/gadget/gadget3_Run_README.txt
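As rough orientation (the authoritative steps are in the bundled README and the GADGET-4 documentation linked above), a build and run typically follow the pattern sketched below; the archive name, core count, and parameter file are illustrative only:

```
# Sketch of a GADGET-4 build and run; see gadget/4.0/README.md for specifics
tar xzf gadget4.tar.gz && cd gadget4      # hypothetical unpack location
cp Template-Config.sh Config.sh           # select compile-time options for the case
make -j 8                                 # needs MPI, GSL, FFTW3 and HDF5
mpirun -np 16 ./Gadget4 param.txt         # run with the case's parameter file
```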
# GPAW <a name="gpaw"></a>
@@ -103,14 +294,15 @@ The equations of the (time-dependent) density functional theory within the PAW m
The program offers several parallelization levels. The most basic parallelization strategy is domain decomposition over the real-space grid. In magnetic systems it is possible to parallelize over spin, and in systems that have k-points (surfaces or bulk systems) parallelization over k-points is also possible. Furthermore, parallelization over electronic states is possible in DFT and in real-time TD-DFT calculations. GPAW is written in Python and C and parallelized with MPI.
- Web site: https://wiki.fysik.dtu.dk/gpaw/
- Code download: https://gitlab.com/gpaw/gpaw
- Build instructions: [gpaw/README.md#install](gpaw/README.md#install)
- Code download: [gpaw GitLab repository](https://gitlab.com/gpaw/gpaw) or [gpaw on PyPI](https://pypi.org/project/gpaw/)
- Build instructions: [gpaw README, section "Mechanics of building the benchmark"](gpaw/README.md#mechanics-of-building-the-benchmark)
- Benchmarks:
- [Case S: Carbon nanotube](gpaw/benchmark/carbon-nanotube)
- [Case M: Copper filament](gpaw/benchmark/copper-filament)
- [Case L: Silicon cluster](gpaw/benchmark/silicon-cluster)
- [Case S: Carbon nanotube](gpaw/benchmark/1_S_carbon-nanotube/input.py)
- [Case M: Copper filament](gpaw/benchmark/2_M_copper-filament/input.py)
- [Case L: Silicon cluster](gpaw/benchmark/3_L_silicon-cluster/input.py)
- Run instructions:
[gpaw/README.md#running-the-benchmarks](gpaw/README.md#running-the-benchmarks)
[gpaw README, section "Mechanics of running the benchmark"](gpaw/README.md#mechanics-of-running-the-benchmark)
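For a first impression (the gpaw README linked above is authoritative), running Case S under MPI might look like this sketch; the module name and task count are placeholders:

```
# Run the Case S benchmark with MPI (sketch; adapt to your system)
module load gpaw                  # or install GPAW from PyPI
srun -n 256 gpaw python gpaw/benchmark/1_S_carbon-nanotube/input.py
```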
# GROMACS <a name="gromacs"></a>
@@ -275,7 +467,7 @@ The SHOC benchmark suite currently contains benchmark programs, categoried based
# SPECFEM3D <a name="specfem3d"></a>
| **General information** | **Scientific field** | **Language** | **MPI** | **OpenMP** | **GPU** | **LoC** | **Code description** |
|------------------|----------------------|--------------|---------|------------|---------------------|---------|-------------------------------------------------------------------------------------------------------------------------------------------------------|
| [- Website](https://geodynamics.org/cig/software/specfem3d_globe/) <br>[- Source](https://github.com/geodynamics/specfem3d_globe.git) <br>[- Bench](https://repository.prace-ri.eu/git/UEABS/ueabs/tree/r2.1-dev/specfem3d) <br>[- Summary](https://repository.prace-ri.eu/git/UEABS/ueabs/blob/r2.1-dev/specfem3d/PRACE_UEABS_Specfem3D_summary.pdf) | Geodynamics | Fortran | yes | yes | Yes (CUDA) | 140000 | The software package SPECFEM3D simulates three-dimensional global and regional seismic wave propagation based upon the spectral-element method (SEM). |
| [- Website](https://geodynamics.org/cig/software/specfem3d_globe/) <br>[- Source](https://github.com/geodynamics/specfem3d_globe.git) <br>[- Bench](https://repository.prace-ri.eu/git/UEABS/ueabs/tree/r2.1-dev/specfem3d) <br>[- Summary](https://repository.prace-ri.eu/git/UEABS/ueabs/blob/r2.1-dev/specfem3d/PRACE_UEABS_Specfem3D_summary.pdf) | Geodynamics | Fortran & C | yes | yes | Yes (CUDA) | 100k Fortran & 20k C | The software package SPECFEM3D simulates three-dimensional global and regional seismic wave propagation based upon the spectral-element method (SEM). |
# TensorFlow <a name="tensorflow"></a>
......
# UEABS Releases
## Version 2.2 (PRACE-6IP, December 31, 2021)
* Changed the presentation, making it similar to the CORAL Benchmarks (cf. <a href="https://asc.llnl.gov/coral-benchmarks">CORAL Benchmarks</a> and <a href="https://asc.llnl.gov/coral-2-benchmarks">CORAL-2 Benchmarks</a>)
* Removed the SHOC benchmark suite
* Added the TensorFlow benchmark
* Alya ...
* ...
* ...
* TensorFlow ...
* Updated the benchmark suite to the status as used for the PRACE-6IP benchmarking deliverable D7.5 "Evaluation of Benchmark Performance" (November 30, 2021)
## Version 2.1 (PRACE-5IP, April 30, 2019)
* Updated the benchmark suite to the status as used for the PRACE-5IP benchmarking deliverable D7.5 "Evaluation of Accelerated and Non-accelerated Benchmarks" (April 18, 2019)
......
@@ -11,45 +11,43 @@ The Alya System is a Computational Mechanics code capable of solving different p
* Web site: https://www.bsc.es/computer-applications/alya-system
* Code download: https://repository.prace-ri.eu/ueabs/ALYA/2.1/Alya.tar.gz
* Code download: https://gitlab.com/bsc-alya/open-alya
* Test Case A: https://repository.prace-ri.eu/ueabs/ALYA/2.1/TestCaseA.tar.gz
* Test Case A: https://gitlab.com/bsc-alya/benchmarks/sphere-16M
* Test Case B: https://repository.prace-ri.eu/ueabs/ALYA/2.1/TestCaseB.tar.gz
* Test Case B: https://gitlab.com/bsc-alya/benchmarks/sphere-132M
## Mechanics of Building Benchmark
Alya builds the makefile from the compilation options defined in config.in. In order to build Alya (Alya.x), please follow these steps after unpacking the tar.gz:
You can compile Alya using CMake. It follows the classic CMake configuration, except for the compiler management, which has been customized by the developers.
Go to the directory Executables/unix:
### Creation of the build directory
```
cd Executables/unix
```
In your alya directory, create a new build directory:
Edit config.in (some default config.in files can be found in the directory configure.in):
* Select your own MPI wrappers and paths
* Select the size of integers: the default is 4 bytes; for 8 bytes, select -DI8
* Choose your METIS version: metis-4.0, or metis-5.1.0_i8 for 8-byte integers
```
mkdir build
cd build
```
Configure Alya:
### Configuration
./configure -x nastin parall
To configure CMake using the command line, type the following:
Compile metis:
```
cmake ..
```
make metis4
If you want to customize the build options, use -DOPTION=value. For example, to enable GPU support:
or
```
cmake .. -DWITH_GPU=ON
```
make metis5
### Compilation
Finally, compile Alya:
```
make -j 8
```
For more information: https://gitlab.com/bsc-alya/alya/-/wikis/Documentation/Installation
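Putting the CMake steps together, a minimal end-to-end sketch (assuming a checkout of the open-alya repository; -DWITH_GPU=ON is only needed for GPU builds):

```
# Configure and build Alya with CMake (sketch)
cd open-alya
mkdir build && cd build
cmake ..                          # optionally: cmake .. -DWITH_GPU=ON
make -j 8
```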
## Mechanics of Running Benchmark
@@ -59,8 +57,8 @@ The parameters used in the datasets try to represent at best typical industrial
The different datasets are:
SPHERE_16.7M ... 16.7M sphere mesh
SPHERE_132M .... 132M sphere mesh
Test Case A: SPHERE_16.7M ... 16.7M sphere mesh
Test Case B: SPHERE_132M .... 132M sphere mesh
### How to execute Alya with a given dataset
......
###################################################################
# PGI CONFIGURE #
#POWER9 RECOMMENDED MODULE: #
#module load ompi/3.0.0 pgi/18.4 #
###################################################################
F77 = OMPI_FC=pgfortran mpif90
F90 = OMPI_FC=pgfortran mpif90
FCOCC = cc -c
FCFLAGS = -c -fast -Minfo=all -acc -ta=tesla:cuda10.1 -Mpreprocess -I./Objects_x/ -Mbackslash -Mextend -Mnoopenmp -Munroll -Mnoidiom -module $O
FPPFLAGS =
EXTRALIB = -lc
EXTRAINC =
fa2p = pgfortran -c -x f95-cpp-input -DMPI_OFF -J../../Utils/user/alya2pos -I../../Utils/user/alya2pos
fa2plk = pgfortran
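# NOTE: a sketch (not in the original file) of how this config.in is consumed
# by the legacy configure-based build described in the README above:
#   ./configure -x nastin parall
#   make metis4          # or: make metis5
#   make -j 8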
###################################################################
# PERFORMANCE FLAGS #
###################################################################
#MINIMUM
#FOPT = -O1
#MAXIMUM (I have eliminated -xHost due to observations by Yacine)
FOPT = -O3
#Compilation flags applied only to the Source/modules folder
#MODULEFLAGS = -ipo
# Uncomment the following line to enable NDIME as a parameter (OPTIMIZATION FOR 3D PROBLEMS)
CSALYA := $(CSALYA) -DNDIMEPAR -DOPENACCHHH -DSUPER_FAST -DDETAILED_TIMES
# Uncomment the following line for DEBUG AND CHECKING FLAGS
#CSALYA := $(CSALYA) -C -Ktrap=fp -Minform=inform
# Vectorization: put vector size (in principle=4 for MN)
CSALYA := $(CSALYA) -DVECTOR_SIZE=32768
###################################################################
# USER SPECIFIC FLAGS #
###################################################################
# HERBERT
#CSALYA := $(CSALYA) -DDETAILS_ORTHOMIN
###################################################################
# PROFILING FLAGS #
###################################################################
# Uncomment the following line to generate profiling info files
#CSALYA := $(CSALYA) -profile-loops=all -profile-loops-report=2
###################################################################
# EXTRAE FLAGS #
###################################################################
# Uncomment the following line to compile Alya using extrae
# Compiler used to compile extrae module (make extrae)
EXTRAE_FC=pgfortran
# Extrae installation directory (for linking) (not necessary if loading extrae using module load extrae)
#EXTRAE_HOME=
#@Linking with Extrae
#EXTRALIB := $(EXTRALIB) -L${EXTRAE_HOME}/lib/ -lmpitracef extrae_module.o
#@Enabling Extrae user calls (normal)
#CSALYA := $(CSALYA) -DEXTRAE
###################################################################
# METIS LIBRARY #
###################################################################
# Uncomment the following lines for using metis 4.0 (default)
EXTRALIB := $(EXTRALIB) -L../../Thirdparties/metis-4.0 -lmetis -acc -ta=tesla:cuda10.1
CSALYA := $(CSALYA) -DMETIS
# Uncomment the following lines for using metis 5.0
#CSALYA := $(CSALYA) -DV5METIS
# Uncomment FOR MAC
#EXTRALIB := $(EXTRALIB) -L../../Thirdparties/metis-5.0.2_i8/build/Darwin-i386/libmetis -lmetis
# Uncomment FOR LINUX
#EXTRALIB := $(EXTRALIB) -L../../Thirdparties/metis-5.0.2_i8/build/Linux-x86_64/libmetis -lmetis
# Uncomment the following lines for using parmetis
#CSALYA := $(CSALYA) -DPARMETIS
#EXTRALIB := $(EXTRALIB) -L../../Thirdparties/
###################################################################
# BLAS LIBRARY #
###################################################################
# Uncomment the following lines for using blas
#EXTRALIB := $(EXTRALIB) -lblas
#CSALYA := $(CSALYA) -DBLAS
###################################################################
# MPI/THREADS/TASKS #
###################################################################
# Uncomment the following lines for sequential (NOMPI) version
#VERSIONMPI = nompi
#CSALYA := $(CSALYA) -DMPI_OFF
# Uncomment the following lines for OPENMP version
#CSALYA := $(CSALYA) -openmp
#EXTRALIB := $(EXTRALIB) -openmp
# Uncomment the following line to disable OPENMP coloring
#CSALYA := $(CSALYA) -DNO_COLORING
# Uncomment the following line to enable OMPSS in order to activate
# loops with multidependencies whenever we have a race condition
#CSALYA := $(CSALYA) -DALYA_OMPSS
###################################################################
# DLB #
# To use DLB with OMPSS: Define -DALYA_DLB -DALYA_OMPSS #
# To use DLB barrier: Define -DALYA_DLB_BARRIER #
# If we compile with mercurium, there is no need to link #
# with DLB #
# #
###################################################################
#CSALYA := $(CSALYA) -DALYA_DLB
#FCFLAGS := $(FCFLAGS) --dlb
#FPPFLAGS := $(FPPFLAGS) --dlb
###################################################################
# INTEGER TYPE #
###################################################################
# Uncomment the following lines for 8 byte integers
#CSALYA := $(CSALYA) -DI8 -m64
# Uncomment the following line for 8 byte integers ONLY in Metis
# In this mode Alya can work in I4 and Metis in I8 (mixed mode)
# For this option the use of Metis v5 library is mandatory
#CSALYA := $(CSALYA) -DMETISI8
###################################################################
# HDF5 #
# #
#MANDATORY MODULES TO LOAD ON MN3 FOR 4 BYTE INTEGERS VERSION #
#module load HDF5/1.8.14 SZIP #
# #
#MANDATORY MODULES TO LOAD ON MN3 FOR 8 BYTE INTEGERS VERSION #
#module load SZIP/2.1 HDF5/1.8.15p1_static #
# #
#MANDATORY SERVICES TO COMPILE #
#hdfpos #
#MORE INFO:bsccase02.bsc.es/alya/tutorial/hdf5_output.html #
###################################################################
# Uncomment the following lines for using HDF5 IN 4 BYTE INTEGER VERSION
#CSALYA := $(CSALYA) -I/apps/HDF5/1.8.14/INTEL/IMPI/include
#EXTRALIB := $(EXTRALIB) /apps/HDF5/1.8.14/INTEL/IMPI/lib/libhdf5_fortran.a /apps/HDF5/1.8.14/INTEL/IMPI/lib/libhdf5.a
#EXTRALIB := $(EXTRALIB) /apps/HDF5/1.8.14/INTEL/IMPI/lib/libhdf5hl_fortran.a /apps/HDF5/1.8.14/INTEL/IMPI/lib/libhdf5_hl.a
#EXTRALIB := $(EXTRALIB) -L/apps/SZIP/2.1/lib -lsz -L/usr/lib64 -lz -lgpfs
# Uncomment the following lines for using HDF5 IN 8 BYTE INTEGER VERSION
#CSALYA := $(CSALYA) -I/apps/HDF5/1.8.15p1/static/include
#EXTRALIB := $(EXTRALIB) /gpfs/apps/MN3/HDF5/1.8.15p1/static/lib/libhdf5_fortran.a /gpfs/apps/MN3/HDF5/1.8.15p1/static/lib/libhdf5_f90cstub.a
#EXTRALIB := $(EXTRALIB) /gpfs/apps/MN3/HDF5/1.8.15p1/static/lib/libhdf5.a /gpfs/apps/MN3/HDF5/1.8.15p1/static/lib/libhdf5_tools.a
#EXTRALIB := $(EXTRALIB) -L/apps/SZIP/2.1/lib -lsz -L/usr/lib64 -lz -lgpfs
# latest Hdf5 (1.8.16) to be used with latest impi (5.1.2.150) and intel(16.0.1)
# there is module load HDF5/1.8.16-mpi for integers 4 but I have seen it is not necessary to call it
# Toni should clean this up - my recommendation is to only leave this last version
# moreover there seems to be no need for module load hdf5 (in the i8 case there exists no module load)
# Uncomment the following lines for using HDF5 IN 4 BYTE INTEGER VERSION
#CSALYA := $(CSALYA) -I/apps/HDF5/1.8.16/INTEL/IMPI/include
#EXTRALIB := $(EXTRALIB) /apps/HDF5/1.8.16/INTEL/IMPI/lib/libhdf5_fortran.a /apps/HDF5/1.8.16/INTEL/IMPI/lib/libhdf5hl_fortran.a
#EXTRALIB := $(EXTRALIB) /apps/HDF5/1.8.16/INTEL/IMPI/lib/libhdf5.a /apps/HDF5/1.8.16/INTEL/IMPI/lib/libhdf5_hl.a
#EXTRALIB := $(EXTRALIB) -L/apps/SZIP/2.1/lib -lsz -L/usr/lib64 -lz -lgpfs
# Uncomment the following lines for using HDF5 IN 8 BYTE INTEGER VERSION
#CSALYA := $(CSALYA) -I/apps/HDF5/1.8.16/INTEL/IMPI/int8/include
#EXTRALIB := $(EXTRALIB) /apps/HDF5/1.8.16/INTEL/IMPI/int8/lib/libhdf5_fortran.a /apps/HDF5/1.8.16/INTEL/IMPI/int8/lib/libhdf5hl_fortran.a
#EXTRALIB := $(EXTRALIB) /apps/HDF5/1.8.16/INTEL/IMPI/int8/lib/libhdf5.a /apps/HDF5/1.8.16/INTEL/IMPI/int8/lib/libhdf5_hl.a
#EXTRALIB := $(EXTRALIB) -L/apps/SZIP/2.1/lib -lsz -L/usr/lib64 -lz -lgpfs
###################################################################
# VTK #
# #
#MANDATORY THIRDPARTY COMPILATION #
#Go to Alya/Thirdparties/VTK #
# #
#MORE INFO:bsccase02.bsc.es/alya/tutorial/vtk_output.html #
###################################################################
# Uncomment the following lines for using VTK as output
#CSALYA := $(CSALYA) -DVTK
#EXTRALIB := $(EXTRALIB) ../../Thirdparties/VTK/vtkXMLWriterF.o -L/apps/VTK/6.1.0_patched/lib -lvtkIOXML-6.1 -lvtkIOGeometry-6.1
#EXTRALIB := $(EXTRALIB) -lvtkIOXMLParser-6.1 -lvtksys-6.1 -lvtkIOCore-6.1 -lvtkCommonExecutionModel-6.1 -lvtkCommonDataModel-6.1 -lvtkCommonMisc-6.1
#EXTRALIB := $(EXTRALIB) -lvtkCommonSystem-6.1 -lvtkCommonTransforms-6.1 -lvtkCommonMath-6.1 -lvtkCommonCore-6.1 -lvtkzlib-6.1
#EXTRALIB := $(EXTRALIB) -lvtkjsoncpp-6.1 -lvtkexpat-6.1 -L/gpfs/apps/MN3/INTEL/tbb/lib/intel64/ -ltbb
###################################################################
# NINJA #
#GPU based solvers : GMRES,DEFLATED_CG,CG #
#Specify solver in configuration as: #
#GMRES -------------> GGMR #
#CG ----------------> GCG #
#DEFLATED_CG -------> GDECG #
#GPU Multi Grid-----> GAMGX(Requires CONFIGURATION_FILE in solver)#
#export CUDA_HOME to CUDA version to be used #
###################################################################
# Uncomment the following lines to enable NINJA
#GPU_HOME := ${CUDA_HOME}/
#CSALYA := $(CSALYA) -DNINJA
#EXTRALIB := $(EXTRALIB) -L../../Thirdparties/ninja -L${GPU_HOME}lib64/ -L/lib64 -lninja -lcublas_static -lcublasLt -lcusparse_static -lculibos -lcudart -lpthread -lstdc++ -ldl -lcusolver
# NINJA also supports AMGX. Uncomment the following lines and set AMGX_HOME
#AMGX_HOME :=
#CSALYA := $(CSALYA) -DAMGX
#EXTRALIB := $(EXTRALIB) -L${AMGX_HOME}/lib -lamgxsh
###################################################################
# CATALYST #
# #
#MANDATORY THIRDPARTY COMPILATION #
#Go to Alya/Thirdparties/Catalyst #
# #
#MORE INFO:hadrien.calmet at bsc.es #
###################################################################
# Uncomment the following lines for using CATALYST as output
#CSALYA := $(CSALYA) -DVTK -DCATA -mt_mpi -I../../Thirdparties/Catalyst
#EXTRALIB := $(EXTRALIB) ../../Thirdparties/VTK/vtkXMLWriterF.o ../../Thirdparties/Catalyst/FEFortranAdaptor.o ../../Thirdparties/Catalyst/FECxxAdaptor.o
#EXTRALIB := $(EXTRALIB) -L/apps/PARAVIEW/4.2.0/lib/paraview-4.2/ -lvtkIOXML-pv4.2 -lvtkIOGeometry-pv4.2
#EXTRALIB := $(EXTRALIB) -lvtkIOXMLParser-pv4.2 -lvtksys-pv4.2 -lvtkIOCore-pv4.2 -lvtkCommonExecutionModel-pv4.2 -lvtkCommonDataModel-pv4.2
#EXTRALIB := $(EXTRALIB) -lvtkCommonMisc-pv4.2 -lvtkCommonSystem-pv4.2 -lvtkCommonTransforms-pv4.2 -lvtkCommonMath-pv4.2 -lvtkCommonCore-pv4.2
#EXTRALIB := $(EXTRALIB) -lvtkzlib-pv4.2 -lvtkjsoncpp-pv4.2 -lvtkexpat-pv4.2 -lvtkPVPythonCatalyst-pv4.2 -lvtkPVCatalyst-pv4.2 -lvtkPythonInterpreter-pv4.2
#EXTRALIB := $(EXTRALIB) -lvtkUtilitiesPythonInitializer-pv4.2 -lvtkPVServerManagerCore-pv4.2 -lvtkPVServerImplementationCore-pv4.2 -lvtkPVClientServerCoreCore-pv4.2
#EXTRALIB := $(EXTRALIB) -lvtkFiltersParallel-pv4.2 -lvtkFiltersModeling-pv4.2 -lvtkRenderingCore-pv4.2 -lvtkFiltersExtraction-pv4.2 -lvtkFiltersStatistics-pv4.2
#EXTRALIB := $(EXTRALIB) -lvtkImagingFourier-pv4.2 -lvtkImagingCore-pv4.2 -lvtkalglib-pv4.2 -lvtkFiltersGeometry-pv4.2 -lvtkFiltersSources-pv4.2
#EXTRALIB := $(EXTRALIB) -lvtkFiltersGeneral-pv4.2 -lvtkCommonComputationalGeometry-pv4.2 -lvtkFiltersProgrammable-pv4.2 -lvtkPVVTKExtensionsCore-pv4.2
#EXTRALIB := $(EXTRALIB) -lvtkPVCommon-pv4.2 -lvtkClientServer-pv4.2 -lvtkFiltersCore-pv4.2 -lvtkParallelMPI-pv4.2 -lvtkParallelCore-pv4.2 -lvtkIOLegacy-pv4.2
#EXTRALIB := $(EXTRALIB) -lprotobuf -lvtkWrappingPython27Core-pv4.2 -lvtkpugixml-pv4.2 -lvtkPVServerManagerApplication-pv4.2 -L/gpfs/apps/MN3/INTEL/tbb/lib/intel64/ -ltbb
###################################################################
# EoCoE Flags #
###################################################################
# Uncomment the following lines to output matrices #
#CSALYA := $(CSALYA) -Doutmateocoe
# Uncomment the following lines to solve with AGMG #
#MANDATORY MODULES TO LOAD ON MN3
#module load MKL/11.3
# serial version -- this is obsolete; for it to work you need to touch the ifdef in dagmg.f90
#CSALYA := $(CSALYA) -Dsolve_w_agmg -I${MKLROOT}/include
#EXTRALIB := $(EXTRALIB) -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_sequential
#
#
# parallel version academic
#CSALYA := $(CSALYA) -DPARAL_AGMG -DPARAL_AGMG_ACAD -I${MKLROOT}/include -I/gpfs/projects/bsc21/WORK-HERBERT/svnmn3/MUMPS_5.0.1/include
#For Mumps
#EXTRALIB := $(EXTRALIB) -L/gpfs/projects/bsc21/WORK-HERBERT/svnmn3/MUMPS_5.0.1/lib -ldmumps -lmumps_common -lpord
#For agmg
#EXTRALIB := $(EXTRALIB) -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lm -lmkl_blacs_intelmpi_ilp64 -lmkl_blacs_intelmpi_lp64 -lmkl_scalapack_ilp64 -lmkl_scalapack_lp64
#
# parallel version prof (copy dagmg_par.o from Thirdparties/agmg)
#CSALYA := $(CSALYA) -DPARAL_AGMG -Duse_dagmg_mumps -I${MKLROOT}/include
#For agmg
#EXTRALIB := $(EXTRALIB) -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lm -lmkl_blacs_intelmpi_ilp64 -lmkl_blacs_intelmpi_lp64 -lmkl_scalapack_ilp64 -lmkl_scalapack_lp64 -L../../Thirdparties/agmg/ -ldagmg_par
###################################################################
# SUNDIALS CVODE #
###################################################################