# Alya - Large Scale Computational Mechanics

Alya is a simulation code for high performance computational mechanics. Alya solves coupled multiphysics problems using high performance computing techniques for distributed and shared memory supercomputers, together with vectorization and optimization at the node level.

Homepage: https://www.bsc.es/research-development/research-areas/engineering-simulations/alya-high-performance-computational

Alya is available to collaborating projects, and a specific version is distributed as part of the PRACE Unified European Applications Benchmark Suite (http://www.prace-ri.eu/ueabs/#ALYA).

## Building Alya for GPU accelerators

The library currently supports four solvers: GMRES, Deflated Conjugate Gradient, Conjugate Gradient, and Pipelined Conjugate Gradient.
The only preconditioner supported at the moment is 'diagonal'.

Keywords to use the solvers:

```shell
NINJA GMRES               : GGMR
NINJA Deflated CG         : GDECG
NINJA CG                  : GCG
NINJA Pipelined CG        : GPCG

PRECONDITIONER            : DIAGONAL
```
All other options are the same as for the CPU-based solvers.

### GPGPU Building

This version was tested with the Intel Compilers 2017.1, bullxmpi-1.2.9.1 and NVIDIA CUDA 7.5. Ensure that the wrappers `mpif90` and `mpicc` point to the correct binaries and that `$CUDA_HOME` is set.
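
Before building, it can help to confirm that the toolchain is visible from the shell. The following is a minimal sketch and not part of the Alya distribution: the helper name `check_tool` is hypothetical, and it only checks that the names resolve on `PATH`, not that they point to the intended compiler or MPI versions.

```shell
# Hypothetical pre-build check: do the MPI wrappers resolve, and is CUDA_HOME set?
check_tool() {
  # Prints the resolved path of "$1"; returns 1 if it is not on PATH.
  if tool_path=$(command -v "$1"); then
    printf '%s -> %s\n' "$1" "$tool_path"
  else
    printf '%s not found on PATH\n' "$1" >&2
    return 1
  fi
}

missing=0
for tool in mpif90 mpicc; do
  check_tool "$tool" || missing=$((missing + 1))
done

if [ -n "${CUDA_HOME:-}" ]; then
  printf 'CUDA_HOME = %s\n' "$CUDA_HOME"
else
  printf 'CUDA_HOME is not set\n' >&2
  missing=$((missing + 1))
fi

printf 'problems found: %s\n' "$missing"
```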

Alya can be used with pure MPI or with hybrid MPI-OpenMP parallelism. The standard execution mode relies on MPI only.

 - Uncompress the source, then configure the Metis library that Alya depends on and the Alya build options:

```shell
   tar xvf  alya-prace-acc.tar.bz2
```

 - Edit the file `Alya/Thirdparties/metis-4.0/Makefile.in` to select the compiler and target platform. Uncomment the relevant lines and add optimization flags, e.g.:

```shell
  OPTFLAGS = -O3 -xCORE-AVX2
```

 - Then build Metis4:

```shell
  $ cd Alya/Executables/unix
  $ make metis4
```

 - Alya ships with several example configurations. Copy one, e.g. for the Intel compilers:

```shell
  $ cp configure.in/config_ifort.in config.in
```

 - Edit `config.in`:
  Add the corresponding platform optimization flags to `FCFLAGS`, e.g. 

```shell
  FCFLAGS  = -module $O -c -xCORE-AVX2
```
 - MPI: No changes to the configuration file are necessary. By default, Metis4 and 4-byte integers are used.
 - MPI-hybrid (with OpenMP): Uncomment the following lines to build the OpenMP version:

```shell
  # -qopenmp is the Intel compiler flag; use -fopenmp instead with the GCC compilers
  CSALYA     := $(CSALYA)   -qopenmp
  EXTRALIB   := $(EXTRALIB) -qopenmp
```
 - Configure and build Alya. Use `-x` for a release build or `-g` for a debug build; for a debug build, also uncomment the debug and checking flags in `config.in`:

```shell
 ./configure -x nastin parall
 make NINJA=1 -j num_processors
```
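
In the build command above, `num_processors` stands for the number of parallel build jobs. The following is a small sketch for deriving it from the number of online logical CPUs. The helper name `build_jobs` is hypothetical, and `getconf _NPROCESSORS_ONLN` is common on Linux but not guaranteed by POSIX, so the fallback to a single job is an assumption for systems where it is missing.

```shell
# Hypothetical helper: number of parallel build jobs, falling back to 1.
build_jobs() {
  n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
  case "$n" in
    ''|*[!0-9]*|0) n=1 ;;   # empty, non-numeric or zero: use a single job
  esac
  echo "$n"
}

# Print the resulting build command instead of running it.
echo "make NINJA=1 -j $(build_jobs)"
```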

### GPGPU Usage

Each problem needs a `GPUconfig.dat`. A sample is available at `Alya/Thirdparties/ninja` and needs to be copied to the work directory. A README file in the same location provides further information.

 - Extract the small one-node test case and configure it to use the GPU solvers:

```shell
 $ tar xvf cavity1_hexa_med.tar.bz2 && cd cavity1_hexa_med
 $ cp ../Alya/Thirdparties/ninja/GPUconfig.dat .
```

 - To use the GPU, you have to replace `GMRES` with `GGMR` and `DEFLATED_CG` with `GDECG`, both in `cavity1_hexa.nsi.dat`
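
The keyword replacement can be scripted with `sed`. This is a sketch under stated assumptions: the helper name `use_gpu_solvers` is hypothetical, the demo lines are made-up stand-ins rather than the real layout of `cavity1_hexa.nsi.dat`, and the substitution is a plain text replacement, so the result should be inspected before use.

```shell
# Hypothetical helper: write a GPU-solver copy of an input file, leaving the original untouched.
use_gpu_solvers() {
  # $1 = original input file, $2 = output file
  sed -e 's/DEFLATED_CG/GDECG/g' -e 's/GMRES/GGMR/g' "$1" > "$2"
}

# Demo on a throwaway file with invented content.
demo=$(mktemp)
printf 'SOLVER: GMRES\nSOLVER: DEFLATED_CG\n' > "$demo"
use_gpu_solvers "$demo" "$demo.gpu"
cat "$demo.gpu"
rm -f "$demo" "$demo.gpu"
```

For the test case, this would be called as `use_gpu_solvers cavity1_hexa.nsi.dat cavity1_hexa.nsi.dat.gpu`; after checking the differences with `diff`, the new file can replace the original.
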
 - Edit the job script to submit the calculation to the batch system. 

```shell
 # In job.sh, set the path to your Alya.x binary (compiled with MPI options), then submit:
 sbatch job.sh
```
 Alternatively, execute it directly:

```shell
OMP_NUM_THREADS=4 mpirun -np 16 Alya.x cavity1_hexa
```

<!--    Runtime on 16-core Xeon E5-2630 v3 @ 2.40GHz with 2 NVIDIA K80: ~1:30 min -->
<!--    Runtime on 16-core Xeon E5-2630 v3 @ 2.40GHz no GPU:            ~2:00 min -->


## Building Alya for Intel Xeon Phi Knights Landing (KNL)


The Xeon Phi processor version of Alya currently relies on compiler-assisted optimization for AVX-512. Porting of performance-critical kernels to the new assembly instructions is underway. There will not be a version for first-generation Xeon Phi Knights Corner coprocessors.

### KNL Building


This version was tested with the Intel Compilers 2017.1 and Intel MPI 2017.1. Ensure that the wrappers `mpif90` and `mpicc` point to the correct binaries.

Alya can be used with pure MPI or with hybrid MPI-OpenMP parallelism. The standard execution mode relies on MPI only.

 - Uncompress the source, then configure the Metis library that Alya depends on and the Alya build options:

```shell
   tar xvf  alya-prace-acc.tar.bz2
```

 - Edit the file `Alya/Thirdparties/metis-4.0/Makefile.in` to select the compiler and target platform. Uncomment the relevant lines and add optimization flags, e.g.:

```shell
  OPTFLAGS = -O3 -xMIC-AVX512
```

 - Then build Metis4:

```shell
  $ cd Alya/Executables/unix
  $ make metis4
```

 - Alya ships with several example configurations. Copy one, e.g. for the Intel compilers:

```shell
  $ cp configure.in/config_ifort.in config.in
```

 - Edit `config.in`:
  Add the corresponding platform optimization flags to `FCFLAGS`, e.g.

```shell
  FCFLAGS  = -module $O -c -xMIC-AVX512
```

 - MPI: No changes to the configuration file are necessary. By default, Metis4 and 4-byte integers are used.
 - MPI-hybrid (with OpenMP): Uncomment the following lines to build the OpenMP version:

```shell
  # -qopenmp is the Intel compiler flag; use -fopenmp instead with the GCC compilers
  CSALYA     := $(CSALYA)   -qopenmp
  EXTRALIB   := $(EXTRALIB) -qopenmp
```
 - Configure and build Alya. Use `-x` for a release build or `-g` for a debug build; for a debug build, also uncomment the debug and checking flags in `config.in`:

```shell
 ./configure -x nastin parall
 make -j num_processors
```

### KNL Usage

 - Extract the small one node test case.

```shell
 $ tar xvf cavity1_hexa_med.tar.bz2 && cd cavity1_hexa_med
 $ cp ../Alya/Thirdparties/ninja/GPUconfig.dat .
```

 - Edit the job script to submit the calculation to the batch system.

```shell
 # In job.sh, set the path to your Alya.x binary (compiled with MPI options), then submit:
 sbatch job.sh
```
 Alternatively, execute it directly:

```shell
OMP_NUM_THREADS=4 mpirun -np 16 Alya.x cavity1_hexa
```
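
The command above starts 16 MPI ranks with 4 OpenMP threads each, i.e. 64 threads in total. The following rough sketch compares that product with what the node offers. The helper name `total_threads` is hypothetical, and `getconf _NPROCESSORS_ONLN` (not guaranteed by POSIX) typically reports logical CPUs rather than physical cores.

```shell
# Hypothetical helper: total threads requested by a hybrid MPI-OpenMP run.
total_threads() {
  # $1 = MPI ranks, $2 = OpenMP threads per rank
  echo $(( $1 * $2 ))
}

ranks=16
threads=4
cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || cpus=unknown
echo "requested threads: $(total_threads "$ranks" "$threads"), online logical CPUs: $cpus"
```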

<!--    Runtime on 68-core Xeon Phi(TM) CPU 7250 1.40GHz: ~3:00 min -->


## Remarks


If the number of elements is too low for a scalability analysis, Alya's mesh multiplication technique can be used to enlarge the mesh. It is enabled through an input option in the `ker.dat` file that sets the number of mesh multiplication levels to apply (0 means no mesh multiplication). At each multiplication level, the number of elements is multiplied by 8, so a very large mesh can be obtained automatically in order to study the scalability of the code on different architectures. Note that the mesh multiplication is carried out in parallel and thus should not impact the duration of the simulation process.
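
Since every level multiplies the element count by 8, a mesh with N elements grows to N x 8^L elements after L levels. The sketch below tabulates this growth; the helper name `scaled_elements` and the starting size of 1,000,000 elements are illustrative assumptions, not properties of the provided test case.

```shell
# Hypothetical helper: element count after a number of mesh multiplication levels.
scaled_elements() {
  # $1 = elements in the original mesh, $2 = mesh multiplication levels
  n=$1
  level=0
  while [ "$level" -lt "$2" ]; do
    n=$((n * 8))
    level=$((level + 1))
  done
  echo "$n"
}

for levels in 0 1 2 3; do
  echo "levels=$levels elements=$(scaled_elements 1000000 "$levels")"
done
# levels=0 elements=1000000
# levels=1 elements=8000000
# levels=2 elements=64000000
# levels=3 elements=512000000
```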