The parameters used in the datasets are chosen to represent typical industrial runs as closely as possible, so that the measured speedups are representative. For example, the iterative solvers
are never converged to machine accuracy, because the linear system is solved inside a non-linear loop.
The datasets represent the solution of the cavity flow at Re=100. A small mesh of 10M elements should be used for Tier-1 supercomputers while a 30M element mesh
is specifically designed to run on Tier-0 supercomputers.
However, the number of elements can be increased by using the mesh multiplication option in the file *.ker.dat (DIVISION=0,2,3...). The mesh multiplication is
carried out in parallel and the number of elements is multiplied by 8 at each of these levels. "0" means no mesh multiplication.
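
The effect of the factor of 8 per level on the mesh size can be estimated with a short calculation. The sketch below is illustrative only: it assumes that the DIVISION value equals the number of multiplication levels, and the function name is hypothetical (it is not part of Alya).

```python
def elements_after_multiplication(base_elements: int, division: int) -> int:
    """Estimate the element count after `division` mesh multiplication levels.

    Assumption: DIVISION is the number of levels, and each level
    multiplies the number of elements by 8 (0 means no multiplication).
    """
    if division < 0:
        raise ValueError("division must be non-negative")
    return base_elements * 8 ** division


# Example: the 10M element mesh at the first few levels.
for level in range(3):
    print(level, elements_after_multiplication(10_000_000, level))
```

The count grows quickly with the level, so the level should be chosen with the memory available per node in mind.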
The different datasets are:
cavity10_tetra ... 10M tetrahedra mesh
cavity30_tetra ... 30M tetrahedra mesh
SPHERE_16.7M ... 16.7M sphere mesh
SPHERE_132M ... 132M sphere mesh
How to execute Alya with a given dataset
----------------------------------------
...
...
@@ -20,30 +15,43 @@ How to execute Alya with a given dataset
In order to run ALYA, you need at least the following input files per execution:
X.dom.dat
X.typ.dat
X.geo.dat
X.bcs.dat
X.inflow_profile.bcs
X.ker.dat
X.nsi.dat
X.dat
In our case, there are 2 different inputs, so X={cavity10_tetra,cavity30_tetra}
To execute a simulation, you must be inside the input directory and you should submit a job like:
In our case X=sphere
mpirun Alya.x cavity10_tetra
or
mpirun Alya.x cavity30_tetra
To execute a simulation, you must be inside the input directory and you should submit a job like:
mpirun Alya.x sphere
How to measure the speedup
--------------------------
1. Open the fensap.nsi.cvg file
2. You will see ten rows, each corresponding to one simulation timestep
3. Go to the second row, which starts with the number 2
4. Take the last number in this row; it is the elapsed CPU time of this timestep
5. Use this value to measure the speedup
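
The steps above can be automated with a small script. This is a minimal sketch under stated assumptions: the rows of the .cvg file are whitespace-separated, the first field of each data row is the timestep number, and any line whose first field is not an integer is a header and can be skipped. The function names are hypothetical, and the speedup is taken as the ratio of a reference time to the measured time.

```python
def timestep_elapsed_time(cvg_text: str, timestep: int = 2) -> float:
    """Return the last number on the row whose first field equals `timestep`."""
    for line in cvg_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        try:
            first = int(fields[0])
        except ValueError:
            continue  # assumed to be a header or comment line
        if first == timestep:
            return float(fields[-1])
    raise ValueError(f"no row found for timestep {timestep}")


def speedup(t_reference: float, t_measured: float) -> float:
    """Speedup of a run relative to a reference run (same timestep)."""
    return t_reference / t_measured
```

A typical use is to read the .cvg file of each run with `open(path).read()`, extract the time of the second timestep from each, and pass the time of the smallest run as `t_reference`.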
There are several ways to measure the scalability of the Nastin module:
1. For the complete cycle including element assembly, boundary assembly, subgrid scale assembly, solvers, etc.
2. For single kernels: element assembly, boundary assembly, subgrid scale assembly, solvers
3. Using overall times
1. In the *.nsi.cvg file, column "30. Elapsed CPU time"
2. Single kernels. Here, average and maximum times are indicated in *.nsi.cvg at each iteration of each time step:
Element assembly: 19. Ass. ave cpu time 20. Ass. max cpu time
Boundary assembly: 33. Bou. ave cpu time 34. Bou. max cpu time
Subgrid scale assembly: 31. SGS ave cpu time 32. SGS max cpu time
Iterative solvers: 21. Sol. ave cpu time 22. Sol. max cpu time
3. At the end of the *.log file, total timings are shown for all modules. In this case we use the first value of the NASTIN MODULE.
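
The per-kernel columns listed above can be read from a data row of the *.nsi.cvg file as in the following sketch. It assumes that the column numbers are 1-based positions of whitespace-separated fields in each row, and it uses the maximum time of each kernel to compare two runs; both choices are assumptions made for illustration, and the names are hypothetical.

```python
# (average, maximum) column numbers as listed in the *.nsi.cvg header
KERNEL_COLUMNS = {
    "element assembly": (19, 20),
    "iterative solvers": (21, 22),
    "subgrid scale assembly": (31, 32),
    "boundary assembly": (33, 34),
}


def kernel_times(row: str) -> dict:
    """Return {kernel: (average time, maximum time)} for one data row."""
    fields = row.split()
    return {
        name: (float(fields[ave - 1]), float(fields[mx - 1]))
        for name, (ave, mx) in KERNEL_COLUMNS.items()
    }


def kernel_speedups(reference_row: str, measured_row: str) -> dict:
    """Per-kernel speedup based on the maximum time of each kernel."""
    ref = kernel_times(reference_row)
    cur = kernel_times(measured_row)
    return {name: ref[name][1] / cur[name][1] for name in KERNEL_COLUMNS}
```

Comparing the same iteration of the same timestep in both runs keeps the per-kernel numbers consistent with each other.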