diff --git a/alya/ALYA_Build_README.txt b/alya/ALYA_Build_README.txt
index 71aa30f0420dfdcf17125a7c332794d2d2509fd4..bd971b53b3abf7fcabc62a991e71da3fe0c51d52 100644
--- a/alya/ALYA_Build_README.txt
+++ b/alya/ALYA_Build_README.txt
@@ -1,8 +1,8 @@
 In order to build ALYA (Alya.x), please follow these steps:
 
-- Go to: Thirdparties/metis-4.0 and build the Metis library (libmetis.a) using 'make'
 - Go to the directory: Executables/unix
-- Adapt the file: configure-marenostrum-mpi.txt to your own MPI wrappers and paths
+- Build the Metis library (libmetis.a) using "make metis4"
+- Adapt the file: configure.in to your own MPI wrappers and paths (examples in the configure.in folder)
 - Execute:
-  ./configure -x -f=configure-marenostrum-mpi.txt nastin parall
+  ./configure -x nastin parall
   make
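For quick reference, the updated build steps above amount to the short shell sequence below. This is a minimal sketch, assuming the Alya sources are already checked out and that working MPI compiler wrappers are on the PATH; the configure.in edits themselves are site-specific.

```shell
# Minimal build sketch following the updated README (MPI wrappers assumed on PATH).
cd Executables/unix

# Build the bundled Metis library (libmetis.a) first.
make metis4

# After adapting configure.in to the local MPI wrappers and paths,
# configure the nastin and parall modules and build Alya.x.
./configure -x nastin parall
make
```

The example configure files shipped in the configure.in folder can serve as starting points for the site-specific settings.
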
diff --git a/alya/ALYA_Run_README.txt b/alya/ALYA_Run_README.txt
index 20c2ae7deb9ef34f1a60273f9fa4e7283317d1ed..d5754d8ca82c05e45220997b7def401a44980e27 100644
--- a/alya/ALYA_Run_README.txt
+++ b/alya/ALYA_Run_README.txt
@@ -4,15 +4,10 @@ Data sets
 The parameters used in the datasets try to represent at best typical industrial runs in order to obtain representative speedups.
 For example, the iterative solvers are never converged to machine accuracy, as the system solution is inside a non-linear loop.
 
-The datasets represent the solution of the cavity flow at Re=100. A small mesh of 10M elements should be used for Tier-1 supercomputers while a 30M element mesh
-is specifically designed to run on Tier-0 supercomputers.
-However, the number of elements can be multiplied by using the mesh multiplication option in the file *.ker.dat (DIVISION=0,2,3...). The mesh multiplication is
-carried out in parallel and the numebr of elements is multiplied by 8 at each of these levels. "0" means no mesh multiplication.
-
 The different datasets are:
 
-cavity10_tetra ... 10M tetrahedra mesh
-cavity30_tetra ... 30M tetrahedra mesh
+SPHERE_16.7M ... 16.7M sphere mesh
+SPHERE_132M ... 132M sphere mesh
 
 How to execute Alya with a given dataset
 ----------------------------------------
@@ -20,30 +15,43 @@ How to execute Alya with a given dataset
 In order to run ALYA, you need at least the following input files per execution:
 
 X.dom.dat
-X.typ.dat
-X.geo.dat
-X.bcs.dat
-X.inflow_profile.bcs
 X.ker.dat
 X.nsi.dat
 X.dat
 
-In our case, there are 2 different inputs, so X={cavity10_tetra,cavity30_tetra}
-To execute a simulation, you must be inside the input directory and you should submit a job like:
+In our case, X=sphere
 
-mpirun Alya.x cavity10_tetra
-or
-mpirun Alya.x cavity30_tetra
+To execute a simulation, you must be inside the input directory and you should submit a job like:
+mpirun Alya.x sphere
 
 How to measure the speedup
 --------------------------
 
-1. Edit the fensap.nsi.cvg file
-2. You will see ten rows, each one corresponds to one simulation timestep
-3. Go to the second row, it starts with a number 2
-4. Get the last number of this row, that corresponds to the elapsed CPU time of this timestep
-5. Use this value in order to measure the speedup
+There are several ways to measure the scalability of the Nastin module:
+
+1. For the complete cycle, including element assembly, boundary assembly, subgrid scale assembly, solvers, etc.
+
+2. For single kernels: element assembly, boundary assembly, subgrid scale assembly, solvers
+
+3. Using overall times
+
+
+1. Complete cycle: in the *.nsi.cvg file, use column "30. Elapsed CPU time"
+
+
+2. Single kernels: average and maximum times are reported in *.nsi.cvg at each iteration of each time step:
+
+Element assembly: columns "19. Ass. ave cpu time" and "20. Ass. max cpu time"
+
+Boundary assembly: columns "33. Bou. ave cpu time" and "34. Bou. max cpu time"
+
+Subgrid scale assembly: columns "31. SGS ave cpu time" and "32. SGS max cpu time"
+
+Iterative solvers: columns "21. Sol. ave cpu time" and "22. Sol. max cpu time"
+
+
+3. Overall times: at the end of the *.log file, total timings are shown for all modules. In this case, use the first value of the NASTIN MODULE.
 
 
 Contact
 -------
diff --git a/alya/README_ACC.md b/alya/README_ACC.md
index 868598aeca27586d2c57c94c928dadd81744584f..d6ad50784bf4f3abf0269ec8c1c784f123162962 100644
--- a/alya/README_ACC.md
+++ b/alya/README_ACC.md
@@ -160,29 +160,6 @@ Alya can be used with just MPI or hybrid MPI-OpenMP parallelism. Standard execut
 make -j num_processors
 ```
 
-### KNL Usage
-
- - Extract the small one node test case.
-
-```shell
- $ tar xvf cavity1_hexa_med.tar.bz2 && cd cavity1_hexa_med
- $ cp ../Alya/Thirdparties/ninja/GPUconfig.dat .
-```
-
- - Edit the job script to submit the calculation to the batch system.
-
-```shell
- job.sh: Modify the path where you have your Alya.x (compiled with MPI options)
- sbatch job.sh
-```
- Alternatively execute directly:
-
-```shell
-OMP_NUM_THREADS=4 mpirun -np 16 Alya.x cavity1_hexa
-```
-
-
-
 ## Remarks
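
As a worked example of the speedup measurement described in ALYA_Run_README.txt, the snippet below runs the sphere case and sums the "30. Elapsed CPU time" column of the Nastin convergence file. This is a hedged sketch: the core count and job launcher are illustrative, and it assumes sphere.nsi.cvg is whitespace-separated with header lines that do not start with a number.

```shell
# Run the sphere dataset from inside its input directory (core count is illustrative).
mpirun -np 256 Alya.x sphere

# Sum column 30 ("Elapsed CPU time") over all time steps; lines whose first
# field is not a plain number (headers/comments) are skipped.
awk '$1 ~ /^[0-9]+$/ { total += $30 } END { printf "Total elapsed CPU time: %g s\n", total }' sphere.nsi.cvg
```

Repeating the run at several core counts and dividing the reference total by each measured total gives the speedup; the same pattern applies to the per-kernel columns (19/20, 33/34, 31/32, 21/22) when only a single kernel's scalability is of interest.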