Commit ec0bc6f5 authored by Andrew Emerson

README mods

parent fc23c18b
@@ -14,12 +14,14 @@
## 1. Introduction
### GPU version
The GPU port of Quantum Espresso is a version of the program which has been
completely re-written in CUDA FORTRAN by Filippo Spiga. The program version used in these
experiments is v6.0, even though further versions became available later during the
activity.
### 2. Build Requirements
For complete build requirements and information see the following GitHub site:
[QE-GPU](https://github.com/fspiga/qe-gpu)
@@ -38,7 +40,7 @@ Optional
with the distribution.
### 3. Downloading the software
Available from the web site given above. You can use, for example, ``git clone``
to download the software:
@@ -46,7 +48,7 @@ to download the software:
``` shell
git clone https://github.com/fspiga/qe-gpu.git
```
### 4. Compiling and installing the application
Check the __README.md__ file in the downloaded files since the
procedure varies from distribution to distribution.
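
As a rough illustration, a build on an x86-64 system with the PGI compiler might look like the sketch below; the compiler module name, the make.inc template name and the make target are assumptions, so always follow the instructions in the distribution's own README.

``` shell
# load a CUDA-aware PGI/MPI environment (module name is an assumption)
module load pgi

cd qe-gpu
# copy a make.inc template suited to your architecture and edit it
# (compiler flags, CUDA path, GPU compute capability, BLAS/LAPACK, MPI)
cp install/make.inc_x86-64 make.inc

# build the GPU-enabled PWscf code; the executable is installed as bin/pw-gpu.x
make -j 8 pw
```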
@@ -100,13 +102,18 @@ mpirun -n 16 pw-gpu.x -input pw.in
but check your system documentation since mpirun may be replaced by
`mpiexec, runjob, aprun, srun,` etc. Note also that normally you are not
allowed to run MPI programs interactively without using the
batch system.
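
For instance, on a SLURM-based machine the same run would typically be launched from within a batch job with srun instead of mpirun (a minimal sketch; the task count is a placeholder):

``` shell
# inside a batch job on a SLURM system, srun usually plays the role of mpirun
srun -n 16 pw-gpu.x -input pw.in > pw.out
```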
A couple of examples for PRACE systems are given in the next section.
### Hints for running the GPU version
The GPU port of Quantum Espresso runs almost entirely in GPU memory. This means that jobs are restricted
by the memory of the GPU device, normally 16-32 GB, regardless of the main node memory. Thus, unless many nodes are used, the user is likely to see job failures due to lack of memory, even for small datasets.
For example, on the CSCS Piz Daint supercomputer each node has only 1 NVIDIA Tesla P100 (16 GB), which means that you will need at least 4 nodes to run even the smallest dataset (AUSURF in the UEABS).
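
A minimal job script for such a 4-node run on Piz Daint might look like the sketch below; the module name, project account and input/output file names are assumptions and should be checked against the CSCS documentation and the QE-GPU README.

``` shell
#!/bin/bash
#SBATCH --job-name=ausurf
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1
#SBATCH --constraint=gpu
#SBATCH --time=01:00:00
#SBATCH --account=<accountno>

# hypothetical module name for the GPU environment -- check the site documentation
module load daint-gpu

# one MPI rank per node, each bound to the node's single P100
srun pw-gpu.x -input pw.in > pw.out
```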
## 6. Examples
We now give a build example and two run examples. Example job scripts for various supercomputer systems in PRACE are available in the repository.
### Computer System: DAVIDE P100 cluster, CINECA
@@ -128,20 +135,20 @@ haven't been substantially tested for Quantum Espresso (e.g. flat
mode) but significant differences in performance for most inputs are
not expected.
An example SLURM batch script for the A2 partition is given below:
``` shell
#!/bin/bash
#SBATCH -N2
#SBATCH --ntasks-per-node=34
#SBATCH -A <accountno>
#SBATCH -t 1:00:00

module purge
module load profile/knl
module load autoload qe/6.0_knl

export OMP_NUM_THREADS=4
export MKL_NUM_THREADS=${OMP_NUM_THREADS}
@@ -150,7 +157,7 @@ mpirun pw.x -npool 4 -input file.in > file.out
```
In the above, with the SLURM directives we have asked for 2 KNL nodes (each with 68 cores) in
cache/quadrant mode and 93 GB of main memory each. We are running QE in
hybrid mode using 34 MPI processes/node, each with 4 OpenMP
threads/process and distributing the k-points in 4 pools; the Intel
@@ -160,7 +167,7 @@ Note that this script needs to be submitted using the KNL scheduler as follows:
``` shell
module load env-knl
sbatch myjob
```