Quantum Espresso is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials.
### Standard CPU version
For the UEABS activity we have mainly used version v6.0, although later versions are now available.
### GPU version
The GPU port of Quantum Espresso is a version of the program which has been
completely re-written in CUDA FORTRAN by Filippo Spiga. The program version used in these
experiments is v6.0, even though further versions became available later during the
activity.
## 2. Installation and requirements
### Standard
The Quantum Espresso source can be downloaded from the project's GitHub repository: [QE](https://github.com/QEF/q-e/tags). The requirements are listed on the website, but you will need a good FORTRAN and C compiler with an MPI library and, optionally (but highly recommended), an optimised linear algebra library.
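One way to obtain a specific release is to clone the repository and check out the corresponding tag; a sketch is shown below (the tag name is an assumption, so list the available tags first):
```bash
# Clone the Quantum Espresso sources and switch to a release tag.
# The tag name "qe-6.0.0" is an assumption; run "git tag" to see the exact names.
git clone https://github.com/QEF/q-e.git
cd q-e
git checkout qe-6.0.0
```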
### GPU version
For complete build requirements and information see the QE-GPU GitHub site.
For the standard version, assuming the __make.inc__ file generated by the ```configure``` step is acceptable, the user can then do:
```bash
make; make install
```
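If ```configure``` has not yet been run, a typical invocation for a parallel build is sketched below; the compiler variables are assumptions and should match the MPI wrappers available on your system:
```bash
# Generate make.inc for a parallel build; MPIF90/CC are examples only and
# should be set to the MPI compiler wrappers provided by your system.
./configure --enable-parallel MPIF90=mpif90 CC=mpicc
```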
### GPU
Check the __README.md__ file in the downloaded files since the
procedure varies from distribution to distribution.
Most distributions do not have a ```configure``` command. Instead you copy a __make.inc__
file suited to your system and edit it by hand before building the `pw` code with `make pw`.
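A sketch of the usual procedure is given below; the template file name is an assumption, so check the distribution's __README.md__ for the exact one:
```bash
# Copy a make.inc template shipped with the GPU distribution, adapt it to
# your compilers and CUDA toolkit, then build the plane-wave code.
cp install/make.inc_x86-64 make.inc   # template name is an assumption
# (edit make.inc here)
make pw
```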
The QE-GPU executable will appear in the directory `GPU/PW` and is called `pw-gpu.x`.
## 5. Running the program
### Running the program - general procedure
Of course you need some input before you can run calculations. The
input files are of two types:
1. A control file, usually called __pw.in__, containing the parameters of the calculation.
2. One or more pseudopotential files (extension __.UPF__) for the atomic species involved.

The `pw.x` executable is normally started through the system's MPI launcher,
but check your system documentation since ```mpirun``` may be replaced by
another command (for example ```srun```), and check whether you are
allowed to run MPI programs interactively without using the
batch system.
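A minimal launch might look like the following sketch; the launcher name and task count are placeholders to be adapted to your system:
```bash
# Generic launch of the plane-wave code; replace mpirun and the task count
# with whatever your batch environment requires.
mpirun -n 16 pw.x -input pw.in
```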
A couple of examples for PRACE systems are given in the next section.
### Parallelisation options
Quantum Espresso uses various levels of parallelisation, the most important being MPI parallelisation
over the k points available in the input system. This is achieved with the ```-npool``` program option.
Thus for the AUSURF input which has 2 k points we can run:
```bash
srun -n 64 pw.x -npool 2 -input pw.in
```
which would allocate 32 MPI tasks per k-point.
The number of MPI tasks must be a multiple of the number of pools, and for good load balancing the number of k-points should be divisible by the number of pools. For the TA2O5 input, which has 26 k-points, we could try:
```bash
srun -n 52 pw.x -npool 26 -input pw.in
```
but we may prefer to use fewer pools with more tasks per pool:
```bash
srun -n 52 pw.x -npool 13 -input pw.in
```
#### Use of ndiag
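The ```-ndiag``` option sets how many MPI tasks take part in the parallel subspace diagonalisation; it is normally chosen as a square number no larger than the number of tasks in a pool. A sketch, with illustrative task counts only:
```bash
# 64 tasks split into 2 pools of 32; a 4x4 grid of 16 tasks per pool
# performs the subspace diagonalisation. Numbers are illustrative only.
srun -n 64 pw.x -npool 2 -ndiag 16 -input pw.in
```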
### Hints for running the GPU version
#### Memory
The GPU port of Quantum Espresso runs almost entirely in the GPU memory. This means that jobs are restricted
by the memory of the GPU device, normally 16-32 GB, regardless of the main node memory. Thus, unless many nodes are used, the user is likely to see job failures due to lack of memory, even for small datasets.
For example, on the CSCS Piz Daint supercomputer each node has only 1 NVIDIA Tesla P100 (16 GB), which means that you will need at least 4 nodes to run even the smallest dataset (AUSURF in the UEABS).
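As an illustration, a SLURM batch script for a 4-node GPU run might look like the sketch below; the module name, constraint and walltime are assumptions that must be adapted to the actual system:
```bash
#!/bin/bash
#SBATCH --job-name=ausurf-gpu
#SBATCH --nodes=4              # at least 4 x 16 GB P100 cards for AUSURF (see above)
#SBATCH --ntasks-per-node=1    # one MPI task per GPU (assumption)
#SBATCH --constraint=gpu       # GPU partition selector; site-specific assumption
#SBATCH --time=01:00:00

module load daint-gpu          # assumption: module names are site-specific
srun pw-gpu.x -npool 2 -input pw.in
```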
## 6. Examples
Example job scripts for various supercomputer systems in PRACE are available in the repository.
In the UEABS repository you will find a directory for each computer system tested (for example the DAVIDE P100 cluster at Cineca), together with installation
instructions and job scripts.
In the following we describe in detail the execution procedure for the Marconi computer system.
### Execution on the Cineca Marconi KNL system
Quantum Espresso has already been installed for the KNL nodes of
Marconi and can be accessed via a specific module:
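The exact module names change over time, but a load sequence of the following form is typical at Cineca (the names below are assumptions; check them with ```module avail```):
```bash
# Illustrative only: confirm the current profile and module names with
# "module avail qe" before use.
module load profile/knl
module load autoload qe/6.0
```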