## Contents
1. Introduction
2. Installation and requirements
3. Downloading the software
4. Compiling and installing the application
5. Running the program
6. Examples
7. References
## 1. Introduction
Quantum Espresso is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials.
### Standard CPU version
For the UEABS activity we have mainly used version 6.0, but later versions are now available.
### GPU version
The GPU port of Quantum Espresso is a version of the program which has been
completely re-written in CUDA FORTRAN by Filippo Spiga. The version used in these
experiments is v6.0, although further versions became available later during the
activity.
## 2. Installation and requirements
### Standard
The Quantum Espresso source can be downloaded from the project's GitHub repository: [QE](https://github.com/QEF/q-e/tags). Requirements can be found on the website, but you will need a good Fortran and C compiler with an MPI library and, optionally (but highly recommended), an optimised linear algebra library.
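As a quick sanity check before configuring, you can verify that suitable compilers and MPI wrappers are visible in your environment. The Intel tool names below are only an example; use whatever toolchain your site provides:
```bash
# check that the compilers and MPI wrapper are on the PATH (Intel names assumed)
which ifort icc mpiifort
# print the compiler version used by the MPI wrapper
mpiifort -v
```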
### GPU version
For complete build requirements and information see the following GitHub site:
[QE-GPU](https://github.com/fspiga/qe-gpu)
A short summary is given below:
with the distribution.
## 3. Downloading the software
### Standard
Download a release archive from the website, for example:
```bash
wget https://github.com/QEF/q-e/releases/download/qe-6.3/qe-6.3.tar.gz
```
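After downloading, unpack the archive and move into the source directory (the directory name is assumed to follow the release version):
```bash
tar zxvf qe-6.3.tar.gz
cd qe-6.3
```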
### GPU
Available from the GitHub site given above. You can use, for example, ```git clone```
to download the software:
```bash
git clone https://github.com/fspiga/qe-gpu.git
```
## 4. Compiling and installing the application
### Standard installation
Installation is achieved by the usual ```configure, make, make install``` procedure.
However, it is recommended that the user check the __make.inc__ file created by this procedure before running make.
For example, using the Intel compilers,
```bash
module load intel intelmpi
CC=icc FC=ifort MPIF90=mpiifort ./configure --enable-openmp --with-scalapack=intel
```
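Before building, it can be useful to check the compiler flags and linear algebra libraries that ```configure``` wrote to __make.inc__, for example (variable names as used in recent QE releases):
```bash
# show the preprocessor flags and linear algebra settings chosen by configure
grep -E "^(DFLAGS|BLAS_LIBS|LAPACK_LIBS|SCALAPACK_LIBS)" make.inc
```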
Assuming the __make.inc__ file is acceptable, the user can then do:
```bash
make; make install
```
### GPU
Check the __README.md__ file in the downloaded files since the
procedure varies from distribution to distribution.
Most distributions do not have a ```configure``` command. Instead you copy a __make.inc__
file suitable for your system, edit it as needed, and then build the code with ```make pw```.
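A minimal sketch of this procedure, assuming a distribution that ships ready-made __make.inc__ templates in an install/ directory (the template file name below is only illustrative):
```bash
# copy a template make.inc into the top-level directory and adapt it to your compilers and CUDA installation
cp install/make.inc_x86-64 make.inc
# edit make.inc as required, then build the PW package
make pw
```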
The QE-GPU executable will appear in the directory `GPU/PW` and is called `pw-gpu.x`.
## 5. Running the program - general procedure
Of course you need some input before you can run calculations. The
input files are of two types:
Note also that on many systems you are not allowed to run MPI programs interactively without using the
batch system.
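As a generic illustration only (the launcher, task count and input file name depend on the system and dataset), a simple parallel run might look like:
```bash
# launch pw.x on 16 MPI tasks; replace mpirun with srun or your system's launcher as appropriate
mpirun -n 16 pw.x -input pw.in
```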
### Parallelisation options
Quantum Espresso uses various levels of parallelisation, the most important being MPI parallelisation
over the k-points available in the input system. This is achieved with the ```-npool``` program option.
Thus for the AUSURF input, which has 2 k-points, we can run:
```bash
srun -n 64 pw.x -npool 2 -input pw.in
```
which would allocate 32 MPI tasks per k-point.
The number of MPI tasks must be a multiple of the number of pools, and the number of pools should not exceed the number of k-points. For the TA2O5 input, which has 26 k-points, we could try:
```bash
srun -n 52 pw.x -npool 26 -input pw.in
```
but we may wish to use fewer pools with more tasks per pool:
```bash
srun -n 52 pw.x -npool 13 -input pw.in
```
#### Use of ndiag
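The ```-ndiag``` option controls how many MPI tasks take part in the parallel (ScaLAPACK) diagonalisation of the subspace Hamiltonian; it should normally be a square number no larger than the number of tasks in a pool. As an illustrative example only (the value 16 is not a tuned recommendation):
```bash
# 2 pools of 32 tasks, with 16 tasks per pool used for the parallel diagonalisation
srun -n 64 pw.x -npool 2 -ndiag 16 -input pw.in
```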
### Hints for running the GPU version
#### Memory
The GPU port of Quantum Espresso runs almost entirely in the GPU memory. This means that jobs are restricted
by the memory of the GPU device, normally 16-32 GB, regardless of the main node memory. Thus, unless many nodes are used, the user is likely to see job failures due to lack of memory, even for small datasets.
For example, on the CSCS Piz Daint supercomputer each node has only 1 NVIDIA Tesla P100 (16 GB), which means that you will need at least 4 nodes to run even the smallest dataset (AUSURF in the UEABS).
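A minimal batch-script sketch for such a run, assuming a SLURM system with one GPU and one MPI task per node (partition, module names and walltime are placeholders to adapt to your site):
```bash
#!/bin/bash
#SBATCH --nodes=4              # at least 4 nodes for AUSURF on 16 GB GPUs
#SBATCH --ntasks-per-node=1    # one MPI task per GPU
#SBATCH --gres=gpu:1
#SBATCH --time=01:00:00

# load the toolchain used to build pw-gpu.x (module names are placeholders)
module load cuda

srun pw-gpu.x -input pw.in
```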
## 6. Examples
Example job scripts for various supercomputer systems in PRACE are available in the repository.
## Execution
In the UEABS repository you will find a directory for each computer system tested, together with installation
instructions and job scripts.
In the following we describe in detail the execution procedure for the Marconi computer system.
### Execution on the Cineca Marconi KNL system
Quantum Espresso has already been installed for the KNL nodes of
Marconi and can be accessed via a specific module: