Commit a961cabe authored by Andrew Emerson

README + sys files

parent 6c6a6778
#!/bin/bash
#SBATCH --nodes=50
#SBATCH --ntasks=650
#SBATCH --ntasks-per-node=13
#SBATCH --cpus-per-task=4
#SBATCH --output=ta205-out.%j
#SBATCH --error=ta205-err.%j
#SBATCH --mem=90GB
#SBATCH --time=01:30:00
#module load Intel IntelMPI imkl
module load intel-para/2018b-mt
QE_HOME=$HOME/q-e-qe-6.3
export OMP_NUM_THREADS=4
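# 650 MPI tasks = 50 nodes x 13 tasks/node, each task running 4 OpenMP threads;
# -npool 26 gives one pool per k point for the 26 k points of the Ta2O5 test case (25 MPI tasks per pool)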
srun -n 650 $QE_HOME/bin/pw.x -npool 26 -input Ta2O5-2x2xz-552.in
#!/bin/bash
#
# Example job script for AUSURF (2 k points)
#SBATCH --nodes=1
#SBATCH --ntasks=48
#SBATCH --ntasks-per-node=48
#SBATCH --output=mpi-out.%j
#SBATCH --error=mpi-err.%j
#SBATCH --time=01:00:00
#SBATCH --partition=batch
module load Intel IntelMPI imkl
QE_HOME=$HOME/q-e-qe-6.3
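# AUSURF has 2 k points, hence -npool 2; -ndiag 16 (a square number) sets the tasks used for subspace diagonalisation.
# Supply the input with -input <file> (or a stdin redirect), as in the other example scripts.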
srun $QE_HOME/bin/pw.x -npool 2 -ndiag 16
#!/bin/bash
#
# QuantumESPRESSO on Piz Daint: 4 nodes, 12 MPI tasks per node,
# 2 OpenMP threads per task using hyperthreading (--ntasks-per-core=2)
#
#SBATCH --job-name=espresso
#SBATCH --time=01:00:00
#SBATCH --nodes=4
#SBATCH --ntasks-per-core=2
#SBATCH --ntasks-per-node=12
#SBATCH --cpus-per-task=2
#SBATCH --constraint=gpu
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
ulimit -s unlimited
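# 4 nodes x 12 MPI tasks x 2 OpenMP threads; -npool 2 assumes the (generic) input.in contains at least 2 k points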
srun pw.x -npool 2 -in input.in
#!/bin/bash
# Load the following modules
#module swap PrgEnv-cray PrgEnv-pgi
# module load intel cudatoolkit
# and then install with the make.inc in the QE-GPU distribution
# Quantum Espresso in the United European Applications Benchmark Suite (UEABS)
## Document Author: A. Emerson (a.emerson@cineca.it), Cineca.
## Contents
1. Introduction
2. Installation and requirements
3. Downloading the software
4. Compiling and installing the application
5. Running the program - general procedure
6. UEABS test cases
7. References
## Introduction
Quantum Espresso is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials.
Full documentation is available from the project website [QuantumEspresso](https://www.quantum-espresso.org/).
In this README we give information relevant for its use in the UEABS.
### Standard CPU version
For the UEABS activity we have used mainly version v6.0 but later versions are now available.
completely re-written in CUDA FORTRAN by Filippo Spiga. The program version used in these
experiments is v6.0, even though further versions became available later during the
activity.
## Installation and requirements
### Standard
The Quantum Espresso source can be downloaded from the project's GitHub repository, [QE](https://github.com/QEF/q-e/tags). Requirements can be found on the website, but you will need a good FORTRAN and C compiler with an MPI library and, optionally (but highly recommended), an optimised linear algebra library.
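For example, the compiler, MPI library and linear algebra library can all be provided by environment modules, as in the example batch scripts that accompany this README (module names are site-specific):
```bash
# Intel Fortran/C compilers, Intel MPI and MKL, as loaded in the AUSURF example script
module load Intel IntelMPI imkl
```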
Optional
with the distribution.
## Downloading the software
### Standard
From the website, for example:
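(A minimal sketch; the archive URL and file name below are assumptions based on the `q-e-qe-6.3` directory used in the example job scripts, so check the tags page linked above for the release you actually want.)
```bash
# download and unpack the qe-6.3 tag from GitHub (hypothetical archive name)
wget https://github.com/QEF/q-e/archive/qe-6.3.tar.gz
tar -xzf qe-6.3.tar.gz
cd q-e-qe-6.3
```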
### GPU
For the GPU version, clone Filippo Spiga's repository to download the software:
```bash
git clone https://github.com/fspiga/qe-gpu.git
```
## Compiling and installing the application
### Standard installation
Installation is achieved by the usual ```configure, make, make install``` procedure.
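A minimal sketch of a standard build, assuming the module environment shown earlier and the `q-e-qe-6.3` source tree used in the job scripts (configure options are system-dependent and should be adapted):
```bash
cd $HOME/q-e-qe-6.3
./configure --enable-openmp   # configure tries to detect the compiler, MPI and maths libraries
make pw                       # builds the plane-wave code as bin/pw.x
```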
For the GPU version the build likewise finishes with `make pw`; the QE-GPU executable will appear in the directory `GPU/PW` and is called `pw-gpu.x`.
## Running the program - general procedure
Of course you need some input before you can run calculations. The
input files are of two types:
batch system.
### Parallelisation options
Quantum Espresso uses various levels of parallelisation, the most important being MPI parallelisation
over the *k points* available in the input system. This is achieved with the ```-npool``` program option.
Thus for the AUSURF input which has 2 k points we can run:
```bash
srun -n 64 pw.x -npool 2 -input pw.in
```
but we may wish to use fewer pools with more tasks per pool:
```bash
srun -n 52 pw.x -npool 13 -input pw.in
```
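Here the 52 MPI tasks are split into 13 pools of 52/13 = 4 tasks each; for an input such as TA2O5 with 26 k points, each pool then handles 2 k points.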
#### Use of ndiag
It is also possible to control the number of MPI tasks used in the diagonalisation of the
subspace Hamiltonian. This is done with the ```-ndiag``` parameter, which must be a square number.
For example, with the AUSURF input (2 k points) we can assign 4 processes to the Hamiltonian diagonalisation:
```bash
srun -n 64 pw.x -npool 2 -ndiag 4 -input pw.in
```
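Note that 4 satisfies the square-number requirement (2 x 2); the AUSURF example batch script earlier in this commit uses ```-ndiag 16``` (4 x 4).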
### Hints for running the GPU version
#### Memory limitations
The GPU port of Quantum Espresso runs almost entirely in the GPU memory. This means that jobs are restricted
by the memory of the GPU device, normally 16-32 GB, regardless of the main node memory. Thus, unless many nodes are used, the user is likely to see job failures due to lack of memory, even for small datasets.
For example, on the CSCS Piz Daint supercomputer each node has only one NVIDIA Tesla P100 (16 GB), which means that you will need at least 4 nodes to run even the smallest dataset (AUSURF in the UEABS).
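As an illustration only (module names, paths and the input file name below are assumptions), a minimal 4-node GPU job for the AUSURF case on a Piz Daint-like system might look like:
```bash
#!/bin/bash
#SBATCH --job-name=qe-gpu
#SBATCH --time=01:00:00
#SBATCH --nodes=4                 # minimum for AUSURF with 16 GB of GPU memory per node (see above)
#SBATCH --ntasks-per-node=1       # one MPI task per node, i.e. one per GPU
#SBATCH --constraint=gpu

# load a CUDA-capable environment here (site-specific module names omitted)
QE_GPU_HOME=$HOME/qe-gpu          # assumed location of the QE-GPU build
srun $QE_GPU_HOME/GPU/PW/pw-gpu.x -npool 2 -input ausurf.in
```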
Please check the Cineca documentation for information on using the [Marconi KNL partition](https://wiki.u-gov.it/confluence/display/SCAIUS/UG3.1%3A+MARCONI+UserGuide#UG3.1:MARCONIUserGuide-SystemArchitecture).
## UEABS test cases
| UEABS name | QE name | Description | k-points | Notes|
|------------|---------------|-------------|----------|------|
| Small test case | AUSURF | 112 atoms | 2 | < 4-8 nodes on most systems |
| Medium test case | TA2O5 | Tantalum oxide| 26| Medium scaling, often 20 nodes |
| Large test case | CNT | Carbon nanotube | | Large scaling runs only. Memory and time requirements high|
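The example batch scripts included with this README follow this table: ```-npool 2``` for AUSURF (2 k points) and ```-npool 26``` for TA2O5 (26 k points).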
## 7. References
1. QE-GPU build and download instructions, https://github.com/QEF/qe-gpu-plugin.
Last updated: 14-January-2019