Commit 64814b29 authored by Kurt Lust (parent 6cff3585)

Updated the GPAW README, more along the lines of the CORAL equivalents.
# GPAW - A Projected Augmented Wave code

## Summary version

0.1

## Purpose of the benchmark
[GPAW](https://wiki.fysik.dtu.dk/gpaw/) is a density-functional theory (DFT)
program for ab initio electronic structure calculations using the projector
augmented wave method. It uses a uniform real-space grid representation of the
electronic wavefunctions that allows for excellent computational scalability
and systematic convergence properties.

The GPAW benchmark tests MPI parallelization and the quality of the provided
mathematical libraries, including BLAS, LAPACK, ScaLAPACK, and an
FFTW-compatible library. There is also a CUDA-based implementation for GPU
systems.
## Characteristics of the benchmark

GPAW is written mostly in Python, but it also includes computational kernels
written in C and leverages external libraries such as NumPy, BLAS and
ScaLAPACK. Parallelisation is based on message passing using MPI, with no
support for multithreading.
Note that GPAW version numbering changed in 2019. Version 1.5.3 is the
last version with the old numbering. In 2019 the development team switched
to a version numbering scheme based on year, month and patchlevel, e.g.,
19.8.1 for the second version released in August 2019.
There have been various developments for GPGPUs and MICs in the past
using either CUDA or pyMIC/libxstream. Many of those branches see no
development anymore. The relevant CUDA version for this benchmark is available in a
[separate GitLab for CUDA development, cuda branch](https://gitlab.com/mlouhivu/gpaw/tree/cuda).
This version corresponds to the Aalto version mentioned on the
[GPU page of the GPAW Wiki](https://wiki.fysik.dtu.dk/gpaw/devel/projects/gpu.html).
As of early 2020, that version seems to be derived from the 1.5.2 CPU version
(at least, I could find a commit that claims to merge the 1.5.2 code).
There is currently no active support for non-CUDA accelerator platforms.
For the UEABS benchmark version 2.2, the following versions of GPAW were tested:
* CPU-based:
  * Version 1.5.3, as this one is the last of the 1.5 branch and the GPU version
    is derived from 1.5.2.
  * Version 20.1.0, the most recent version during the development of the UEABS
    2.2 benchmark suite.
* GPU-based: There is no official release or version number. The UEABS 2.2 benchmark
  suite was tested using commit TODO of
  [the cuda branch of the GitLab for CUDA development](https://gitlab.com/mlouhivu/gpaw/tree/cuda).
There are three benchmark cases, denoted S, M and L.
### Case S: Carbon nanotube

A ground state calculation for a carbon nanotube in vacuum. By default uses a
6-6-10 nanotube with 240 atoms (freely adjustable) and serial LAPACK, with an
option to use ScaLAPACK. Expected to scale up to 10 nodes and/or 100 MPI
tasks. This benchmark runs fast: expect execution times of around 1 minute
on 100 cores of a modern x86 cluster.

Input file: [benchmark/carbon-nanotube/input.py](benchmark/carbon-nanotube/input.py)
### Case M: Copper filament

A ground state calculation for a copper filament in vacuum. By default uses a
2x2x3 FCC lattice with 71 atoms (freely adjustable) and ScaLAPACK for
parallelisation. Expected to scale up to 100 nodes and/or 1000 MPI tasks.

Input file: [benchmark/copper-filament/input.py](benchmark/copper-filament/input.py)
### Case L: Silicon cluster

A ground state calculation for a silicon cluster in vacuum. By default the
cluster has a radius of 15 Å (freely adjustable) and consists of 702 atoms,
and ScaLAPACK is used for parallelisation. Expected to scale up to 1000 nodes
and/or 10000 MPI tasks.

Input file: [benchmark/silicon-cluster/input.py](benchmark/silicon-cluster/input.py)
## Mechanics of building the benchmark
A further major change, affecting both the build process and the mechanics of running
the benchmark, happened in version 20.1.0. Versions up to and including 19.8.1 use a
wrapper executable `gpaw-python` that replaces the Python interpreter (it internally
links to the libpython library) and provides the MPI functionality. From version 20.1.0
on, the standard Python interpreter is used and the MPI functionality is included in the
`_gpaw.so` shared library, though the build process still offers an option (not tested for
the UEABS benchmarks) to generate that wrapper instead.
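A quick way to tell which launch mechanism a given installation provides is to look for the wrapper on the `PATH`. This is only a sketch; it assumes GPAW's executables are already on the `PATH`:

```shell
# Pre-20.1.0 builds install a gpaw-python wrapper executable; 20.1.0+ builds
# normally do not and are launched through the standard Python interpreter.
if command -v gpaw-python >/dev/null 2>&1; then
    echo "wrapper build: launch as 'srun gpaw-python input.py'"
else
    echo "library build: launch as 'srun gpaw python input.py'"
fi
```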
### List of dependencies
GPAW is Python code (3.5 or newer), but it also contains C code for some
performance-critical parts and to interface to a number of libraries on which
it depends. Hence GPAW has the following requirements:
* C compiler with MPI support
* BLAS, LAPACK, BLACS and ScaLAPACK. ScaLAPACK is optional for GPAW, but mandatory
for the UEABS benchmarks. It is used by the medium and large cases and optional
for the small case.
* Python 3.5 or newer
* Mandatory Python packages:
  * [NumPy](https://pypi.org/project/numpy/) 1.9 or later (for GPAW 19.8.1/20.1.0)
  * [SciPy](https://pypi.org/project/scipy/) 0.14 or later (for GPAW 19.8.1/20.1.0)
* [FFTW](http://www.fftw.org) is highly recommended. As long as the optional libvdwxc
component is not used, the MKL FFTW wrappers can also be used. Recent versions of
GPAW can even show good performance using just the NumPy-provided FFT routines provided
that NumPy has been built with a highly optimized FFT library.
* [LibXC](https://www.tddft.org/programs/libxc/) 3.X or 4.X. LibXC is a library
  of exchange-correlation functionals for density-functional theory.
* [ASE, Atomic Simulation Environment](https://wiki.fysik.dtu.dk/ase/), a Python package
  from the same group that develops GPAW.
  * Check the release notes of GPAW, as the releases of ASE and GPAW should match.
    E.g., during the development of the UEABS version 2.2 benchmark suite,
    version 20.1.0 was the most up-to-date release of GPAW, with 3.19.1 the matching
    ASE version (though 3.18.0 should also work).
  * ASE has some optional dependencies that are not needed for the benchmarking:
    Matplotlib (2.0.0 or newer), tkinter (Tk interface, part of the standard Python
    library) and Flask.
* Optional components of GPAW that are not used by the UEABS benchmarks:
  * [libvdwxc](https://gitlab.com/libvdwxc/libvdwxc), a portable C library
    of density functionals with van der Waals interactions for density functional
    theory. This library does not work with the MKL FFTW wrappers.
  * [ELPA](https://elpa.mpcdf.mpg.de/),
    which should improve performance for large systems when GPAW is used in
    [LCAO mode](https://wiki.fysik.dtu.dk/gpaw/documentation/lcao/lcao.html).
In addition, the GPU version needs:
* CUDA toolkit
* [PyCUDA](https://pypi.org/project/pycuda/)
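Before attempting a build, the Python-level requirements above can be verified with a small helper along these lines. This is a sketch, not part of GPAW; the version floors are the ones listed above, and modules that are not installed are simply reported as missing:

```python
# Pre-flight check for GPAW's mandatory Python dependencies.
import importlib
import importlib.util

def check_deps(requirements):
    """Return a list of (name, installed_version_or_None, ok) tuples."""
    report = []
    for name, floor in requirements:
        if importlib.util.find_spec(name) is None:
            report.append((name, None, False))   # package not installed
            continue
        version = importlib.import_module(name).__version__
        # Compare only the (major, minor) part against the documented floor.
        installed = tuple(int(p) for p in version.split(".")[:2])
        report.append((name, version, installed >= floor))
    return report

if __name__ == "__main__":
    reqs = [("numpy", (1, 9)), ("scipy", (0, 14)), ("ase", (3, 18))]
    for name, version, ok in check_deps(reqs):
        print(f"{name:6s} {version or 'missing':10s} {'OK' if ok else 'NOT OK'}")
```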
### Download of GPAW
GPAW is freely available under the GPL license.
The source code of the CPU version can be
downloaded from the [GitLab repository](https://gitlab.com/gpaw/gpaw) or as
a tar package for each release from [PyPi](https://pypi.org/simple/gpaw/).
For example, to get version 20.1.0 using git:
```bash
git clone -b 20.1.0 https://gitlab.com/gpaw/gpaw.git
```
The CUDA development version is available in the
[separate GitLab for CUDA development](https://gitlab.com/mlouhivu/gpaw/tree/cuda)
mentioned above. To get the cuda branch using git:
```bash
git clone -b cuda https://gitlab.com/mlouhivu/gpaw.git
```
Official generic [installation instructions](https://wiki.fysik.dtu.dk/gpaw/install.html)
and
[platform specific examples](https://wiki.fysik.dtu.dk/gpaw/platforms/platforms.html)
are provided in the [GPAW wiki](https://wiki.fysik.dtu.dk/gpaw/).
A proper `customize.py` (GPAW 19.8.1 and earlier) or `siteconfig.py` (GPAW 20.1.0
and later) file is crucial for the configuration of GPAW. The defaults used by GPAW
may not offer optimal performance, and the automatic detection of the libraries also
fails on some systems.
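As an illustration only, a minimal `siteconfig.py` enabling ScaLAPACK and FFTW might look as follows. The library names and install paths below are hypothetical assumptions and must be adapted to the actual system:

```python
# Sketch of a siteconfig.py for GPAW >= 20.1.0; for GPAW <= 19.8.1 the same
# variables go into customize.py instead. All paths below are hypothetical.
scalapack = True      # mandatory for the UEABS medium and large cases
fftw = True           # use an FFTW-compatible library rather than NumPy's FFT

# Libraries to link against; names depend on the site's BLAS/ScaLAPACK stack.
libraries = ['xc', 'fftw3', 'scalapack', 'openblas']
library_dirs = ['/appl/libxc/4.3.4/lib']        # hypothetical libxc prefix
include_dirs = ['/appl/libxc/4.3.4/include']
```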
The UEABS repository contains additional instructions:
* [general instructions](installation.md) - Under development
* [GPGPUs](build/build-cuda.md) - To check
Example [build scripts](build/examples/) are also available for some PRACE
systems.
## Mechanics of Running the Benchmark
### Download of the benchmark sets
The benchmark set is available in the [benchmark/](benchmark/) directory or,
alternatively, for download, either directly from the development
[Git repository](https://github.com/mlouhivu/gpaw-benchmarks/tree/prace)
or from the PRACE RI website (http://www.prace-ri.eu/ueabs/).
As each benchmark has only a single input file, the input files can also be
downloaded right from this repository:

1. [Testcase S: Carbon nanotube input file](benchmark/1_S_carbon-nanotube/input.py)
2. [Testcase M: Copper filament input file](benchmark/2_M_copper-filament/input.py)
3. [Testcase L: Silicon cluster input file](benchmark/3_L_silicon-cluster/input.py)

To download the full benchmark set using git:
```
git clone -b prace https://github.com/mlouhivu/gpaw-benchmarks
```
### Running the benchmarks

#### Versions up to and including 19.8.1 of GPAW

These versions of GPAW come with their own wrapper executable, `gpaw-python`,
to start an MPI-based GPAW run.
No special command line options or environment variables are needed to run the
benchmarks if your MPI process starter (`mpirun`, Slurm `srun`, ...) communicates
properly with the resource manager. E.g., on Slurm systems, use
```
srun gpaw-python input.py
```
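For reference, a complete Slurm batch script for such a run might look as follows. The job size, time limit and module name are placeholder assumptions for an imaginary system, not part of the benchmark definition:

```shell
#!/bin/bash
#SBATCH --job-name=gpaw-case-S
#SBATCH --nodes=5               # case S scales to ~10 nodes / 100 MPI tasks
#SBATCH --ntasks-per-node=20
#SBATCH --time=00:30:00
# Site-specific: load whatever provides gpaw-python 19.8.1 on your system.
module load GPAW/19.8.1
srun gpaw-python input.py
```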
#### GPAW 20.1.0 (and likely later)

The wrapper executable `gpaw-python` is no longer available in the default parallel
build of GPAW. There are now two different ways to start GPAW.
One way is through `mpirun`, `srun` or an equivalent process starter and the
`gpaw python` command:
```
srun gpaw python input.py
```
The second way is to simply use the `-P` flag of the `gpaw` command and let it
use a process starter internally:
```
gpaw -P 100 python input.py
```
will run on 100 cores.
There is a third, but non-recommended, option:
```
srun python3 input.py
```
That option, however, doesn't do the imports in the same way that the `gpaw`
script would.
### Examples

Example [job scripts](scripts/) (`scripts/job-*.sh`) are provided for
different PRACE systems that may offer a helpful starting point.

TODO: Update the examples.
## Verification of Results

### Case S: Carbon nanotube

TODO.

### Case M: Copper filament

TODO.

### Case L: Silicon cluster

TODO.