Preparation PRACE Benchmarks for GPAW
=====================================

GPAW
----

### Code description

[GPAW](https://wiki.fysik.dtu.dk/gpaw/) is a density-functional theory (DFT)
program for ab initio electronic structure calculations using the projector
augmented wave method. It uses a uniform real-space grid representation of the
electronic wavefunctions that allows for excellent computational scalability
and systematic convergence properties.

GPAW is written mostly in Python, but also includes computational kernels
written in C and relies on external libraries such as NumPy, BLAS and
ScaLAPACK. Parallelisation is based on message passing using MPI, with no
support for multithreading.

Note that the GPAW version numbering changed in 2019. Version 1.5.3 is the
last version with the old numbering. Since then, the development team has
used a version numbering scheme based on year, month and patchlevel, e.g.,
19.8.1 for the second version released in August 2019.
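
When scripting around several GPAW releases, it can help to tell the two
numbering schemes apart. The following is a minimal sketch, not part of GPAW
itself: the helper names `parse_version` and `uses_calendar_versioning` are
hypothetical, and the cut-off at major version 19 is an assumption derived
from the description above.

```python
def parse_version(version):
    """Split a version string such as '19.8.1' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def uses_calendar_versioning(version):
    """Return True if the version follows the year.month.patchlevel scheme.

    Assumption: old-style versions end at 1.5.3, and calendar-based
    versions start with the two-digit year 19 (for 2019).
    """
    return parse_version(version)[0] >= 19


for v in ("1.5.2", "1.5.3", "19.8.1"):
    scheme = "year.month.patchlevel" if uses_calendar_versioning(v) else "old numbering"
    print(f"{v}: {scheme}")
```

Because the tuples compare element by element, sorting parsed versions keeps
the old releases before the calendar-based ones.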

There have been various developments for GPGPUs and MICs in the past
using either CUDA or pyMIC/libxstream. Many of those branches are no longer
developed, and the information on the
[GPU page of the GPAW Wiki](https://wiki.fysik.dtu.dk/gpaw/devel/projects/gpu.html)
is also outdated.

* Outdated branches in the [GitLab repository](https://gitlab.com/gpaw/gpaw)
  include the branch [rpa-gpu-expt](https://gitlab.com/gpaw/gpaw/tree/rpa-gpu-expt)
  and [cuda](https://gitlab.com/gpaw/gpaw/tree/cuda).
* The current CUDA development version is available in a
  [separate GitLab for CUDA development, cuda branch](https://gitlab.com/mlouhivu/gpaw/tree/cuda).
  This version should correspond to the Aalto version mentioned on the GPAW Wiki.
  As of early 2020, that version seems to be derived from the 1.5.2 CPU version
  (at least, its history contains a commit that claims to merge the 1.5.2 code).


### Download

GPAW is freely available under the GPL license. 

The source code of the CPU version can be
downloaded from the [GitLab repository](https://gitlab.com/gpaw/gpaw) or as
a tar package for each release from [PyPI](https://pypi.org/simple/gpaw/).

For example, to get version 19.8.1 using git:
```bash
git clone -b 19.8.1 https://gitlab.com/gpaw/gpaw.git
```

The CUDA development version is available in 
[the cuda branch of a separate GitLab](https://gitlab.com/mlouhivu/gpaw/tree/cuda).
To get the current development version using git:
```bash
git clone -b cuda https://gitlab.com/mlouhivu/gpaw.git
```


### Install

Official generic [installation instructions](https://wiki.fysik.dtu.dk/gpaw/install.html)
and
[platform-specific examples](https://wiki.fysik.dtu.dk/gpaw/platforms/platforms.html)
are provided in the [GPAW wiki](https://wiki.fysik.dtu.dk/gpaw/). 

The UEABS repository contains additional instructions:
* [general instructions](installation.md) - Under development
* [GPGPUs](build/build-cuda.md) - To check
* [Xeon Phis (KNC)](build/build-xeon-phi.md) - Outdated

Example [build scripts](build/examples/) are also available for some PRACE
systems.


Benchmarks
----------

### Download

The benchmark set is available in the [benchmark/](benchmark/) directory.
Alternatively, it can be downloaded either directly from the development
[Git repository](https://github.com/mlouhivu/gpaw-benchmarks/tree/prace)
or from the PRACE RI website (http://www.prace-ri.eu/ueabs/).

To download the benchmarks, use e.g. the following command:
```bash
git clone -b prace https://github.com/mlouhivu/gpaw-benchmarks
```


### Benchmark cases

#### Case S: Carbon nanotube

A ground state calculation for a carbon nanotube in vacuum. By default uses a
6-6-10 nanotube with 240 atoms (freely adjustable) and serial LAPACK with an
option to use ScaLAPACK. Expected to scale up to 10 nodes and/or 100 MPI
tasks.

Input file: [benchmark/carbon-nanotube/input.py](benchmark/carbon-nanotube/input.py)

#### Case M: Copper filament

A ground state calculation for a copper filament in vacuum. By default uses a
2x2x3 FCC lattice with 71 atoms (freely adjustable) and ScaLAPACK for
parallelisation. Expected to scale up to 100 nodes and/or 1000 MPI tasks.

Input file: [benchmark/copper-filament/input.py](benchmark/copper-filament/input.py)

#### Case L: Silicon cluster

A ground state calculation for a silicon cluster in vacuum. By default the
cluster has a radius of 15 Å (freely adjustable) and consists of 702 atoms,
and ScaLAPACK is used for parallelisation. Expected to scale up to 1000 nodes
and/or 10000 MPI tasks.

Input file: [benchmark/silicon-cluster/input.py](benchmark/silicon-cluster/input.py)
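
The expected scaling ranges of the three cases can be used to pick a case for
a given job size. The sketch below only encodes the limits quoted above; the
helper `suggest_case` is hypothetical, and it assumes that a job should stay
within both the node limit and the MPI task limit of a case.

```python
# Scaling limits as stated in the case descriptions above.
# Each entry: (label, directory under benchmark/, max nodes, max MPI tasks)
CASES = [
    ("S", "carbon-nanotube", 10, 100),
    ("M", "copper-filament", 100, 1000),
    ("L", "silicon-cluster", 1000, 10000),
]


def suggest_case(nodes, tasks):
    """Return (label, input file) of the smallest case covering the job size.

    Returns None if the job exceeds the expected range of all cases.
    """
    for label, directory, max_nodes, max_tasks in CASES:
        if nodes <= max_nodes and tasks <= max_tasks:
            return label, f"benchmark/{directory}/input.py"
    return None


print(suggest_case(4, 96))
print(suggest_case(50, 600))
```

These limits are rough expectations, so on a particular system it may still
be worth trying the neighbouring case.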


### Running the benchmarks

No special command line options or environment variables are needed to run the
benchmarks on most systems. For example:
```bash
srun gpaw-python input.py
```

#### Special case: KNC

For KNCs (Xeon Phi Knights Corner), one needs to use a wrapper script to set
correct affinities for pyMIC (see
[scripts/affinity-wrapper.sh](scripts/affinity-wrapper.sh) for an example)
and to set two environment variables for GPAW:
```shell
GPAW_OFFLOAD=1  # (to turn on offloading)
GPAW_PPN=<no. of MPI tasks per node>
```

For example, in a SLURM system, this could be:
```shell
GPAW_PPN=12 GPAW_OFFLOAD=1 mpirun -np 256 -bootstrap slurm \
  ./affinity-wrapper.sh 12 gpaw-python input.py
```
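
The same launch line can be assembled programmatically, e.g. when generating
job scripts for several node counts. This is a minimal sketch, not a tool
shipped with the benchmarks: the function `knc_launch` is hypothetical, and it
assumes that the first argument of `affinity-wrapper.sh` is the number of MPI
tasks per node, as the matching value of 12 in the example above suggests.

```python
import os


def knc_launch(total_tasks, tasks_per_node, script="input.py"):
    """Return (env, argv) for an offloaded KNC run, mirroring the example above."""
    # Copy the current environment and add the two GPAW variables.
    env = dict(os.environ, GPAW_OFFLOAD="1", GPAW_PPN=str(tasks_per_node))
    argv = [
        "mpirun", "-np", str(total_tasks), "-bootstrap", "slurm",
        # Assumption: the wrapper takes the tasks per node as first argument.
        "./affinity-wrapper.sh", str(tasks_per_node),
        "gpaw-python", script,
    ]
    return env, argv


env, argv = knc_launch(256, 12)
print(" ".join(argv))
# To actually launch the job, one could pass both to subprocess.run:
#   subprocess.run(argv, env=env, check=True)
```

Keeping `GPAW_PPN` and the wrapper argument in a single parameter avoids the
two values drifting apart when the job size changes.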

#### Examples

Example [job scripts](scripts/) (`scripts/job-*.sh`) are provided for
different PRACE systems and may offer a helpful starting point.