# GPAW - A Projected Augmented Wave code

## Summary version

1.0

## Purpose of the benchmark

[GPAW](https://wiki.fysik.dtu.dk/gpaw/) is a density-functional theory (DFT)
program for ab initio electronic structure calculations using the projector
augmented wave method. It uses a uniform real-space grid representation of the
electronic wavefunctions that allows for excellent computational scalability
and systematic convergence properties.

The GPAW benchmark tests MPI parallelization and the quality of the provided mathematical
libraries, including BLAS, LAPACK, ScaLAPACK, and an FFTW-compatible library. There is
also a CUDA-based implementation for GPU systems.


## Characteristics of the benchmark

GPAW is written mostly in Python, but also includes computational kernels
written in C, as well as leveraging external libraries such as NumPy, BLAS and
ScaLAPACK. Parallelisation is based on message passing using MPI, with no
support for multithreading.

There have been various developments for GPGPUs and MICs in the past
using either CUDA or pyMIC/libxstream. Many of those branches see no
development anymore. The relevant CUDA version for this benchmark is available in a
[separate GitLab for CUDA development, cuda branch](https://gitlab.com/mlouhivu/gpaw/tree/cuda).

There is currently no active support for non-CUDA accelerator platforms.

For the UEABS benchmark version 2.2, the following versions of GPAW were tested:
  * CPU-based:
      * Version 20.1.0, as this one is the version on which the most recent GPU commits
        are based.
      * Version 20.10.0, as it was the most recent version during the development of
        the UEABS 2.2 benchmark suite.
  * GPU-based: As there is no official release of the GPU version, and as it is,
    at the time of the UEABS version 2.2 release, under heavy development
    to also support AMD GPUs, there is no official support for the GPU version
    ([the cuda branch of the GitLab for CUDA development](https://gitlab.com/mlouhivu/gpaw/tree/cuda))
    in UEABS version 2.2.

Versions 1.5.2 and 19.8.1 were also considered but are not compatible with the regular
input files provided here. Hence support for those versions of GPAW was dropped in
this version of the UEABS.

There are three benchmark cases, denoted S, M and L.

### Case S: Carbon nanotube

A ground state calculation for a carbon nanotube in vacuum. By default uses a
6-6-10 nanotube with 240 atoms (freely adjustable) and serial LAPACK with an
option to use ScaLAPACK. Expected to scale up to 10 nodes and/or 100 MPI
tasks. This benchmark runs fast. Expect execution times around 1 minute
on 100 cores of a modern x86 cluster.

Input file: [benchmark/1_S_carbon-nanotube/input.py](benchmark/1_S_carbon-nanotube/input.py)

This input file still works with versions 1.5.2 and 19.8.1 of GPAW.
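For orientation, a ground-state setup of this kind can be sketched with the ASE and GPAW Python APIs roughly as below. This is an illustrative sketch only, not the shipped `input.py`; the actual benchmark parameters (nanotube size, grid spacing, convergence settings) are those defined in the input file linked above, and the vacuum and grid values here are assumptions.

```python
# Illustrative sketch only -- NOT the shipped input.py; see the link above.
from ase.build import nanotube
from gpaw import GPAW

# Build a 6-6 carbon nanotube; the benchmark default is a 6-6-10 tube
# with 240 atoms, set in the actual input file.
atoms = nanotube(6, 6, length=10)
atoms.center(vacuum=4.0, axis=(0, 1))  # vacuum around the tube (assumed value)

calc = GPAW(mode='fd',           # real-space finite-difference grid
            h=0.2,               # grid spacing in Angstrom (assumed value)
            txt='output.txt')    # log file parsed in the verification step
atoms.set_calculator(calc)
atoms.get_potential_energy()     # runs the SCF ground-state calculation
```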


### Case M: Copper filament

A ground state calculation for a copper filament in vacuum. By default uses a
3x4x4 FCC lattice with 71 atoms (freely adjustable through the variables `x`,
`y` and `z` in the input file) and ScaLAPACK for
parallelisation. Expected to scale up to 100 nodes and/or 1000 MPI tasks.

Input file: [benchmark/2_M_copper-filament/input.py](benchmark/2_M_copper-filament/input.py)

This input file does not work with GPAW 1.5.2 and 19.8.1. It requires GPAW
20.1.0 or 20.10.0. Please try older versions of the UEABS if you want to use
these versions of GPAW.

The benchmark runs best when using full nodes. Expect a
performance drop on other configurations.


### Case L: Silicon cluster

A ground state calculation for a silicon cluster in vacuum. By default the
cluster has a radius of 15Å (freely adjustable) and consists of 702 atoms,
and ScaLAPACK is used for parallelisation. Expected to scale up to 1000 nodes
and/or 10000 MPI tasks.

Input file: [benchmark/3_L_silicon-cluster/input.py](benchmark/3_L_silicon-cluster/input.py)

This input file does not work with GPAW 1.5.2 and 19.8.1. It requires GPAW
20.1.0 or 20.10.0. Please try older versions of the UEABS if you want to use
these versions of GPAW.



## Mechanics of building the benchmark

Installing and running GPAW has changed significantly since the previous
versions of the UEABS. GPAW version numbering changed in 2019. Version 1.5.3 is the
last version with the old numbering. In 2019 the development team switched
to a version numbering scheme based on year, month and patch level, e.g.,
19.8.1 for the second version released in August 2019.

Another change is in the Python packages used to install GPAW. Versions up to
and including 19.8.1 use the `distutils` package while versions 20.1.0 and later
are based on `setuptools`. This does affect the installation process.

Running GPAW is no longer done via a wrapper executable `gpaw-python` that
replaces the Python interpreter (it internally links to the libpython library)
and that provides the MPI functionality. Since version 20.1.0, the standard Python
interpreter is used and the MPI functionality is included in the `_gpaw.so` shared library.


### Available instructions

The [GPAW wiki](https://wiki.fysik.dtu.dk/gpaw/) only contains the
[installation instructions](https://wiki.fysik.dtu.dk/gpaw/install.html) for the current version.
For the installation instructions with a list of dependencies for older versions,
download the code (see below) and look for the file `doc/install.rst` or go to the
[GPAW GitLab](https://gitlab.com/gpaw), select the tag for the desired version and
view the file `doc/install.rst`.

The [GPAW wiki](https://wiki.fysik.dtu.dk/gpaw/) also provides some
[platform specific examples](https://wiki.fysik.dtu.dk/gpaw/platforms/platforms.html).


### List of dependencies

GPAW is Python code, but it also contains some C code for performance-critical
parts and to interface to a number of libraries on which it depends.

Hence GPAW has the following requirements:
  * C compiler with MPI support
  * BLAS, LAPACK, BLACS and ScaLAPACK. ScaLAPACK is optional for GPAW, but mandatory
    for the UEABS benchmarks. It is used by the medium and large cases and optional
    for the small case.
  * Python. GPAW 20.1.0 requires Python 3.5-3.8 and GPAW 20.10.0 Python 3.6-3.9.
  * Mandatory Python packages:
      * [NumPy](https://pypi.org/project/numpy/) 1.9 or later (for GPAW 20.1.0/20.10.0).
        GPAW versions before 20.10.0 produce warnings when used with NumPy 1.19.x.
      * [SciPy](https://pypi.org/project/scipy/) 0.14 or later (for GPAW 20.1.0/20.10.0)
  * [FFTW](http://www.fftw.org) is highly recommended. As long as the optional libvdwxc
    component is not used, the MKL FFTW wrappers can also be used. Recent versions of
    GPAW also show good performance using just the NumPy-provided FFT routines, as long
    as NumPy has been built with a highly optimized FFT library.
  * [LibXC](https://www.tddft.org/programs/libxc/) 3.X or 4.X for GPAW 20.1.0 and 20.10.0.
    LibXC is a library of exchange-correlation functionals for density-functional theory.
    Neither version currently mentions LibXC 5.X as officially supported.
  * [ASE, Atomic Simulation Environment](https://wiki.fysik.dtu.dk/ase/), a Python package
    from the same group that develops GPAW
      * Check the release notes of GPAW as the releases of ASE and GPAW should match.
        The benchmarks were tested using ASE 3.19.3 with GPAW 20.1.0 and ASE 3.20.1
        with GPAW 20.10.0.
      * ASE has some optional dependencies that are not needed for the benchmarking: Matplotlib (2.0.0 or newer),
        tkinter (Tk interface, part of the Standard Python Library) and Flask.
  * Optional components of GPAW that are not used by the UEABS benchmarks:
      * [libvdwxc](https://gitlab.com/libvdwxc/libvdwxc), a portable C library
        of density functionals with van der Waals interactions for density functional theory.
        This library does not work with the MKL FFTW wrappers.
      * [ELPA](https://elpa.mpcdf.mpg.de/),
        which should improve performance for large systems when GPAW is used in
        [LCAO mode](https://wiki.fysik.dtu.dk/gpaw/documentation/lcao/lcao.html).
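The Python version constraints in the list above are easy to get wrong on clusters with several Python modules installed. A tiny helper along these lines (the names are illustrative, not part of GPAW) can check the running interpreter against the supported ranges:

```python
import sys

# Supported (inclusive) Python ranges per GPAW release, as listed above.
SUPPORTED_PYTHON = {
    "20.1.0": ((3, 5), (3, 8)),
    "20.10.0": ((3, 6), (3, 9)),
}

def python_ok(gpaw_version, py=None):
    """Return True if the (major, minor) Python version is supported."""
    lo, hi = SUPPORTED_PYTHON[gpaw_version]
    py = tuple(py) if py is not None else sys.version_info[:2]
    return lo <= py <= hi

print(python_ok("20.1.0", (3, 7)))   # True: 3.7 falls within 3.5-3.8
print(python_ok("20.1.0", (3, 9)))   # False: Python 3.9 needs GPAW 20.10.0
```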

In addition, the GPU version needs:
  * NVIDIA CUDA toolkit
  * [PyCUDA](https://pypi.org/project/pycuda/)

Installing GPAW also requires a number of standard build tools on the system, including
  * [GNU autoconf](https://www.gnu.org/software/autoconf/) is needed to generate the
    configure script for LibXC
  * [GNU Libtool](https://www.gnu.org/software/libtool/) is needed. If not found,
    the configure process of LibXC produces very misleading
    error messages that do not immediately point to the missing Libtool.
  * [GNU make](https://www.gnu.org/software/make/)


### Download of GPAW

GPAW is freely available under the GPL license.

The source code of the CPU version can be
downloaded from the [GitLab repository](https://gitlab.com/gpaw/gpaw) or as
a tar package for each release from [PyPI](https://pypi.org/simple/gpaw/).

For example, to get version 20.1.0 using git:
```bash
git clone -b 20.1.0 https://gitlab.com/gpaw/gpaw.git
```

The CUDA development version is available in
[the cuda branch of a separate GitLab](https://gitlab.com/mlouhivu/gpaw/tree/cuda).
To get the current development version using git:
```bash
git clone -b cuda https://gitlab.com/mlouhivu/gpaw.git
```


### Install

Crucial for the configuration of GPAW is a proper `siteconfig.py` file (GPAW 20.1.0 and later,
earlier versions used `customize.py`). The defaults used by GPAW
may not offer optimal performance and the automatic detection of the libraries also
fails on some systems.
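As a concrete illustration, a minimal `siteconfig.py` often only needs to name the libraries to link and where to find them. The library names and paths below are placeholders that must be adapted to the actual system; treat this as a sketch, not a working configuration:

```python
# Example siteconfig.py sketch -- library names and paths are placeholders.

# Enable ScaLAPACK (mandatory for the M and L benchmark cases).
scalapack = True

# Use FFTW (or an FFTW-compatible library such as the MKL wrappers).
fftw = True

# Libraries to link against and where to find them; adapt to your system.
libraries = ['xc', 'fftw3', 'scalapack', 'openblas']
library_dirs = ['/path/to/libxc/lib', '/path/to/fftw/lib']
include_dirs = ['/path/to/libxc/include', '/path/to/fftw/include']
```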

The UEABS repository contains additional instructions:
  * [general instructions](build/build-CPU.md)

Example [build scripts](build/examples/) are also available.


## Mechanics of Running the Benchmark

### Download of the benchmark sets

As each benchmark consists of only a single input file, the inputs can be downloaded
directly from this repository.

  1. [Testcase S: Carbon nanotube input file](benchmark/1_S_carbon-nanotube/input.py)
  2. [Testcase M: Copper filament input file](benchmark/2_M_copper-filament/input.py)
  3. [Testcase L: Silicon cluster input file](benchmark/3_L_silicon-cluster/input.py)


### Running the benchmarks

These instructions are exclusively for GPAW 20.1.0 and later.

There are two different ways to start GPAW.

One way is through `mpirun`, `srun` or an equivalent process starter and the
`gpaw python` command:
```
srun gpaw python input.py
```

The second way is by simply using the `-P` flag of the `gpaw` command, letting it
use a process starter internally:
```
gpaw -P 100 python input.py
```
will run on 100 cores.

There is a third but non-recommended option:
```
srun python3 input.py
```
That option, however, does not perform the imports in the same way that the `gpaw`
script does.
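On batch systems, the first variant is typically wrapped in a job script. The sketch below assumes Slurm; the partition-independent settings shown (job name, node count, tasks per node, module name) are site-specific placeholders that must be adapted:

```bash
#!/bin/bash
#SBATCH --job-name=gpaw-bench    # placeholder name
#SBATCH --nodes=2                # adapt to the benchmark case being run
#SBATCH --ntasks-per-node=128    # use full nodes for best performance
#SBATCH --time=00:30:00

# Load the environment in which GPAW was installed (site-specific).
module load GPAW/20.10.0

# Let the scheduler's process starter launch the MPI ranks.
srun gpaw python input.py
```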


## Verification of Results

The results of the benchmarks can be verified with the following piece of code:
```bash
bmtime=$(grep "Total:" output.txt | sed -e 's/Total: *//' | cut -d " " -f 1)
iterations=$(grep "Converged after" output.txt | cut -d " " -f 3)
dipole=$(grep "Dipole" output.txt | cut -d " " -f 5 | sed -e 's/)//')
fermi=$(grep "Fermi level:" output.txt | cut -d ":" -f 2 | sed -e 's/ //g')
energy=$(grep "Extrapolated: " output.txt | cut -d ":" -f 2 | sed -e 's/ //g')
echo -e "\nResult information:\n" \
    " * Time:                   $bmtime s\n" \
    " * Number of iterations:   $iterations\n" \
    " * Dipole (3rd component): $dipole\n" \
    " * Fermi level:            $fermi\n" \
    " * Extrapolated energy:    $energy\n"
```

The time is measured by GPAW itself for the main part of the computations
and is the value used as the benchmark result.

The other numbers can serve as verification of the results and were obtained with
GPAW 20.1.0, 20.10.0 and 21.1.0.
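The same extraction can also be done in Python, which is easier to extend with the range checks from the sections below. The regular expressions only assume the same output lines that the shell snippet above greps for; the sample text here is synthetic, shaped like those lines, and not actual benchmark output:

```python
import re

def parse_gpaw_output(text):
    """Extract the benchmark time and verification numbers from a GPAW log."""
    patterns = {
        "time": r"Total:\s+(\S+)",
        "iterations": r"Converged after (\d+)",
        "dipole": r"Dipole.*?([-\d.]+)\)",
        "fermi": r"Fermi level:\s*([-\d.]+)",
        "energy": r"Extrapolated:\s*([-\d.]+)",
    }
    return {key: float(re.search(pat, text).group(1))
            for key, pat in patterns.items()}

# Synthetic sample in the shape of the lines grepped above (not real output).
sample = """\
Converged after 12 iterations.
Dipole moment: (0.000000, 0.000000, -115.165000) |e|*Ang
Fermi level: -4.612
Extrapolated:  -2397.625
Total:       60.5 100.0%
"""
print(parse_gpaw_output(sample))
```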

### Case S: Carbon nanotube

The expected values are:
  * Number of iterations: 12
  * Dipole (3rd component): Between -115.17 and -115.16
  * Fermi level: Between -4.613 and -4.611
  * Extrapolated energy: Between -2397.63 and -2397.62


### Case M: Copper filament

The expected values are:
  * Number of iterations: 19
  * Dipole (3rd component): Between -80.51 and -80.50
  * Fermi level: Between -4.209 and -4.207
  * Extrapolated energy: Between -473.5 and -473.3


### Case L: Silicon cluster

With this test case, some of the results differ between versions 20.1.0 and 20.10.0
on the one hand and version 21.1.0 on the other.

The expected values are:
  * Number of iterations: Between 30 and 35
  * Dipole (3rd component): Between -0.493 and -0.491
  * Fermi level: Between -2.67 and -2.66
  * Extrapolated energy: Between -3784 and -3783

Note: Though not used for the benchmarking in the final report, some testing was done
with version 21.1.0 also. In this version, some external library routines were replaced
by new internal implementations that cause changes in some results. For 21.1.0, the
expected values are:

  * Number of iterations: Between 30 and 35
  * Dipole (3rd component): Between -0.462 and -0.461
  * Fermi level: Between -2.59 and -2.58
  * Extrapolated energy: Between -3784 and -3783