# Detailed GPAW installation instructions on non-accelerated systems

These instructions are in addition to the brief instructions in [README.md](../README.md).

They cover versions 20.1.0, 20.10.0 and 21.1.0 of GPAW. The distutils-based versions
prior to 20.1.0 are no longer supported by the benchmark suite.

## Detailed dependency list

### Libraries and Python interpreter

GPAW needs (for the UEABS benchmarks)
  * [Python](https://www.python.org/): GPAW 20.1.0 requires Python 3.5-3.8, and
    GPAW 20.10.0 and 21.1.0 require Python 3.6-3.9.
  * [MPI library](https://www.mpi-forum.org/)
  * [LibXC](https://www.tddft.org/programs/libxc/). GPAW 20.1.0,
    20.10.0 and 21.1.0 all need LibXC 3.x or 4.x.
  * (Optimized) [BLAS](http://www.netlib.org/blas/) and
    [LAPACK](http://www.netlib.org/lapack/) libraries.
    There are both commercial and free and open source versions of these libraries.
    Using the [reference implementation of BLAS from netlib](http://www.netlib.org/blas/)
    will give very poor performance. Most optimized LAPACK libraries actually only
    optimize a few critical routines while the remaining routines are compiled from
    the reference version. Most processor vendors for HPC machines and system vendors
    offer optimized versions of these libraries.
  * [ScaLAPACK](http://www.netlib.org/scalapack/) and the underlying communication
    layer [BLACS](http://www.netlib.org/blacs/).
  * [FFTW](http://www.fftw.org/) or compatible FFT library.
    For the UEABS benchmarks, the double precision, non-MPI version is sufficient.
    GPAW also works with the
    [Intel MKL](https://software.intel.com/content/www/us/en/develop/tools/math-kernel-library.html)
    FFT routines when using the FFTW wrappers provided with that product.

For the GPU version, the following packages are needed in addition to the packages
above:
  * CUDA toolkit
  * [PyCUDA](https://pypi.org/project/pycuda/)

Optional components of GPAW that are not used by the UEABS benchmarks:
  * [libvdwxc](https://gitlab.com/libvdwxc/libvdwxc), a portable C library
    of density functionals with van der Waals interactions for density functional theory.
    This library does not work with the MKL FFTW wrappers as it needs the MPI version
    of the FFTW libraries too.
  * [ELPA](https://elpa.mpcdf.mpg.de/),
    which should improve performance for large systems when GPAW is used in
    [LCAO mode](https://wiki.fysik.dtu.dk/gpaw/documentation/lcao/lcao.html)
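
The Python version ranges listed at the top of this section can be checked before starting a build. The snippet below is a minimal sketch that only encodes the ranges stated above; the function name `python_supported` is our own and not part of GPAW.

```python
import sys

# Supported Python (major, minor) ranges per GPAW version, as listed above.
SUPPORTED_PYTHON = {
    "20.1.0": ((3, 5), (3, 8)),
    "20.10.0": ((3, 6), (3, 9)),
    "21.1.0": ((3, 6), (3, 9)),
}


def python_supported(gpaw_version, version_info=None):
    """Return True if the Python version is in the range for gpaw_version."""
    if version_info is None:
        version_info = sys.version_info
    lowest, highest = SUPPORTED_PYTHON[gpaw_version]
    return lowest <= tuple(version_info[:2]) <= highest


if __name__ == "__main__":
    for version in sorted(SUPPORTED_PYTHON):
        status = "supported" if python_supported(version) else "NOT supported"
        print("GPAW {}: running Python is {}".format(version, status))
```

Run with the interpreter that will be used for the build, this prints one line per GPAW version.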


### Python packages

GPAW needs
  * [wheel](https://pypi.org/project/wheel/), which is needed in most (if not all) ways of
    installing the packages from source.
  * [NumPy](https://pypi.org/project/numpy/) 1.9 or later (for GPAW 20.1.0/20.10.0/21.1.0)
      * Installing NumPy from source will also require
        [Cython](https://pypi.org/project/Cython/)
      * GPAW 20.1.0 is not fully compatible with NumPy 1.19.x or later. Warnings about the use
        of deprecated constructs will be shown.
  * [SciPy](https://pypi.org/project/scipy/) 0.14 or later (for GPAW 20.1.0/20.10.0/21.1.0)
  * [ASE, Atomic Simulation Environment](https://wiki.fysik.dtu.dk/ase/), a Python package
    from the same group that develops GPAW. The required version is 3.18.0 or later for
    GPAW 20.1.0, 20.10.0 and 21.1.0.
    ASE has a couple of dependencies
    that are not needed for running the UEABS benchmarks. However, several Python
    package install methods will trigger the installation of those packages, and
    with them may require a chain of system libraries.
      * ASE does need NumPy and SciPy, but these are needed anyway for GPAW.
      * [matplotlib](https://pypi.org/project/matplotlib/), at least version 2.0.0.
        This package is optional and not really needed to run the benchmarks.
        Matplotlib pulls in a lot of other dependencies. When installing ASE with pip,
        it will try to pull in matplotlib and its dependencies:
          * [pillow](https://pypi.org/project/Pillow/) needs several external
            libraries. During the development of the benchmarks, we needed at least
            zlib, libjpeg-turbo (or compatible libjpeg library) and freetype. Even
            though the pillow documentation claimed that libjpeg was optional,
            it refused to install without.
          * [kiwisolver](https://pypi.org/project/kiwisolver/): Contains C++-code
          * [pyparsing](https://pypi.org/project/pyparsing/)
          * [Cycler](https://pypi.org/project/Cycler/), which requires
              * [six](https://pypi.org/project/six/)
          * [python-dateutil](https://pypi.org/project/python-dateutil/), which also
            requires
              * [six](https://pypi.org/project/six/)
      * [Flask](https://pypi.org/project/Flask/) is an optional dependency of ASE
        that is not automatically pulled in by `pip` in versions of ASE tested during
        the development of this version of the UEABS. It has a number of dependencies
        too:
          * [Jinja2](https://pypi.org/project/Jinja2/)
              * [MarkupSafe](https://pypi.org/project/MarkupSafe/), contains some C
                code
          * [itsdangerous](https://pypi.org/project/itsdangerous/)
          * [Werkzeug](https://pypi.org/project/Werkzeug/)
          * [click](https://pypi.org/project/click/)
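
The minimum versions of NumPy, SciPy and ASE given above can be verified in an existing Python installation. The following is a sketch under the assumption that each package exposes a `__version__` attribute; the helper names are hypothetical and only the numeric part of a version string is compared.

```python
import importlib
import re

# Minimum versions from the list above (valid for GPAW 20.1.0 to 21.1.0).
MINIMUM_VERSIONS = {"numpy": "1.9", "scipy": "0.14", "ase": "3.18.0"}


def release_tuple(version):
    """Turn '1.19.5' or '1.5.0rc1' into a tuple of ints such as (1, 19, 5)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("no numeric release in {!r}".format(version))
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to the same length."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    for name, minimum in sorted(MINIMUM_VERSIONS.items()):
        try:
            module = importlib.import_module(name)
        except ImportError:
            print("{}: not installed (need >= {})".format(name, minimum))
            continue
        installed = getattr(module, "__version__", "0")
        verdict = "OK" if meets_minimum(installed, minimum) else "TOO OLD"
        print("{} {}: {} (need >= {})".format(name, installed, verdict, minimum))
```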


## Tested configurations

  * Python
  * Libraries used during the installation of Python:
      * ncurses 6.2
      * libreadline 8.0, as it makes life easy when using the command line
        interface of Python (and in case of an EasyBuild Python, because EasyBuild
        requires it)
      * libffi 3.3
      * zlib 1.2.11
      * OpenSSL 1.1.1g, but only when EasyBuild was used and requires it.
      * SQLite 3.33.0, as one of the tests in some versions of GPAW requires it to
        succeed.
      * Python will of course pick up several other libraries that it might find on
        the system. The benchmark installation was tested on a system with very few
        development packages of libraries installed in the system image. Tcl/Tk
        and SQLite3 development packages in particular were not installed, so the
        standard Python library packages sqlite3 and tkinter were not fully functional.
  * Python packages
      * wheel
      * Cython
      * NumPy
      * SciPy
      * ASE
      * GPAW

The table below gives the combinations of the major packages Python, NumPy, SciPy, ASE and
GPAW that were tested:

| GPAW    | ASE     | Python | NumPy  | SciPy |
|:--------|:--------|:-------|:-------|:------|
| 20.1.0  | 3.19.3  | 3.8.7  | 1.18.5 | 1.5.4 |
| 20.10.0 | 3.20.1  | 3.9.4  | 1.19.5 | 1.5.4 |

Note: On some systems compiling SciPy 1.5.4 with NumPy 1.19.5 produced errors. On those
systems NumPy 1.18.5 was used.

Other configurations that were only tested on a limited number of clusters:

| GPAW    | ASE     | Python | NumPy  | SciPy |
|:--------|:--------|:-------|:-------|:------|
| 21.1.0  | 3.21.1  | 3.9.4  | 1.19.5 | 1.5.4 |


## Installing all prerequisites

We do not include the optimized mathematical libraries in the instructions (BLAS, LAPACK,
FFT library, ...) as these libraries should be standard on any optimized HPC system.
Also, the instructions below will need to be adapted to the specific
libraries that are being used.

Other prerequisites:
  * libxc
  * Python interpreter
  * Python package NumPy
  * Python package SciPy
  * Python package ase


### Installing libxc

  * Installing libxc requires GNU automake and GNU libtool besides GNU make and a
    C compiler. The build process is the usual GNU configure - make - make install
    cycle, but the `configure` script still needs to be generated with autoreconf.
  * Download libxc:
      * The latest version of libxc can be downloaded from
        [the libxc download page](https://www.tddft.org/programs/libxc/download/).
        However, that version may not be officially supported by GPAW.
      * It is also possible to download all recent versions of libxc from
        [the libxc GitLab](https://gitlab.com/libxc/libxc)
          * Select the tag corresponding to the version you want to download in the
            branch/tag selection box.
          * Then use the download button and select the desired file type.
          * Download URLs look like `https://gitlab.com/libxc/libxc/-/archive/4.3.4/libxc-4.3.4.tar.bz2`.
  * Untar the file in the build directory.

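
The download URL pattern and the autoreconf, configure, make, make install cycle described above can be scripted. The sketch below is one possible way to do so; the function names, the install prefix and the number of parallel make jobs are assumptions, and any additional `configure` options your system needs are deliberately left out.

```python
import subprocess


def libxc_archive_url(version):
    """GitLab archive URL for a libxc release tag, following the pattern above."""
    template = "https://gitlab.com/libxc/libxc/-/archive/{0}/libxc-{0}.tar.bz2"
    return template.format(version)


def build_commands(prefix, jobs=4):
    """Commands for the autoreconf, configure, make, make install cycle."""
    return [
        ["autoreconf", "-i"],                           # generate ./configure
        ["./configure", "--prefix={}".format(prefix)],
        ["make", "-j", str(jobs)],
        ["make", "install"],
    ]


def build_libxc(source_dir, prefix, jobs=4):
    """Run the build in the untarred source directory; stop at the first error."""
    for command in build_commands(prefix, jobs):
        subprocess.run(command, cwd=source_dir, check=True)
```

For example, `build_libxc("libxc-4.3.4", "/opt/libxc")` would build the untarred sources of version 4.3.4 and install them under the hypothetical prefix `/opt/libxc`.
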
### Installing Python from scratch

The easiest way to get Python on your system is to download an existing distribution
(one will likely already be installed on your system). Python itself does have a lot
of dependencies though, particularly in its standard library. Many of the
standard packages are never needed when executing the benchmark cases. Isolating them
to compile a Python with minimal dependencies is beyond the scope of these instructions. We did
compile Python without the necessary libraries for the standard libraries sqlite3
and tkinter (the latter needing Tcl/Tk).
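
To see which standard library modules ended up non-functional in a freshly built Python, one can simply try to import them. The sketch below does this for the modules that depend on the libraries mentioned in this document; the function name `missing_modules` is our own.

```python
import importlib


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Standard library modules and the external library each one relies on.
    optional = {
        "curses": "ncurses",
        "readline": "libreadline",
        "ctypes": "libffi",
        "zlib": "zlib",
        "ssl": "OpenSSL",
        "sqlite3": "SQLite",
        "tkinter": "Tcl/Tk",
    }
    for name in missing_modules(sorted(optional)):
        print("{} is not available (needs {})".format(name, optional[name]))
```

A missing `tkinter` is harmless for the benchmarks, while a missing `sqlite3` can make one of the GPAW tests fail.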

Even though GPAW contains a lot of Python code, the Python interpreter is not the main
performance-determining factor in the GPAW benchmark. Having a properly optimized installation
of NumPy, SciPy and GPAW itself proves much more important.


### Installing NumPy

  * As NumPy relies on optimized libraries for its performance, one should carefully
    select which NumPy package to download, or install NumPy from sources. How crucial
    this is depends on the version of GPAW and the options selected when building.
  * Given that GPAW also uses optimized libraries, it is generally advised to install
    NumPy from sources instead to ensure that the same libraries are used as will be
    used for GPAW to prevent conflicts between libraries that might otherwise occur.
  * In most cases, NumPy will need a `site.cfg` file to point to the optimized libraries.
    See the examples for various systems and the file `site.cfg.example` included in
    the NumPy sources.
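
As an illustration of what such a `site.cfg` may contain, the sketch below writes a minimal file with an `[openblas]` section. OpenBLAS and the prefix `/opt/openblas` are assumptions made for this example only; other libraries such as Intel MKL need a different section, so check `site.cfg.example` for the keys that apply to your system.

```python
import configparser
import io


def render_site_cfg(openblas_prefix):
    """Return the text of a minimal site.cfg with an [openblas] section."""
    config = configparser.ConfigParser()
    config["openblas"] = {
        "libraries": "openblas",
        "library_dirs": openblas_prefix + "/lib",
        "include_dirs": openblas_prefix + "/include",
        "runtime_library_dirs": openblas_prefix + "/lib",
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical install prefix; write the result next to NumPy's setup.py.
    with open("site.cfg", "w") as handle:
        handle.write(render_site_cfg("/opt/openblas"))
```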


### Installing SciPy

  * Just as NumPy, SciPy relies on optimized libraries for its performance. It should
    be installed after NumPy as it does get the information about which libraries to
    use from NumPy. Hence, when installing pre-built binaries, make sure they match
    the NumPy binaries used.
  * Just as is the case for NumPy, it may be better to install SciPy from sources.
    [Instructions for installing SciPy from source can be found on the SciPy GitHub
    site](https://github.com/scipy/scipy/blob/master/INSTALL.rst.txt).


### Installing ase

  * Just as for any user-installed Python package, make sure you have created a
    directory to install Python packages to and have added it to the front of PYTHONPATH.
  * ase is [available on PyPi](https://pypi.org/project/ase/). It is also possible
    to [see a list of previous releases](https://pypi.org/project/ase/#history).
  * The easiest way to install ase is using `pip`, which will automatically download
    the requested version.
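
A `pip` call for a specific ASE version can be assembled as sketched below. The function name and the install prefix are hypothetical. Using `--no-deps` is our assumption of a convenient default here: it keeps `pip` from pulling in matplotlib and its dependencies, which is only safe when NumPy and SciPy are already installed.

```python
import subprocess
import sys


def pip_install_command(package, version, prefix, with_dependencies=False):
    """Build a pip command that installs package==version below prefix."""
    command = [sys.executable, "-m", "pip", "install", "--prefix", prefix]
    if not with_dependencies:
        command.append("--no-deps")   # skip matplotlib and friends
    command.append("{}=={}".format(package, version))
    return command


if __name__ == "__main__":
    # ASE 3.19.3 is the version tested with GPAW 20.1.0; the prefix is made up.
    subprocess.run(pip_install_command("ase", "3.19.3", "/opt/gpaw"), check=True)
```

With `--prefix`, the packages end up in the `site-packages` directory below the prefix, which is the directory that has to be on `PYTHONPATH`.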


## Configuring and installing GPAW

  * GPAW 20.1.0 and later use `setuptools`. Customization of the installation process is possible
    through the `siteconfig.py` file.
  * It is possible to specify the FFT library in `siteconfig.py` or to simply select to
    use the NumPy FFT routines.
  * GPAW also needs a number of so-called "Atomic PAW Setup" files. The latest files
    can be found on the [GPAW website, Atomic PAW Setups page](https://wiki.fysik.dtu.dk/gpaw/setups/setups.html).
    For the testing we used [`gpaw-setups-0.9.20000.tar.gz`](https://wiki.fysik.dtu.dk/gpaw-files/gpaw-setups-0.9.20000.tar.gz)
    for all versions of GPAW. The easiest way to install these files is to simply untar
    the file and set the environment variable `GPAW_SETUP_PATH` to point to that directory.
    In the examples provided we use the `share/gpaw-setups` subdirectory of the install
    directory for this purpose.
  * GPAW 20.1.0 comes with a test suite which can be used after installation.
      * Running the sequential tests:

            gpaw test

        Help is available through

            gpaw test -h

      * Running those tests, but using multiple cores (e.g., 4):

            gpaw test -j 4

        We did observe that crashes that cause segmentation faults go unnoticed
        in this setup. They are not reported as failed tests.

      * Running the tests in parallel on a SLURM cluster depends on the version of GPAW.

          * Versions that build the parallel interpreter (19.8.1 and older):

                srun -n 4 gpaw-python -m gpaw test

          * Versions with the parallel so library using the regular Python interpreter (20.1.0 and above):

                srun -n 4 python -m gpaw test


      * Depending on the Python installation, some tests may fail with error messages that point
        to a package in the Standard Python Library that is not present. Some of these errors have no
        influence on the benchmarks as that part of the code is not triggered by the benchmark.
  * The full test suite is missing in GPAW 20.10.0 and later. There is a brief sequential test
    that can be run with

        gpaw test

    and a parallel one that can be run with

        gpaw -P 4 test

  * Multiple versions of GPAW likely contain a bug in `c/bmgs/fd.c` (around line 44
    in GPAW 20.1.0). The code enforces vectorization on OpenMP 4 compilers by using
    `#pragma omp simd`. However, it turns out that the data is not always correctly
    aligned, so if the reaction of the compiler to `#pragma omp simd` is to fully vectorize
    and use load/store instructions for aligned data, crashes may occur. It did happen
    during the benchmark development when compiling with the Intel C compiler. The
    solution for that compiler is to add `-qno-openmp-simd` to the compiler flags.
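
To make the customization points above more concrete, the following is a sketch of what a `siteconfig.py` could look like. The variable names (`libraries`, `library_dirs`, `include_dirs`, `extra_compile_args`, `fftw`) follow the siteconfig example shipped with the GPAW sources as far as we are aware, so verify them against the example file of your GPAW version. All paths and the `using_intel_compiler` switch are hypothetical, and the BLAS, LAPACK and ScaLAPACK settings are left out because they are system-specific.

```python
# Sketch of a siteconfig.py for a CPU-only build (all paths are hypothetical).
# GPAW's setup.py defines these lists before it executes siteconfig.py; the
# fallback below only exists so that this sketch also runs on its own.
try:
    libraries, library_dirs, include_dirs, extra_compile_args
except NameError:
    libraries, library_dirs, include_dirs, extra_compile_args = [], [], [], []

libxc_prefix = "/opt/libxc"        # hypothetical install prefix of libxc
fftw_prefix = "/opt/fftw"          # hypothetical install prefix of FFTW
using_intel_compiler = False       # hypothetical switch, not a GPAW option

# libxc
include_dirs += [libxc_prefix + "/include"]
library_dirs += [libxc_prefix + "/lib"]
if "xc" not in libraries:
    libraries.append("xc")

# FFT library: set fftw = False to fall back to the NumPy FFT routines.
fftw = True
if fftw:
    include_dirs += [fftw_prefix + "/include"]
    library_dirs += [fftw_prefix + "/lib"]
    libraries += ["fftw3"]

# Work around the alignment problem in c/bmgs/fd.c with the Intel C compiler.
if using_intel_compiler:
    extra_compile_args += ["-qno-openmp-simd"]
```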


## Problems observed during testing
  * On AMD Epyc systems, there seems to be a bug in the Intel MKL FFT libraries/FFTW
    wrappers in the 2020 compilers. Downgrading to the MKL libraries of the 2018
    compilers or using the FFTW libraries solves the problem.
    This has been observed not only in GPAW, but also in some other DFT packages.
  * The GPAW test code in version 20.1.0 detects if matplotlib is not installed
    and will skip the affected test. We did however observe a failed test when Python could not find
    the SQLite library, as the Python standard library package sqlite3 is used.