GPGPUs
======

GPAW has a separate CUDA version available for Nvidia GPGPUs. Nvidia has
released multiple versions of the CUDA toolkit. In this work, only CUDA 7.5
was tested, using Tesla K20, Tesla K40 and Tesla K80 cards.

Source code is available in GPAW's repository as a separate branch called
'cuda'. To obtain the code, use e.g. the following commands:
```
git clone https://gitlab.com/gpaw/gpaw.git
cd gpaw
git checkout cuda
```
or download it from: https://gitlab.com/gpaw/gpaw/tree/cuda

Alternatively, the source code is also available with minor modifications
required to work with newer versions of CUDA and Libxc (incl. example
installation settings) at:
  https://gitlab.com/atekin/gpaw-cuda.git

Patches needed to work with newer versions of CUDA, together with example
setup scripts for GPAW using dynamic links to Libxc, are also available
separately at:
  https://github.com/mlouhivu/gpaw-cuda-patches.git


Software requirements
---------------------

The CUDA branch of GPAW is based on version 0.9.1.13528 and has similar
software requirements to the main branch. To compile the code, one needs to
use the Intel compiler environment.

For example, the following versions are known to work:
* Intel compile environment with Intel MKL and Intel MPI (2015 update 3)
* Python (2.7.11)
* ASE (3.9.1)
* Libxc (3.0.0)
* CUDA (7.5)
* HDF5 (1.8.16)
* PyCUDA (2016.1.2)
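
To compare an existing environment against the list above, a small script
along the following lines may help. It is only a sketch: it probes the Python
packages (`ase`, `pycuda`), assumes that a module's version, where exposed, is
found in a `__version__` attribute (not every package provides one), and the
helper names `installed_version` and `report` are made up for this example.

```python
from __future__ import print_function

import importlib
import sys

# Python modules to probe, mapped to the versions listed as known to work.
EXPECTED = {
    'ase': '3.9.1',
    'pycuda': '2016.1.2',
}


def installed_version(module_name):
    """Return the module's __version__, 'unknown' if absent, None if not importable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, '__version__', 'unknown')


def report(expected=EXPECTED):
    """Map each module name to a (found_version, tested_version) pair."""
    return {name: (installed_version(name), wanted)
            for name, wanted in expected.items()}


if __name__ == '__main__':
    print('python found:', sys.version.split()[0], 'tested: 2.7.11')
    for name, (found, wanted) in sorted(report().items()):
        print(name, 'found:', found, 'tested:', wanted)
```

A result of `None` means the module could not be imported at all, while
`'unknown'` means it was imported but does not expose `__version__`.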


Install instructions
--------------------

Before installing the CUDA version of GPAW, compile the required packages
with the Intel compilers. Beyond that, there are two additional steps
compared to a standard installation:

1. Compile the CUDA files after preparing a suitable `make.inc` (in `c/cuda/`)
   by modifying the default options and paths to match your system. The
   following compiler options may offer a good starting point.

   ```make
   CC        = icc
   CCFLAGS   = $(CUGPAW_DEFS) -fPIC -std=c99 -m64 -O3
   NVCC      = nvcc -ccbin=icpc
   NVCCFLAGS = $(CUGPAW_DEFS) -O3 -arch=sm_20 -m64 --compiler-options '-fPIC -O3'
   ```

   To use a dynamic link to Libxc, please add a corresponding include flag to
   the CUGPAW_INCLUDES (e.g. `-I/path/to/libxc/include`) and a

   You may also need to add include flags for MKL and Libxc to
   CUGPAW_INCLUDES (e.g. `-I/path/to/mkl/include`).

   After making the necessary changes, run `make` in the `c/cuda`
   directory.

2. Edit your GPAW setup script (`customize.py`) to add the correct link and
   compile options for CUDA. The relevant lines are, for example:

   ```python
   define_macros += [('GPAW_CUDA', '1')]
   libraries += [
           'gpaw-cuda',
           'cublas',
           'cudart',
           'stdc++'
   ]
   library_dirs += [
           './c/cuda',
           '/path/to/cuda/lib64'
   ]
   include_dirs += [
           '/path/to/cuda/include'
   ]
   ```
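
Before starting the build, it can be useful to confirm that the directories
given in `library_dirs` actually contain the libraries named in `libraries`.
The following is a minimal sketch of such a check, not part of GPAW: the
helper `missing_libraries` is hypothetical, and it assumes the usual Linux
naming scheme (`lib<name>.so*` or `lib<name>.a`). The paths are the same
placeholders as in the `customize.py` excerpt above.

```python
from __future__ import print_function

import glob
import os


def missing_libraries(libraries, library_dirs):
    """Return the names for which no lib<name>.so* or lib<name>.a file is found."""
    missing = []
    for name in libraries:
        patterns = []
        for directory in library_dirs:
            patterns.append(os.path.join(directory, 'lib%s.so*' % name))
            patterns.append(os.path.join(directory, 'lib%s.a' % name))
        # A library counts as found if any pattern matches at least one file.
        if not any(glob.glob(pattern) for pattern in patterns):
            missing.append(name)
    return missing


if __name__ == '__main__':
    # Placeholder paths; replace them with the ones used in customize.py.
    libraries = ['gpaw-cuda', 'cublas', 'cudart']
    library_dirs = ['./c/cuda', '/path/to/cuda/lib64']
    for name in missing_libraries(libraries, library_dirs):
        print('not found in library_dirs:', name)
```

`stdc++` is left out of the check on purpose, since it normally comes from
the compiler's default search path rather than from `library_dirs`.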