
Xeon Phi MICs
=============

Intel's MIC architecture includes two distinct generations of processors:
the first-generation Knights Corner (KNC) and the second-generation Knights
Landing (KNL). KNCs require a dedicated offload version of GPAW, whereas KNLs
run standard GPAW.


KNC (Knights Corner)
--------------------

For KNCs, GPAW has adopted an offload-to-the-MIC-coprocessor approach similar
to the one used for GPGPUs. The offload version of GPAW uses the stream-based
offload module pyMIC (https://github.com/01org/pyMIC) to offload
computationally intensive matrix calculations to the MIC coprocessors.

The source code is available in GPAW's repository as a separate branch called
`mic`. To obtain the code, use for example the following commands:
```bash
git clone https://gitlab.com/gpaw/gpaw.git
cd gpaw
git checkout mic
```
or download it from: https://gitlab.com/gpaw/gpaw/tree/mic

A ready-to-use install package with examples and instructions is also
available at:
  https://github.com/mlouhivu/gpaw-mic-install-pack.git


### Software requirements

The offload version of GPAW is roughly equivalent to GPAW 0.11.0 and
therefore has similar software and version requirements.

For example, the following versions are known to work:
* Python (2.7.x)
* ASE (3.9.1)
* NumPy (1.9.2)
* Libxc (2.1.x)

In addition, pyMIC requires:
* Intel compile environment with Intel MKL and Intel MPI
* Intel MPSS (Manycore Platform Software Stack)
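
As a quick sanity check of an existing environment, the installed Python
packages can be compared against the versions listed above. The following is
only a minimal sketch: the helper names `same_series` and `report` are
hypothetical, and treating any release in the same major.minor series as
"similar" is an assumption of this example, not a statement about GPAW's
actual compatibility.

```python
from __future__ import print_function

import importlib
import sys

# Versions listed above as known to work (importable module name -> version).
KNOWN_TO_WORK = {'ase': '3.9.1', 'numpy': '1.9.2'}


def same_series(installed, reference):
    """True if both versions share major.minor (an assumption of this sketch)."""
    return installed.split('.')[:2] == reference.split('.')[:2]


def report(known=KNOWN_TO_WORK):
    """Print each installed version next to the known-good one."""
    print('python', sys.version.split()[0], '(known to work: 2.7.x)')
    for name in sorted(known):
        try:
            module = importlib.import_module(name)
        except ImportError:
            print(name, 'is not installed')
            continue
        installed = getattr(module, '__version__', 'unknown')
        note = 'same series' if same_series(installed, known[name]) else 'differs'
        print(name, installed, '(known to work: {0})'.format(known[name]), note)


if __name__ == '__main__':
    report()
```

Libxc and the Intel tools are not Python modules, so they are not covered by
this check.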

### Install instructions

Besides using the Intel compilers, the installation differs from the standard
procedure in three steps:

1. Compile and install NumPy with a suitable `site.cfg` to use MKL, e.g.

   ```ini
   [mkl]
   library_dirs = /path/to/mkl/lib/intel64
   include_dirs = /path/to/mkl/include
   lapack_libs =
   mkl_libs = mkl_rt
   ```

2. Compile and install [pyMIC](https://github.com/01org/pyMIC) before GPAW.

3. Edit your GPAW setup script (`customize.py`) to add the correct compile
   and link options for offloading. The relevant lines are, for example:

   ```python
   # offload to KNC
   extra_compile_args += ['-qoffload-option,mic,compiler,"-qopenmp"']
   extra_compile_args += ['-qopt-report-phase=offload']

   # linker settings for MKL on KNC
   mic_mkl_lib = '/path/to/mkl/lib/mic/'
   extra_link_args += ['-offload-option,mic,link,"-L' + mic_mkl_lib
           + ' -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lpthread"']
   ```
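
To avoid hard-coding the MKL path in step 3, the link option could be
assembled from an environment variable. The sketch below is one possible way
to do this: the helper `knc_link_option` is hypothetical, and it assumes that
`MKLROOT` is set by the Intel environment scripts and that the MIC-side
libraries are located under `$MKLROOT/lib/mic`. Verify both on your system.

```python
import os


def knc_link_option(mic_mkl_lib):
    """Build the offload link option from step 3 for a MIC-side MKL directory."""
    libs = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'pthread']
    flags = ' '.join('-l' + lib for lib in libs)
    return '-offload-option,mic,link,"-L{0} {1}"'.format(mic_mkl_lib, flags)


# Assumption: MKLROOT points to the MKL installation; if it is not set,
# fall back to the placeholder path used in step 3.
mkl_root = os.environ.get('MKLROOT')
if mkl_root:
    mic_mkl_lib = os.path.join(mkl_root, 'lib', 'mic')
else:
    mic_mkl_lib = '/path/to/mkl/lib/mic/'

# In a real customize.py this list already exists; it is defined here only
# so that the sketch runs on its own.
extra_link_args = []
extra_link_args += [knc_link_option(mic_mkl_lib)]
```

With the placeholder path, the generated option is the same string as the one
written out by hand in step 3.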


KNL (Knights Landing)
---------------------

For KNLs, one can use the standard version of GPAW, instead of the offload
version used for KNCs. Please refer to the generic installation instructions
for GPAW.

### Software requirements
  - https://wiki.fysik.dtu.dk/gpaw/install.html

### Install instructions
  - https://wiki.fysik.dtu.dk/gpaw/install.html
  - https://wiki.fysik.dtu.dk/gpaw/platforms/platforms.html

It is advisable to use the Intel compile environment with Intel MKL and Intel
MPI to take advantage of their KNL optimisations. To enable the AVX-512 vector
instruction sets supported by KNLs, pass the compiler option `-xMIC-AVX512`
when building GPAW.
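
As an illustration, the option can be added in `customize.py` in the same way
as the KNC options earlier in this document. Because code built with
`-xMIC-AVX512` targets KNL specifically, it can also be useful to confirm that
a node reports AVX-512 support before running there. The sketch below assumes
a Linux system with `/proc/cpuinfo`; the helper names `avx512_flags` and
`read_cpuinfo` are hypothetical.

```python
def avx512_flags(cpuinfo_text):
    """Return the sorted AVX-512 feature flags found in cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith('flags') and ':' in line:
            flags = line.split(':', 1)[1].split()
            return sorted(f for f in flags if f.startswith('avx512'))
    return []


def read_cpuinfo(path='/proc/cpuinfo'):
    """Read the CPU description; return '' where the file is unavailable."""
    try:
        with open(path) as f:
            return f.read()
    except IOError:
        return ''


# In a real customize.py this list already exists; it is defined here only
# so that the sketch runs on its own.
extra_compile_args = []
extra_compile_args += ['-xMIC-AVX512']

if __name__ == '__main__':
    # On a KNL node one would expect avx512f to appear in this list.
    print(avx512_flags(read_cpuinfo()))
```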

To improve performance, one may also link to Intel TBB to benefit from an
optimised memory allocator (tbbmalloc). This can be done during installation
or at run-time by setting the environment variable `LD_PRELOAD` to point to
the correct libraries, for example:
```bash
export LD_PRELOAD=$TBBROOT/lib/intel64/gcc4.7/libtbbmalloc_proxy.so.2
export LD_PRELOAD=$LD_PRELOAD:$TBBROOT/lib/intel64/gcc4.7/libtbbmalloc.so.2
```

It may also be beneficial to use huge pages together with tbbmalloc
(`export TBB_MALLOC_USE_HUGE_PAGES=1`).
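
When jobs are launched from a Python driver script rather than a shell, the
same two settings can be prepared as an environment dictionary. This is only
a sketch: the helper `tbbmalloc_env` is hypothetical, the `gcc4.7` library
subdirectory is taken from the example above and may differ between TBB
versions, and, unlike the shell example, an already set `LD_PRELOAD` is kept
and placed first.

```python
import os


def tbbmalloc_env(tbbroot, base_env=None, huge_pages=True,
                  subdir='lib/intel64/gcc4.7'):
    """Return a copy of the environment with tbbmalloc preloaded."""
    env = dict(os.environ if base_env is None else base_env)
    libdir = os.path.join(tbbroot, subdir)
    preload = [os.path.join(libdir, 'libtbbmalloc_proxy.so.2'),
               os.path.join(libdir, 'libtbbmalloc.so.2')]
    # Keep anything that is already preloaded ahead of the TBB libraries.
    if env.get('LD_PRELOAD'):
        preload.insert(0, env['LD_PRELOAD'])
    env['LD_PRELOAD'] = ':'.join(preload)
    if huge_pages:
        env['TBB_MALLOC_USE_HUGE_PAGES'] = '1'
    return env


# Possible usage (the command line is a placeholder for your own launcher):
#   import subprocess
#   subprocess.call(['python', 'my_calculation.py'],
#                   env=tbbmalloc_env(os.environ['TBBROOT']))
```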