===========================================================
Build instructions for PRACE Accelerator Benchmark for GPAW
===========================================================

GPAW is a density-functional theory (DFT) program for ab initio electronic
structure calculations using the projector augmented wave method. It is
written mostly in Python and uses MPI for parallelisation.

GPAW is licensed under GPL and is freely available at:
  https://wiki.fysik.dtu.dk/gpaw/
  https://gitlab.com/gpaw/gpaw

Generic installation instructions can be found at:
  https://wiki.fysik.dtu.dk/gpaw/install.html

For platform specific examples, please refer to:
  https://wiki.fysik.dtu.dk/gpaw/platforms/platforms.html

Accelerator specific instructions and requirements are given in more detail
below for each architecture.

GPGPUs
======

GPAW has a separate CUDA version available for Nvidia GPGPUs. Nvidia has
released multiple versions of the CUDA toolkit. In this work, only CUDA 7.5
was tested, with Tesla K20, Tesla K40 and Tesla K80 cards.

Source code is available in GPAW's repository as a separate branch called
'cuda'. To obtain the code, use e.g. the following commands:
  git clone https://gitlab.com/gpaw/gpaw.git
  cd gpaw
  git checkout cuda
or download it from:
  https://gitlab.com/gpaw/gpaw/tree/cuda

Alternatively, the source code is also available, with the minor
modifications required to work with newer versions of CUDA and Libxc
(incl. example installation settings), at:
  https://gitlab.com/atekin/gpaw-cuda.git

Patches needed to work with newer versions of CUDA and example setup scripts
for GPAW using dynamic links to Libxc are also available separately at:
  https://github.com/mlouhivu/gpaw-cuda-patches.git

Software requirements
---------------------

The CUDA branch of GPAW is based on version 0.9.1.13528 and has similar
software requirements to the main branch. To compile the code, one needs to
use the Intel compile environment.
For example, the following versions are known to work:
* Intel compile environment with Intel MKL and Intel MPI (2015 update 3)
* Python (2.7.11)
* ASE (3.9.1)
* Libxc (3.0.0)
* CUDA (7.5)
* HDF5 (1.8.16)
* Pycuda (2016.1.2)

Install instructions
--------------------

Before installing the CUDA version of GPAW, the required packages should be
compiled using Intel compilers. In addition to using Intel compilers, there
are two additional steps compared to a standard installation:

1) Compile the CUDA files after preparing a suitable make.inc (in c/cuda/)
   by modifying the default options and paths to match your system. The
   following compiler options may offer a good starting point.

"""
CC = icc
CCFLAGS = $(CUGPAW_DEFS) -fPIC -std=c99 -m64 -O3
NVCC = nvcc -ccbin=icpc
NVCCFLAGS = $(CUGPAW_DEFS) -O3 -arch=sm_20 -m64 --compiler-options '-fPIC -O3'
"""

   To use a dynamic link to Libxc, please add a corresponding include flag
   to CUGPAW_INCLUDES (e.g. '-I/path/to/libxc/include').

   You may also need to add further include flags for MKL and Libxc to
   CUGPAW_INCLUDES (e.g. '-I/path/to/mkl/include').

   After making the necessary changes, simply run make (in the c/cuda path).

2) Edit your GPAW setup script (customize.py) to add the correct link and
   compile options for CUDA. The relevant lines are e.g.:

"""
define_macros += [('GPAW_CUDA', '1')]
libraries += ['gpaw-cuda', 'cublas', 'cudart', 'stdc++']
library_dirs += ['./c/cuda', '/path/to/cuda/lib64']
include_dirs += ['/path/to/cuda/include']
"""

Xeon Phi MICs
=============

Intel's MIC architecture currently has two distinct generations of
processors: the 1st generation Knights Corner (KNC) and the 2nd generation
Knights Landing (KNL). KNCs require a specific offload version of GPAW,
whereas KNLs use standard GPAW.

--------------------
KNC (Knights Corner)
--------------------

For KNCs, GPAW has adopted an offload-to-the-MIC-co-processor approach
similar to GPGPUs.
The offload version of GPAW uses the stream-based offload module pyMIC
(https://github.com/01org/pyMIC) to offload computationally intensive matrix
calculations to the MIC co-processors.

Source code is available in GPAW's repository as a separate branch called
'mic'. To obtain the code, use e.g. the following commands:
  git clone https://gitlab.com/gpaw/gpaw.git
  cd gpaw
  git checkout mic
or download it from:
  https://gitlab.com/gpaw/gpaw/tree/mic

A ready-to-use install package with examples and instructions is also
available at:
  https://github.com/mlouhivu/gpaw-mic-install-pack.git

Software requirements
---------------------

The offload version of GPAW is roughly equivalent to version 0.11.0 of GPAW
and thus has similar requirements (for software and versions).

For example, the following versions are known to work:
* Python (2.7.x)
* ASE (3.9.1)
* NumPy (1.9.2)
* Libxc (2.1.x)

In addition, pyMIC requires:
* Intel compile environment with Intel MKL and Intel MPI
* Intel MPSS (Manycore Platform Software Stack)

Install instructions
--------------------

In addition to using Intel compilers, there are three additional steps
compared to a standard installation:

1) Compile and install NumPy with a suitable site.cfg to use MKL, e.g.

"""
[mkl]
library_dirs = /path/to/mkl/lib/intel64
include_dirs = /path/to/mkl/include
lapack_libs =
mkl_libs = mkl_rt
"""

2) Compile and install pyMIC before GPAW.

3) Edit your GPAW setup script (customize.py) to add the correct link and
   compile options for offloading.
   The relevant lines are e.g.:

"""
# offload to KNC
extra_compile_args += ['-qoffload-option,mic,compiler,"-qopenmp"']
extra_compile_args += ['-qopt-report-phase=offload']

# linker settings for MKL on KNC
mic_mkl_lib = '/path/to/mkl/lib/mic/'
extra_link_args += ['-offload-option,mic,link,"-L' + mic_mkl_lib
    + ' -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lpthread"']
"""

---------------------
KNL (Knights Landing)
---------------------

For KNLs, one can use the standard version of GPAW instead of the offload
version used for KNCs. Please refer to the generic installation instructions
for GPAW.

Software requirements
---------------------

  https://wiki.fysik.dtu.dk/gpaw/install.html

Install instructions
--------------------

  https://wiki.fysik.dtu.dk/gpaw/install.html
  https://wiki.fysik.dtu.dk/gpaw/platforms/platforms.html

It is advisable to use the Intel compile environment with Intel MKL and
Intel MPI to take advantage of their KNL optimisations. To enable the
AVX-512 vector instructions supported by KNLs, one needs to use the compiler
option '-xMIC-AVX512' when installing GPAW.

To improve performance, one may also link to Intel TBB to benefit from an
optimised memory allocator (tbbmalloc). This can be done during installation
or at run-time by setting the environment variable LD_PRELOAD to point to
the correct libraries, for example:
  export LD_PRELOAD=$TBBROOT/lib/intel64/gcc4.7/libtbbmalloc_proxy.so.2
  export LD_PRELOAD=$LD_PRELOAD:$TBBROOT/lib/intel64/gcc4.7/libtbbmalloc.so.2

It may also be beneficial to use hugepages together with tbbmalloc
(export TBB_MALLOC_USE_HUGE_PAGES=1).
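
The run-time tbbmalloc settings above can also be assembled programmatically,
for instance when generating job scripts. The following is a minimal sketch
only: the helper name tbbmalloc_env, its parameters and the example path
/opt/intel/tbb are hypothetical and not part of GPAW or TBB, and the library
directory (here the gcc4.7 one from the example above) depends on your TBB
installation.

```python
import posixpath


def tbbmalloc_env(tbbroot, libdir="lib/intel64/gcc4.7", hugepages=True):
    """Return the tbbmalloc-related environment variables as a dict.

    Hypothetical helper for illustration; `libdir` is an assumption taken
    from the example paths above and must match the local TBB layout.
    """
    # Same two libraries, in the same order, as in the export lines above.
    libs = ["libtbbmalloc_proxy.so.2", "libtbbmalloc.so.2"]
    preload = ":".join(posixpath.join(tbbroot, libdir, lib) for lib in libs)
    env = {"LD_PRELOAD": preload}
    if hugepages:
        # Optional: let tbbmalloc use hugepages.
        env["TBB_MALLOC_USE_HUGE_PAGES"] = "1"
    return env


if __name__ == "__main__":
    # Print shell export lines for a hypothetical TBB install location.
    for key, value in tbbmalloc_env("/opt/intel/tbb").items():
        print("export {0}={1}".format(key, value))
```

The returned dictionary could, for example, be merged into a copy of
os.environ and passed as the env argument of subprocess calls that launch
GPAW, so that the preload only affects the benchmark run.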