========================================================================
README file for PRACE Accelerator Benchmark Code PFARM (stage EXDIG, program RMX95)
========================================================================
Author: Andrew Sunderland (a.g.sunderland@stfc.ac.uk).

The code download should contain the following directories:
benchmark/RMX_HOST: RMX source files for running on the host or KNL (using LAPACK or MKL)
benchmark/RMX_MAGMA_GPU: RMX source files for running on GPUs using MAGMA
benchmark/lib:
benchmark/run: run directory with input files
benchmark/xdr: XDR library source files

The code uses the eXternal Data Representation (XDR) library for cross-platform
compatibility of unformatted data files. The XDR source files are provided with this
code bundle and can also be obtained from various sources, including
http://people.redhat.com/rjones/portablexdr/

----------------------------------------------------------------------------
* Install MAGMA (GPU version only)
Download MAGMA (current version magma-2.2.0) from http://icl.utk.edu/magma/
Install MAGMA: modify the make.inc file to indicate your C/C++ compiler, Fortran
compiler, and where CUDA, CPU BLAS and LAPACK are installed on your system.
Refer to the MAGMA documentation for further details.
----------------------------------------------------------------------------
* Install XDR
Build the XDR library. Update the DEFS file for your compiler and environment, then:
$> make
----------------------------------------------------------------------------
* Install RMX_HOST
Update the DEFS file for your setup, ensuring that you link to a LAPACK or MKL library.
This is usually done by e.g. compiling with -mkl=parallel (Intel compiler) or loading
the appropriate library modules.
$> cd RMX_HOST
$> make

* Install RMX_MAGMA_GPU
Set the MAGMADIR (and, if required, CUDADIR and OPENBLASDIR) environment variables,
update the Fortran compiler and flags in the DEFS file for your setup, then build the
RMX application:
$> cd RMX_MAGMA_GPU
$> make
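
As an illustration, a complete build of the benchmark might proceed as sketched below.
The module names, install paths and the use of vi are placeholders only and will differ
on your system; the DEFS edits are those described in the sections above.

$> module load intel cuda                  # example modules; adjust for your environment

$> cd benchmark/xdr                        # build the XDR library
$> vi DEFS                                 # set compiler and flags
$> make

$> cd ../RMX_HOST                          # host / KNL version, linked against MKL or LAPACK
$> vi DEFS                                 # e.g. add -mkl=parallel for the Intel compiler
$> make

$> export MAGMADIR=/path/to/magma-2.2.0    # GPU version: example paths, point at your installs
$> export CUDADIR=/path/to/cuda
$> export OPENBLASDIR=/path/to/openblas
$> cd ../RMX_MAGMA_GPU
$> vi DEFS                                 # set Fortran compiler and flags
$> make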
----------------------------------------------------------------------------
* Run RMX

The RMX application is run via the executable "rmx95".
For the FEIII dataset, the program requires the following input files to reside in the
same directory as the executable:
phzin.ctl
XJTARMOM
HXJ030

These files are located in benchmark/run.
A guide to each of the variables in the namelist in phzin.ctl can be found at:
https://hpcforge.org/plugins/mediawiki/wiki/pfarm/images/9/99/Phz_rep.pdf
However, it is recommended that these inputs are not changed for the benchmark code;
problem size, runtime etc. are controlled via the environment variables listed below.

A typical PBS script to run the RMX_HOST benchmark on 4 KNL nodes (4 MPI tasks with
64 threads per MPI task) is listed below. Settings will vary according to your local
environment.

#PBS -N rmx95_4x64
#PBS -l select=4
#PBS -l walltime=01:00:00
#PBS -A my_account_id

cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=64

aprun -N 1 -n 4 -d $OMP_NUM_THREADS ./rmx95

----------------------------------------------------------------------------

* Run-time environment variable settings

The following environment variables, which can for example be set inside the job script,
allow the dimensions of the H sector matrix and the number of sectors to be changed when
undertaking benchmarks. They can be adapted by the user to suit benchmark load
requirements, e.g. short vs long runs. Each MPI task picks up a sector calculation,
which is then distributed amongst the available threads on the node (for CPU and KNL)
or offloaded (for GPU). The distribution of sectors among MPI tasks is a simple
round-robin. An example job script that sets these variables is sketched at the end of
this README.

RMX_NGPU : the number of shared GPUs per node (RMX_MAGMA_GPU only)
RMX_NSECT_FINE : sets the number of sectors for the Fine region
RMX_NSECT_COARSE : sets the number of sectors for the Coarse region
RMX_NL_FINE : sets the number of basis functions for the Fine-region sector calculations
RMX_NL_COARSE : sets the number of basis functions for the Coarse-region sector calculations

Notes:
For a representative setup for the benchmark datasets:

RMX_NL_FINE can take values in the range 6:25
RMX_NL_COARSE can take values in the range 5:10
For accuracy reasons, RMX_NL_FINE should always be greater than RMX_NL_COARSE.
The following value pairs for RMX_NL_FINE and RMX_NL_COARSE provide representative
calculations:

12,6
14,8
16,10
18,10
20,10
25,10

If the RMX_NSECT and RMX_NL variables are not set, the benchmark code defaults to:
RMX_NSECT_FINE=5
RMX_NSECT_COARSE=20
RMX_NL_FINE=12
RMX_NL_COARSE=6

The Hamiltonian matrix dimension is output along with the wallclock time taken by each
individual DSYEVD call.

Performance is measured in wallclock time and is displayed on the screen or in the
output log at the end of the run.

----------------------------------------------------------------------------
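* Example job script

For reference, a job script for the RMX_HOST benchmark might combine the PBS header
shown earlier with the run-time variables above, as sketched below. The account name is
a placeholder, the RMX_NSECT values are the documented defaults, and the RMX_NL pair
(16,10) is one of the representative pairs listed above.

#PBS -N rmx95_4x64
#PBS -l select=4
#PBS -l walltime=01:00:00
#PBS -A my_account_id

cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=64

export RMX_NSECT_FINE=5
export RMX_NSECT_COARSE=20
export RMX_NL_FINE=16
export RMX_NL_COARSE=10

aprun -N 1 -n 4 -d $OMP_NUM_THREADS ./rmx95

----------------------------------------------------------------------------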