========================================================================
README file for PRACE Accelerator Benchmark Code PFARM (stage EXDIG, program RMX95)
========================================================================
Author: Andrew Sunderland (a.g.sunderland@stfc.ac.uk).

The code download should contain the following directories:
benchmark/RMX_HOST: RMX source files for running on the host or KNL (using LAPACK or MKL)
benchmark/RMX_MAGMA_GPU: RMX source files for running on GPUs using MAGMA
benchmark/lib:
benchmark/run: run directory with input files
benchmark/xdr: XDR library source files

The code uses the eXternal Data Representation (XDR) library for cross-platform
compatibility of unformatted data files. The XDR source files are provided with this
code bundle and can also be obtained from various sources, including
http://people.redhat.com/rjones/portablexdr/

----------------------------------------------------------------------------
* Install MAGMA (GPU version only)
Download MAGMA (current version magma-2.2.0) from http://icl.utk.edu/magma/
Install MAGMA: modify the make.inc file to indicate your C/C++ compiler, Fortran
compiler, and where CUDA, CPU BLAS and LAPACK are installed on your system.
Refer to the MAGMA documentation for further details.
----------------------------------------------------------------------------
* Install XDR
Build the XDR library. Update the DEFS file for your compiler and environment, then:
$> make
----------------------------------------------------------------------------
* Install RMX_HOST
Update the DEFS file for your setup, ensuring that you link to a LAPACK or MKL library.
This is usually done by e.g. compiling with -mkl=parallel (Intel compiler) or loading
the appropriate library modules.
$> cd RMX_HOST
$> make

* Install RMX_MAGMA_GPU
Set the MAGMADIR (and, if required, CUDADIR and OPENBLASDIR) environment variables,
update the Fortran compiler and flags in the DEFS file for your setup, then build the
RMX application:
$> cd RMX_MAGMA_GPU
$> make
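
As an illustration, a complete build of the benchmark might proceed as sketched below.
The module names, install paths and the use of vi are placeholders only and will differ
on your system; the DEFS edits are those described in the sections above.

$> module load intel cuda                  # example modules; adjust for your environment

$> cd benchmark/xdr                        # build the XDR library
$> vi DEFS                                 # set compiler and flags
$> make

$> cd ../RMX_HOST                          # host / KNL version, linked against MKL or LAPACK
$> vi DEFS                                 # e.g. add -mkl=parallel for the Intel compiler
$> make

$> export MAGMADIR=/path/to/magma-2.2.0    # GPU version: example paths, point at your installs
$> export CUDADIR=/path/to/cuda
$> export OPENBLASDIR=/path/to/openblas
$> cd ../RMX_MAGMA_GPU
$> vi DEFS                                 # set Fortran compiler and flags
$> make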
----------------------------------------------------------------------------
* Run RMX

The RMX application is run via the executable "rmx95".
For the FEIII dataset, the program requires the following input files to reside in the
same directory as the executable:
phzin.ctl
XJTARMOM
HXJ030

These files are located in benchmark/run.
A guide to each of the variables in the namelist in phzin.ctl can be found at:
https://hpcforge.org/plugins/mediawiki/wiki/pfarm/images/9/99/Phz_rep.pdf
However, it is recommended that these inputs are not changed for the benchmark code;
problem size, runtime etc. are controlled via the environment variables listed below.

A typical PBS script to run the RMX_HOST benchmark on 4 KNL nodes (4 MPI tasks with
64 threads per MPI task) is listed below. Settings will vary according to your local
environment.

#PBS -N rmx95_4x64
#PBS -l select=4
#PBS -l walltime=01:00:00
#PBS -A my_account_id

cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=64

aprun -N 1 -n 4 -d $OMP_NUM_THREADS ./rmx95

----------------------------------------------------------------------------

* Run-time environment variable settings

The following environment variables, which can for example be set inside the job script,
allow the dimensions of the H sector matrix and the number of sectors to be changed when
undertaking benchmarks. They can be adapted by the user to suit benchmark load
requirements, e.g. short vs long runs. Each MPI task picks up a sector calculation,
which is then distributed amongst the available threads on the node (for CPU and KNL)
or offloaded (for GPU). The distribution of sectors among MPI tasks is a simple
round-robin. An example job script that sets these variables is sketched at the end of
this README.

RMX_NGPU : the number of shared GPUs per node (RMX_MAGMA_GPU only)
RMX_NSECT_FINE : sets the number of sectors for the Fine region
RMX_NSECT_COARSE : sets the number of sectors for the Coarse region
RMX_NL_FINE : sets the number of basis functions for the Fine-region sector calculations
RMX_NL_COARSE : sets the number of basis functions for the Coarse-region sector calculations

Notes:
For a representative setup for the benchmark datasets:

RMX_NL_FINE can take values in the range 6:25
RMX_NL_COARSE can take values in the range 5:10
For accuracy reasons, RMX_NL_FINE should always be greater than RMX_NL_COARSE.
The following value pairs for RMX_NL_FINE and RMX_NL_COARSE provide representative
calculations:

12,6
14,8
16,10
18,10
20,10
25,10

If the RMX_NSECT and RMX_NL variables are not set, the benchmark code defaults to:
RMX_NSECT_FINE=5
RMX_NSECT_COARSE=20
RMX_NL_FINE=12
RMX_NL_COARSE=6

The Hamiltonian matrix dimension is output along with the wallclock time taken by each
individual DSYEVD call.

Performance is measured in wallclock time and is displayed on the screen or in the
output log at the end of the run.

----------------------------------------------------------------------------
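* Example job script

For reference, a job script for the RMX_HOST benchmark might combine the PBS header
shown earlier with the run-time variables above, as sketched below. The account name is
a placeholder, the RMX_NSECT values are the documented defaults, and the RMX_NL pair
(16,10) is one of the representative pairs listed above.

#PBS -N rmx95_4x64
#PBS -l select=4
#PBS -l walltime=01:00:00
#PBS -A my_account_id

cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=64

export RMX_NSECT_FINE=5
export RMX_NSECT_COARSE=20
export RMX_NL_FINE=16
export RMX_NL_COARSE=10

aprun -N 1 -n 4 -d $OMP_NUM_THREADS ./rmx95

----------------------------------------------------------------------------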