Updated README files

Commit 622d7a98 authored 4 years ago by Andrew Sunderland
--- a/pfarm/PFARM_Build_Run_README.txt
+++ b/pfarm/PFARM_Build_Run_README.txt
@@ -99,13 +99,16 @@ located in data/test_case_2_mol:
 phzin.ctl
 H

-
-A guide to each of the variables in the namelist in phzin.ctl can be found at:
-https://hpcforge.org/plugins/mediawiki/wiki/pfarm/images/9/99/Phz_rep.pdf
-However, it is recommended that these inputs are not changed for the benchmark runs and
+It is recommended that the settings in the input file phzin.ctl are not changed for the benchmark runs and
 problem size, runtime etc, are better controlled via the environment variables listed below.

-Example job scripts for cpu / gpu / atomic and molecular cases are provided in the directories
+To set up run directories with the correct executables and data files, bash script files are provided:
+cpu/setup_run_cpu_atom.scr
+cpu/setup_run_cpu_mol.scr
+gpu/setup_run_gpu_atom.scr
+gpu/setup_run_gpu_mol.scr
+
+Example submission job scripts for cpu / gpu / atomic and molecular cases are provided in the directories
 cpu/example_job_scripts
 gpu/example_job_scripts

@@ -162,7 +165,21 @@ The Hamiltonian matrix dimension will be output along
 with the Wallclock time it takes to do each individual DSYEVD (eigensolver) call.

 Performance is measured in Wallclock time and is displayed 
-on the screen or output log at the end of the run. 
+on the screen or output log at the end of the run.
+
+For the atomic dataset, grep the output file for 'Sector 16:'
+The output should match the values below.
+
+    Mesh 1, Sector 16: first five eigenvalues =   -4329.72      -4170.91      -4157.31      -4100.98      -4082.11
+    Mesh 1, Sector 16: final five eigenvalues =    4100.98       4157.31       4170.91       4329.72       4370.54
+    Mesh 2, Sector 16: first five eigenvalues =   -313.631      -301.010      -298.882      -293.393      -290.619
+    Mesh 2, Sector 16: final five eigenvalues =    290.619       293.393       298.882       301.010       313.631
+    
+For the molecular dataset, grep the output file for 'Sector 64:'
+The output should match the values below.
+
+    Mesh 1, Sector 64: first five eigenvalues =   -3850.84      -3593.98      -3483.83      -3466.73      -3465.72
+    Mesh 1, Sector 64: final five eigenvalues =    3465.72       3466.73       3483.83       3593.99       3850.84


 ----------------------------------------------------------------------------
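The eigenvalue check described in the PFARM_Build_Run_README.txt change above can be scripted. The following is a minimal sketch for the atomic dataset, not part of the PFARM distribution: the helper names `normalise` and `check_sector`, the reference file name `ref_atom_sector16.txt` and the log file name `rmx_atom.out` are all assumptions made for illustration. The reference lines are copied from the README text, and runs of spaces are squeezed so that differences in column padding do not cause a false mismatch.

```shell
#!/bin/sh
# Reference lines copied from the README text (atomic dataset, Sector 16).
cat > ref_atom_sector16.txt <<'EOF'
Mesh 1, Sector 16: first five eigenvalues =   -4329.72      -4170.91      -4157.31      -4100.98      -4082.11
Mesh 1, Sector 16: final five eigenvalues =    4100.98       4157.31       4170.91       4329.72       4370.54
Mesh 2, Sector 16: first five eigenvalues =   -313.631      -301.010      -298.882      -293.393      -290.619
Mesh 2, Sector 16: final five eigenvalues =    290.619       293.393       298.882       301.010       313.631
EOF

# Strip leading/trailing spaces and squeeze repeated spaces.
normalise() { sed 's/^ *//; s/ *$//' | tr -s ' '; }

# Hypothetical helper: compare the 'Sector N:' lines of a log with a reference file.
# Usage: check_sector <log file> <sector number> <reference file>
check_sector() {
    log="$1"; sector="$2"; ref="$3"
    got=$(grep "Sector ${sector}:" "$log" | normalise)
    want=$(normalise < "$ref")
    if [ "$got" = "$want" ]; then
        echo "Sector ${sector}: eigenvalues match the reference"
    else
        echo "Sector ${sector}: eigenvalues differ from the reference" >&2
        return 1
    fi
}

# rmx_atom.out is an assumed log file name; substitute the output of your own run.
if [ -f rmx_atom.out ]; then
    check_sector rmx_atom.out 16 ref_atom_sector16.txt
fi
```

The same helper could be reused for the molecular dataset by writing the two 'Sector 64:' reference lines to a second file and passing 64 as the sector number. The comparison is an exact string match on the printed digits, so it will flag any difference in the last printed figure.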
--- a/pfarm/README.md
+++ b/pfarm/README.md
@@ -7,10 +7,12 @@ PFARM is part of a suite of programs based on the ‘R-matrix’ ab-initio appro
 In this README we give information relevant for its use in the UEABS.

 ### Standard CPU version
-The PFARM outer-region application code EXDIG is domi-nated by the assembly of sector Hamiltonian matrices and their subsequent eigensolutions. The code is written in Fortran 2003 (or Fortran 2003-compliant Fortran 95), is parallelised using MPI and OpenMP and is designed to take advantage of highly optimised, numerical library routines. Hybrid MPI / OpenMP parallelisation has also been introduced into the code via shared memory enabled numerical library kernels. 
+The PFARM outer-region application code EXDIG is dominated by the assembly of sector Hamiltonian matrices and their subsequent eigensolutions. The code is written in Fortran 2003 (or Fortran 2003-compliant Fortran 95), is parallelised using MPI and OpenMP and is designed to take advantage of highly optimised, numerical library routines. Hybrid MPI / OpenMP parallelisation has also been introduced into the code via shared memory enabled numerical library kernels. 

 ### GPU version
-Accelerator-based implementations have been implemented for EXDIG, using off-loading (MKL or CuBLAS/CuSolver) for the standard (dense) eigensolver calculations that dominate overall run-time.
+Accelerator-based NVIDIA GPU versions of the code use the MAGMA library for eigensolver calculations.

+### Configure, Build and Run Instructions
+See PFARM_Build_Run_README.txt


--- a/pfarm/README_ACC.md
+++ b/pfarm/README_ACC.md
@@ -2,143 +2,21 @@ README file for PRACE Accelerator Benchmark Code PFARM (stage EXDIG, program RMX
 ===================================================================================
 Author: Andrew Sunderland (a.g.sunderland@stfc.ac.uk).

-The [code download](https://www.dropbox.com/sh/dlcpzr934r0wazy/AABlphkgEn9tgRlwHY2k3lqBa?dl=0
-) should contain the following directories:
+# PFARM in the United European Applications Benchmark Suite (UEABS)
+## Document Author: Andrew Sunderland (andrew.sunderland@stfc.ac.uk), STFC, UK.


-```
-benchmark/RMX_HOST: RMX source files for running on Host or KNL (using LAPACK or MKL)
-benchmark/RMX_MAGMA_GPU: RMX source for running on GPUs using MAGMA
-benchmark/lib: 
-benchmark/run: run directory with input files
-benchmark/xdr: XDR library src files
-```
+## Introduction
+PFARM is part of a suite of programs based on the ‘R-matrix’ ab-initio approach to the variational solution of the many-electron Schrödinger equation for electron-atom and electron-ion scattering. The package has been used to calculate electron collision data for astrophysical applications (such as: the interstellar medium, planetary atmospheres) with, for example, various ions of Fe and Ni and neutral O, plus other applications such as data for plasma modelling and fusion reactor impurities. The code has recently been adapted to form a compatible interface with the UKRmol suite of codes for electron (positron) molecule collisions thus enabling large-scale parallel ‘outer-region’ calculations for molecular systems as well as atomic systems. 
+In this README we give information relevant for its use in the UEABS.

-The code uses the eXternal Data Representation library (XDR) for cross-platform
-compatibility of unformatted data files. The XDR source files are provided with this code bundle.
-It can be obtained from various sources, including
-http://people.redhat.com/rjones/portablexdr/
+### Standard CPU version
+The PFARM outer-region application code EXDIG is dominated by the assembly of sector Hamiltonian matrices and their subsequent eigensolutions. The code is written in Fortran 2003 (or Fortran 2003-compliant Fortran 95), is parallelised using MPI and OpenMP and is designed to take advantage of highly optimised, numerical library routines. Hybrid MPI / OpenMP parallelisation has also been introduced into the code via shared memory enabled numerical library kernels. 

+### GPU version
+Accelerator-based NVIDIA GPU versions of the code use the MAGMA library for eigensolver calculations.

-Compilation
-***********
-Installing MAGMA (GPU Only)
---------------------------
-Download MAGMA (current version magma-2.2.0)  from http://icl.utk.edu/magma/
-Install MAGMA : Modify the make.inc file to indicate your C/C++
-compiler, Fortran compiler, and where CUDA, CPU BLAS, and 
-LAPACK are installed on your system. Refer to MAGMA documentation for further details
+### Configure, Build and Run Instructions
+See PFARM_Build_Run_README.txt

-Install XDR
-----------
-build XDR library: 
-update DEFS file for your compiler and environment
-
-```shell
-$> make
-```
-
-Install RMX_HOST
----------------
-Update DEFS file for your setup, ensuring you are linking to a LAPACK or MKL library.
-This is usually facilitated by e.g. compiling with `-mkl=parallel` (Intel compiler) or loading the appropriate library modules. 
-
-```shell
-$> cd RMX_HOST
-$> make
-```
-
-Install RMX_MAGMA_GPU 
---------------------
-Update DEFS file for your setup:
- - Set MAGMADIR, CUDADIR and OPENBLASDIR environment variables
- - Updating the fortran compiler and flags in DEFS file.
-
-```shell
-$> cd RMX_MAGMA_GPU
-$> make
-```
-
-
-Run instructions
-****************
-
-Run RMX
-------
-
-The RMX application can be run by running the executable `rmx95`
-For the FEIII dataset, the program requires the following input files to reside in the same directory as the executable:
-
-```
-phzin.ctl
-XJTARMOM
-HXJ030
-```
-
-These files are located in `benchmark/run`
-A guide to each of the variables in the namelist in phzin.ctl can be found at:
-https://hpcforge.org/plugins/mediawiki/wiki/pfarm/images/9/99/Phz_rep.pdf
-However, it is recommended that these inputs are not changed for the benchmark code and
-problem size, runtime etc, are controlled via the environment variables listed below.
-
-A typical PBS script to run the RMX_HOST benchmark on 4 KNL nodes (4 MPI tasks with 64 threads per MPI task) is listed below:
-Settings will vary according to your local environment.
-
-```shell
-#PBS -N rmx95_4x64
-#PBS -l select=4
-#PBS -l walltime=01:00:00
-#PBS -A my_account_id
-
-cd $PBS_O_WORKDIR
-export OMP_NUM_THREADS=64
-
-aprun -N 1 -n 4 -d $OMP_NUM_THREADS ./rmx95
-```
-
-Run-time environment variable settings
--------------------------------------
-The following environmental variables that e.g. can be set inside the script allow the H sector matrix 
-to easily change dimensions and also allows the number of sectors to change when undertaking benchmarks.
-These can be adapted by the user to suit benchmark load requirements e.g. short vs long runs.
-Each MPI Task will pickup a sector calculation which will then be distributed amongst available threads per node (for CPU and KNL) or offloaded (for GPU).
-The distribution among MPI tasks is simple round-robin.
- 
- - `RMX_NGPU` : refers to the number of shared GPUs per node (only for RMX_MAGMA_GPU)
- - `RMX_NSECT_FINE` : sets the number of sectors for the Fine region. 
- - `RMX_NSECT_COARSE` : sets the number of sectors for the Coarse region. 
- - `RMX_NL_FINE` : sets the number of basis functions for the Fine region sector calculations. 
- - `RMX_NL_COARSE` : sets the number of basis functions for the Coarse region sector calculations. 
-
-**Notes**:
-For a representative setup for the benchmark datasets:
-
- - `RMX_NL_FINE`  can take values in the range 6:25
- - `RMX_NL_COARSE`  can take values in the range 5:10 
- - For accuracy reasons, `RMX_NL_FINE` should always be great than `RMX_NL_COARSE`. 
- - The following value pairs for `RMX_NL_FINE` and `RMX_NL_COARSE` provide representative calculations:
-
-```
-12,6
-14,8
-16,10
-18,10
-20,10
-25,10
-```
-
-If `RMX_NSECT` and `RMX_NL` variables are not set, the benchmark code defaults to:
-
-```
-RMX_NSECT_FINE=5
-RMX_NSECT_COARSE=20
-RMX_NL_FINE=12
-RMX_NL_COARSE=6
-```
-
-The Hamiltonian matrix dimension will be output along 
-with the Wallclock time it takes to do each individual DSYEVD call.
-
-Performance is measured in Wallclock time and is displayed 
-on the screen or output log at the end of the run.
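The text removed from README_ACC.md above describes the run-time environment variables that control problem size. As a hedged illustration of those settings, the sketch below exports one of the listed representative value pairs (14,8) together with the stated default sector counts, and checks the constraints given in the notes (ranges 6:25 and 5:10, fine greater than coarse). The helper `check_nl` is hypothetical and not part of PFARM, and whether the current build reads exactly these variable names should be confirmed against PFARM_Build_Run_README.txt.

```shell
#!/bin/sh
# Values taken from the representative (RMX_NL_FINE, RMX_NL_COARSE) pairs and
# the default sector counts listed in the removed README_ACC.md text.
export RMX_NSECT_FINE=5
export RMX_NSECT_COARSE=20
export RMX_NL_FINE=14
export RMX_NL_COARSE=8

# Hypothetical helper (not part of PFARM): validate the basis-function settings.
# Usage: check_nl <RMX_NL_FINE> <RMX_NL_COARSE>
check_nl() {
    fine="$1"; coarse="$2"
    [ "$fine" -ge 6 ] && [ "$fine" -le 25 ] || {
        echo "RMX_NL_FINE=$fine is outside the range 6:25" >&2; return 1; }
    [ "$coarse" -ge 5 ] && [ "$coarse" -le 10 ] || {
        echo "RMX_NL_COARSE=$coarse is outside the range 5:10" >&2; return 1; }
    [ "$fine" -gt "$coarse" ] || {
        echo "RMX_NL_FINE must be greater than RMX_NL_COARSE" >&2; return 1; }
}

check_nl "$RMX_NL_FINE" "$RMX_NL_COARSE" &&
    echo "Settings accepted: NL_FINE=$RMX_NL_FINE NL_COARSE=$RMX_NL_COARSE"
```

In a job script these exports would sit before the line that launches the executable, so that every MPI task inherits them.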