The application codes that constitute the UEABS are:
- [GROMACS](#gromacs)
- [NAMD](#namd)
- [NEMO](#nemo)
- [PFARM](#pfarm)
- [QCD](#qcd)
- [Quantum Espresso](#espresso)
- [SHOC](#shoc)
...
...
In this configuration, we use the default value of 30 ocean levels, depicted by jpk=31.
* Web site: <http://www.nemo-ocean.eu/>
* Download, Build and Run Instructions: <https://repository.prace-ri.eu/git/UEABS/ueabs/tree/master/nemo>
# PFARM <a name="pfarm"></a>
PFARM is part of a suite of programs based on the ‘R-matrix’ ab-initio approach to the variational solution of the many-electron Schrödinger
equation for electron-atom and electron-ion scattering. The package has been used to calculate electron collision data for astrophysical
applications (such as the interstellar medium and planetary atmospheres) with, for example, various ions of Fe and Ni and neutral O, plus
other applications such as data for plasma modelling and fusion reactor impurities. The code has recently been adapted to form a compatible
interface with the UKRmol suite of codes for electron (positron) molecule collisions, thus enabling large-scale parallel ‘outer-region’
calculations for molecular systems as well as atomic systems.
The PFARM outer-region application code EXDIG is dominated by the assembly of sector Hamiltonian matrices and their subsequent eigensolutions.
The code is written in Fortran 2003 (or Fortran 2003-compliant Fortran 95), is parallelised using MPI and OpenMP, and is designed to take
advantage of highly optimised numerical library routines. Hybrid MPI/OpenMP parallelisation has also been introduced into the code via
shared-memory-enabled numerical library kernels.
Accelerator-based versions of EXDIG have been implemented, using off-loading (MKL or cuBLAS/cuSOLVER) for the standard (dense) eigensolver calculations that dominate the overall run-time.
- Build & Run instructions: https://repository.prace-ri.eu/git/UEABS/ueabs/blob/r2.1-dev/pfarm/PFARM_Build_Run_README.txt
- Test Case A: https://repository.prace-ri.eu/UEABS/ueabs/pfarm/PFARM_TestCaseA.tar.bz2
- Test Case B: https://repository.prace-ri.eu/UEABS/ueabs/pfarm/PFARM_TestCaseB.tar.bz2
# QCD <a name="qcd"></a>
Unlike the other benchmarks in the PRACE application benchmark suite, the QCD benchmark is not a full application but a set of five kernels representative of some of the most compute-intensive parts of QCD calculations.
The following environment variables, which can be set e.g. inside the job script, allow the dimension of the sector Hamiltonian matrix
and the number of sectors to be changed easily when undertaking benchmarks.
They can be adapted by the user to suit benchmark load requirements, e.g. short vs. long runs.
Each MPI task will pick up a sector calculation, which is then distributed amongst the available threads per node (for CPU and KNL) or offloaded (for GPU).
The distribution of sectors among MPI tasks is simple round-robin (see the illustrative sketch after the variable list and hint below).
- RMX_NGPU: the number of shared GPUs per node (only used for RMX_MAGMA_GPU).
- RMX_NSECT_FINE: sets the number of sectors for the Fine region (it is recommended to set this to a low number if the sector Hamiltonian matrix dimension is large).
- RMX_NSECT_COARSE: sets the number of sectors for the Coarse region (it is recommended to set this to a low number if the sector Hamiltonian matrix dimension is large).
- RMX_NL_FINE: sets the number of basis functions for the Fine-region sector calculations (this determines the size of the sector Hamiltonian matrix).
- RMX_NL_COARSE: sets the number of basis functions for the Coarse-region sector calculations (this determines the size of the sector Hamiltonian matrix).
Hint: To aid scaling across nodes, the number of MPI tasks in the job script should ideally be a factor of RMX_NSECT_FINE.
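Purely as an illustration (PFARM itself is Fortran, and the specific values and the modulo mapping below are assumptions, not taken from the code), the following Python sketch shows one way to picture the settings above and the round-robin assignment of fine-region sectors to MPI tasks, including why a task count that divides RMX_NSECT_FINE keeps the load balanced:

```python
# Illustrative sketch only: example RMX_* settings and a round-robin picture of
# how fine-region sectors are spread over MPI tasks. Values are assumptions.
import os

settings = {
    "RMX_NSECT_FINE": 16,
    "RMX_NSECT_COARSE": 20,
    "RMX_NL_FINE": 14,
    "RMX_NL_COARSE": 8,
    "RMX_NGPU": 1,          # only relevant for the RMX_MAGMA_GPU build
}
for name, value in settings.items():
    # In a real run these would simply be exported in the batch/job script.
    os.environ[name] = str(value)

# Round-robin distribution: sector i goes to rank i % ntasks, so an MPI task
# count that is a factor of RMX_NSECT_FINE gives every rank the same number
# of sectors (the hint above).
ntasks = 8
for sector in range(settings["RMX_NSECT_FINE"]):
    print(f"fine-region sector {sector:2d} -> MPI rank {sector % ntasks}")
```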
For representative test cases:
- RMX_NL_FINE should take values in the range 6 to 25.
- RMX_NL_COARSE should take values in the range 5 to 10.
For accuracy reasons, RMX_NL_FINE should always be greater than RMX_NL_COARSE.
The following value pairs for RMX_NL_FINE and RMX_NL_COARSE provide representative calculations:
- RMX_NL_FINE=12, RMX_NL_COARSE=6
- RMX_NL_FINE=14, RMX_NL_COARSE=8
- RMX_NL_FINE=16, RMX_NL_COARSE=10
- RMX_NL_FINE=18, RMX_NL_COARSE=10
- RMX_NL_FINE=20, RMX_NL_COARSE=10
- RMX_NL_FINE=25, RMX_NL_COARSE=10
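A benchmark sweep over these representative pairs could be scripted along the following lines; this is a sketch only, and the printed export lines assume the variables are picked up by whatever job script the site uses:

```python
# Illustrative sketch only: emit one block of environment settings per
# representative (RMX_NL_FINE, RMX_NL_COARSE) pair, checking the accuracy
# constraint RMX_NL_FINE > RMX_NL_COARSE. Launching each run is site-specific.
representative_pairs = [(12, 6), (14, 8), (16, 10), (18, 10), (20, 10), (25, 10)]

for nl_fine, nl_coarse in representative_pairs:
    assert nl_fine > nl_coarse, "RMX_NL_FINE must exceed RMX_NL_COARSE"
    print(f"export RMX_NL_FINE={nl_fine}")
    print(f"export RMX_NL_COARSE={nl_coarse}")
    print()  # one block of exports per benchmark run
```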
If the RMX_NSECT and RMX_NL variables are not set, the benchmark code falls back to calculating NL and NSECT itself, giving:
- RMX_NSECT_FINE=5
- RMX_NSECT_COARSE=20
- RMX_NL_FINE=12
- RMX_NL_COARSE=6
* Results
  - One AMPF file will be created for each fine-region sector and one AMPC file for each coarse-region sector.
  - All output AMPF files will be the same size, and all output AMPC files will be the same size (in bytes).
  - The Hamiltonian matrix dimension will be output along with the wall-clock time taken by each individual DSYEVD call.
  - Performance is measured in wall-clock time and is displayed on screen or in the output log at the end of the run.
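If the per-call times need to be collected for reporting, a small post-processing script can pull them out of the log. The sketch below assumes a hypothetical log line format containing "DSYEVD" followed by a time in seconds, and an example log file name; both would need adjusting to the actual PFARM output:

```python
# Illustrative sketch only: extract per-call DSYEVD wall-clock times from an
# output log and report their count and total. The regular expression and the
# file name are assumptions, not the documented PFARM output format.
import re

pattern = re.compile(r"DSYEVD.*?([0-9]+\.[0-9]+)")
times = []
with open("pfarm_run.log") as log:
    for line in log:
        match = pattern.search(line)
        if match:
            times.append(float(match.group(1)))

print(f"{len(times)} DSYEVD calls, total wall-clock time {sum(times):.2f} s")
```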
# PFARM in the United European Applications Benchmark Suite (UEABS)
## Document Author: Andrew Sunderland (andrew.sunderland@stfc.ac.uk), STFC, UK.
## Introduction
PFARM is part of a suite of programs based on the ‘R-matrix’ ab-initio approach to the variational solution of the many-electron Schrödinger equation for electron-atom and electron-ion scattering. The package has been used to calculate electron collision data for astrophysical applications (such as the interstellar medium and planetary atmospheres) with, for example, various ions of Fe and Ni and neutral O, plus other applications such as data for plasma modelling and fusion reactor impurities. The code has recently been adapted to form a compatible interface with the UKRmol suite of codes for electron (positron) molecule collisions, thus enabling large-scale parallel ‘outer-region’ calculations for molecular systems as well as atomic systems.
In this README we give information relevant for its use in the UEABS.
### Standard CPU version
The PFARM outer-region application code EXDIG is dominated by the assembly of sector Hamiltonian matrices and their subsequent eigensolutions. The code is written in Fortran 2003 (or Fortran 2003-compliant Fortran 95), is parallelised using MPI and OpenMP, and is designed to take advantage of highly optimised numerical library routines. Hybrid MPI/OpenMP parallelisation has also been introduced into the code via shared-memory-enabled numerical library kernels.
### GPU version
Accelerator-based versions of EXDIG have been implemented, using off-loading (MKL or cuBLAS/cuSOLVER) for the standard (dense) eigensolver calculations that dominate the overall run-time.
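To give a feel for the dominant kernel, the sketch below times a dense symmetric eigendecomposition of the kind performed per sector; it uses NumPy (whose eigh is typically backed by LAPACK syevd/heevd) rather than the optimised MKL or cuSOLVER paths used by EXDIG, and the matrix dimension is an arbitrary example:

```python
# Illustrative sketch only: a dense symmetric eigensolve analogous to the
# DSYEVD calls that dominate EXDIG's run-time. EXDIG itself uses optimised
# library routines (LAPACK/MKL on CPU, cuSOLVER/MAGMA on GPU); this just shows
# the shape of the computation and how its wall-clock cost can be timed.
import time
import numpy as np

n = 2000                                  # example sector Hamiltonian dimension
rng = np.random.default_rng(0)
a = rng.standard_normal((n, n))
h = (a + a.T) / 2                         # symmetrise to mimic a sector Hamiltonian

start = time.perf_counter()
eigenvalues, eigenvectors = np.linalg.eigh(h)
elapsed = time.perf_counter() - start

print(f"n = {n}: dense symmetric eigensolve took {elapsed:.2f} s wall-clock")
```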