The following environment variables, which can for example be set inside the job script, allow the dimensions of the Hamiltonian (H) sector matrices and the number of sectors to be changed easily when undertaking benchmarks.
They can be adapted by the user to suit benchmark load requirements, e.g. short versus long runs; an example export block follows the list below.
Each MPI task picks up a sector calculation, which is then distributed among the available threads per node (for CPU and KNL) or offloaded (for GPU).
Sectors are assigned to MPI tasks in a simple round-robin fashion.
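As an illustration of that mapping, a minimal shell sketch, assuming sectors and tasks are numbered from zero (the task and sector counts are illustrative only):

    # Round-robin: sector s is handled by MPI task (s mod ntasks)
    ntasks=4                       # illustrative number of MPI tasks
    nsect=10                       # illustrative total number of sectors
    for s in $(seq 0 $((nsect - 1))); do
      echo "sector $s -> MPI task $((s % ntasks))"
    done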
RMX_NGPU : sets the number of shared GPUs per node (used only by RMX_MAGMA_GPU).
RMX_NSECT_FINE : sets the number of sectors for the Fine region.
RMX_NSECT_COARSE : sets the number of sectors for the Coarse region.
RMX_NL_FINE : sets the number of basis functions for the Fine region sector calculations.
RMX_NL_COARSE : sets the number of basis functions for the Coarse region sector calculations.
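For example, the variables can be exported in the job script as follows (a minimal sketch; the values shown are illustrative only):

    # Illustrative benchmark configuration
    export RMX_NGPU=2              # only relevant for the RMX_MAGMA_GPU build
    export RMX_NSECT_FINE=5
    export RMX_NSECT_COARSE=20
    export RMX_NL_FINE=12
    export RMX_NL_COARSE=6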
Notes:
For a representative setup using the benchmark datasets:
RMX_NL_FINE can take values in the range 6:25
RMX_NL_COARSE can take values in the range 5:10
For accuracy reasons, RMX_NL_FINE should always be greater than RMX_NL_COARSE.
The following value pairs (RMX_NL_FINE, RMX_NL_COARSE) provide representative calculations; a sweep over all of them is sketched after the list:
12,6
14,8
16,10
18,10
20,10
25,10
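A job script can step through all of these pairs in one run, for example (a sketch; mpirun -n 4 and the executable name ./rmx_benchmark are placeholders for the actual launch command and binary):

    # Sweep the representative (RMX_NL_FINE, RMX_NL_COARSE) pairs
    for pair in 12,6 14,8 16,10 18,10 20,10 25,10; do
      export RMX_NL_FINE=${pair%,*}
      export RMX_NL_COARSE=${pair#*,}
      mpirun -n 4 ./rmx_benchmark  # placeholder launch command
    done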
If the RMX_NSECT_* and RMX_NL_* variables are not set, the benchmark code defaults to the following values (a snippet for reverting to these defaults appears after the list):
RMX_NSECT_FINE=5
RMX_NSECT_COARSE=20
RMX_NL_FINE=12
RMX_NL_COARSE=6
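To be sure a run exercises these defaults, any values exported earlier in the script can simply be cleared (sketch):

    # Clear earlier settings so the benchmark falls back to its built-in defaults
    unset RMX_NSECT_FINE RMX_NSECT_COARSE RMX_NL_FINE RMX_NL_COARSE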
The Hamiltonian matrix dimension is output along with the wallclock time taken by each individual DSYEVD call.
Overall performance is measured in wallclock time and is displayed on the screen or in the output log at the end of the run.