If users wish to experiment with settings there is a guide here.
The following environment variables, which can for example be set inside the job script, allow the dimensions of the
sector Hamiltonian matrix and the number of sectors to be changed easily when undertaking benchmarks.
These can be adapted by the user to suit benchmark load requirements, e.g. short vs long runs.
Each MPI task will pick up a sector calculation, which is then distributed amongst the available threads per node (for CPU and KNL)
or offloaded (for GPU). The maximum number of MPI tasks for a region calculation should not exceed the number of sectors specified.
There is no limit on the number of threads, though for efficient performance on current hardware it is generally recommended to set
between 16 and 64 threads per MPI task. The distribution of sectors among MPI tasks is simple round-robin, as sketched below.
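A minimal sketch of that round-robin mapping (illustrative only: the ntasks and nsect values are assumptions, and the
real code performs this assignment internally):

    ntasks=4                  # assumed number of MPI tasks
    nsect=16                  # assumed number of sectors, e.g. RMX_NSECT_FINE
    for (( s=0; s<nsect; s++ )); do
        echo "sector $s -> MPI rank $(( s % ntasks ))"   # rank = sector mod ntasks
    done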
RMX_NGPU : refers to the number of shared GPUs per node (only for RMX_MAGMA_GPU)
RMX_NSECT_FINE : sets the number of sectors for the Fine region (e.g. 16 for smaller runs, 256 for larger-scale runs).
The molecular case is limited to a maximum of 512 sectors for this benchmark.
RMX_NSECT_COARSE : sets the number of sectors for the Coarse region (e.g. 16 for smaller runs, 256 for larger-scale runs).
The molecular case is limited to a maximum of 512 sectors for this benchmark.
RMX_NL_FINE : sets the number of basis functions for the Fine region sector calculations
(this will determine the size of the sector Hamiltonian matrix for the Fine region calculations).
RMX_NL_COARSE : sets the number of basis functions for the Coarse region sector calculations
(this will determine the size of the sector Hamiltonian matrix for the Coarse region calculations).
Hint: To aid scaling across nodes, the number of MPI tasks in the job script should ideally be a factor of RMX_NSECT_FINE
(an illustrative job-script fragment follows the representative ranges below).
For representative test cases:
RMX_NL_FINE should take values in the range 6:25
...
...
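As an illustration of the settings above, a job script might export these variables as follows. The values shown are
assumptions chosen for a small test run, not tuned recommendations, and the launch command is omitted as it is
machine-specific:

    export RMX_NSECT_FINE=16      # Fine-region sectors (small run)
    export RMX_NSECT_COARSE=16    # Coarse-region sectors (small run)
    export RMX_NL_FINE=12         # Fine-region basis functions, within the 6:25 range above
    export RMX_NL_COARSE=12       # Coarse-region basis functions (assumed value)
    export RMX_NGPU=4             # shared GPUs per node; only read by RMX_MAGMA_GPU
    # With these settings, 16 MPI tasks would be a factor of RMX_NSECT_FINE,
    # in line with the scaling hint above.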
on the screen or output log at the end of the run.
** Validation of Results
For the atomic dataset runs, run the atomic problem configuration supplied in the 'example_job_scripts' directory.
From the results directory issue the command:
Mesh 1, Sector 16: first five eigenvalues = -4329.7161 -4170.9100 -415
Mesh 2, Sector 16: first five eigenvalues = -313.6307 -301.0096 -298.8824 -293.3929 -290.6190
Mesh 2, Sector 16: final five eigenvalues = 290.6190 293.3929 298.8824 301.0102 313.6307
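A minimal sketch for checking a run against the reference values above, assuming the run's eigenvalues have been
extracted into run_eigenvalues.txt and the reference values into ref_eigenvalues.txt, one value per line (the file
names and the 1e-3 tolerance are assumptions, not part of the benchmark):

    paste run_eigenvalues.txt ref_eigenvalues.txt | \
      awk '{ d = $1 - $2; if (d < 0) d = -d;            # absolute difference
             if (d > 1e-3) { print "MISMATCH:", $1, $2; bad = 1 } }
           END { exit bad }'                             # non-zero exit on any mismatch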
For the molecular dataset runs, run the molecular problem configuration supplied in the 'example_job_scripts' directory.
From the results directory issue the command: