1. `CMAKE_INSTALL_PREFIX` : change this to point to your desired installation path.
2. `GMX_SIMD` : You may omit this completely if your compile and compute nodes have the same architecture (for example Haswell).
If they differ, specify the value that matches your compute nodes.
For a complete and up-to-date list of possible choices, refer to the official GROMACS build instructions.
Typical values are `AVX_256` for Ivy Bridge, `AVX2_256` for Haswell, and `AVX_512_KNL` for KNL.

For a CUDA build, set `-DGMX_GPU=on` and make sure the CUDA `bin` directory is in your `PATH`.
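As a minimal sketch, a configure line combining these options might look as follows. The source directory, install prefix, and SIMD level are placeholders to adapt to your system, and `-DGMX_MPI=on` (an MPI build) is an assumption based on the `srun` launch used below.

```
# Hypothetical configuration for a Haswell compute node with CUDA;
# adjust the paths and the SIMD level to your machine.
cmake ../gromacs-source \
      -DCMAKE_INSTALL_PREFIX=$HOME/gromacs-install \
      -DGMX_SIMD=AVX2_256 \
      -DGMX_MPI=on \
      -DGMX_GPU=on
```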
## [3. Run](#run)
There are two data sets in UEABS for GROMACS:

1. `ion_channel`, which uses PME for electrostatics, for Tier-1 systems
2. `lignocellulose-rf`, which uses reaction-field electrostatics, for Tier-0 systems. Reference : [http://pubs.acs.org/doi/abs/10.1021/bm400442n](http://pubs.acs.org/doi/abs/10.1021/bm400442n)

The input file for each benchmark is the corresponding `.tpr` file, produced from a series of ASCII data files (atom coordinates/velocities, force field, run control) using tools from a complete GROMACS installation.
If you happen to run the Tier-0 case on BG/Q, use `lignocellulose-rf.BGQ.tpr`
instead of `lignocellulose-rf.tpr`; it contains the same system as `lignocellulose-rf.tpr`.
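The supplied `.tpr` files are used directly as input; you would only regenerate one if you modified the ASCII inputs. A minimal sketch, assuming hypothetical file names (`run.mdp`, `conf.gro`, and `topol.top` are placeholders, not files named in this document):

```
# Hypothetical regeneration of a run input (.tpr) file from the ASCII inputs;
# all file names here are placeholders.
gmx grompp -f run.mdp -c conf.gro -p topol.top -o ion_channel.tpr
```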
- Slurm : `srun` with no options; the MPI layout is obtained from Slurm if the variables below are set.
```
#SBATCH --nodes=NODES
#SBATCH --ntasks-per-node=TASKSPERNODE
#SBATCH --cpus-per-task=THREADSPERTASK
```
The best performance is usually obtained using pure MPI, i.e. `THREADSPERTASK=1`.
You may also try other hybrid MPI/OpenMP combinations.
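As a sketch, a pure-MPI job script could look as follows. The module name, the `gmx_mpi` binary name, and the `mdrun` options besides the input `.tpr` are assumptions to adapt to your site and build; `-resethway` resets the timers halfway through the run, which matches the timing note below.

```
#!/bin/bash
#SBATCH --nodes=4               # NODES
#SBATCH --ntasks-per-node=20    # TASKSPERNODE
#SBATCH --cpus-per-task=1       # THREADSPERTASK (pure MPI)

# Module and binary names depend on your installation (assumed here).
module load gromacs

# srun takes the task layout from the SBATCH directives above.
srun gmx_mpi mdrun -s ion_channel.tpr -maxh 0.5 -resethway -noconfout -g logfile
```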
The execution time is reported at the end of the log file: `grep Time: logfile | awk -F ' ' '{print $3}'`
> **NOTE** : This is the wall time for the second half of the steps.
> For sufficiently large `nsteps`, this is roughly half of the total wall time.
To use GPU acceleration, add the `-gpu_id GPU_IDS` option to `gmx mdrun`. The value of `GPU_IDS` depends on how many MPI tasks and GPUs are used per node.
For example, using 4 GPUs per node with 4 tasks per node, `GPU_IDS` should be `0123`.
To run on a 20-core node with 2 GPUs using pure MPI, i.e. 20 tasks per node,
`GPU_IDS` should be `00000000001111111111` (10 zeroes followed by 10 ones): the first 10 ranks use GPU 0 and the last 10 use GPU 1.
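For instance, the second case could be launched as sketched below (same assumptions about the binary name and the extra `mdrun` options as in the job script above):

```
# Hypothetical GPU run: 20 MPI ranks on one node sharing 2 GPUs;
# ranks 0-9 use GPU 0 and ranks 10-19 use GPU 1.
srun gmx_mpi mdrun -s ion_channel.tpr -gpu_id 00000000001111111111 -maxh 0.5 -resethway -noconfout -g logfile
```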