Commit 885c8dad authored by Dimitris Dellis
parent b0c0da74
# Build and Run instructions for gromacs
## Build
Complete build instructions are given in the official GROMACS installation guide.
A typical build procedure looks like this:

    tar -zxf gromacs-2016.tar.gz
    cd gromacs-2016
    mkdir build
    cd build
    cmake \
        -DCMAKE_INSTALL_PREFIX=$HOME/Packages/gromacs/2016 \
        -DCMAKE_C_COMPILER=$(which mpicc) \
        -DCMAKE_CXX_COMPILER=$(which mpicxx) \
        -DGMX_GPU=off \
        -DGMX_MPI=on \
        -DGMX_X11=off \
        ..
    make            # or: make -j N for a parallel build
    make install

You will probably need to adjust:
1. `CMAKE_INSTALL_PREFIX`, to point to your preferred install path.
2. `GMX_SIMD`: you may omit this entirely if your compile and compute nodes have the same architecture (for example Haswell). If they differ, specify the value that matches your compute nodes. For a complete and up-to-date list of possible values, refer to the official GROMACS build instructions. Typical values are `AVX_256` for Ivy Bridge, `AVX2_256` for Haswell and `AVX_512_KNL` for KNL.

For a CUDA build, change the option to `-DGMX_GPU=on` and make sure the CUDA `bin` directory is in your `PATH`.
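
The mapping from compute-node architecture to `GMX_SIMD` value can be kept in a small helper, so the cmake line does not have to be edited by hand. The following is a minimal sketch: the function name `simd_for_arch` and the lowercase architecture labels are hypothetical, and only the three typical values listed above are covered.

```shell
#!/bin/sh
# Hypothetical helper: map a compute-node architecture label to a GMX_SIMD value.
# Only the three typical values mentioned in this README are covered.
simd_for_arch() {
    case "$1" in
        ivybridge) echo "AVX_256" ;;
        haswell)   echo "AVX2_256" ;;
        knl)       echo "AVX_512_KNL" ;;
        *)         echo "unknown architecture: $1" >&2; return 1 ;;
    esac
}

# Print (not run) the extra cmake option for a Haswell compute node.
echo "-DGMX_SIMD=$(simd_for_arch haswell)"
```

The printed option can be appended to the cmake command when the compile node differs from the compute nodes.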
## Run
There are two data sets in UEABS for GROMACS:
* `ion_channel`, which uses PME for electrostatics, for Tier-1 systems
* `lignocellulose-rf`, which uses reaction field for electrostatics, for Tier-0 systems. Reference :

The input data file for each benchmark is the corresponding `.tpr` file, produced using tools from a complete GROMACS installation and a series of ASCII data files (atom coordinates/velocities, force field, run control).
If you run the Tier-0 case on BG/Q, use `lignocellulose-rf.BGQ.tpr` instead of `lignocellulose-rf.tpr`. It is the same as `lignocellulose-rf.tpr`, but created on a BG/Q system.
The general way to run the GROMACS benchmarks is:
* `WRAPPER WRAPPER_OPTIONS PATH_TO_GMX mdrun -s CASENAME.tpr -maxh 0.50 -resethway -noconfout -nsteps 10000 -g logfile`
* `CASENAME` : one of `ion_channel` or `lignocellulose-rf`
* `maxh` : terminate after 0.99 times this time (in hours), i.e. terminate gracefully after ~30 min
* `resethway` : reset the timer counters halfway through the run. The reported wall time and performance therefore refer to the last half of the simulation steps.
* `noconfout` : do not save output coordinates/velocities at the end
* `nsteps` : run this number of steps, regardless of what is requested in the input file
* `logfile` : the output filename. If the `.log` extension is omitted, it is appended automatically. It should be different for each run.
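
To make the template above concrete, the sketch below assembles the command line from its parts and prints it without running anything. All values are placeholders chosen for illustration: `srun` as the wrapper, `gmx_mpi` as the binary name and `ion_channel.run1` as the log name are assumptions, not requirements of the benchmark.

```shell
#!/bin/sh
# Assemble the benchmark command line from its parts and print it.
# All values below are illustrative placeholders; adjust them per system.
WRAPPER="srun"              # assumption: Slurm launcher
WRAPPER_OPTIONS=""          # assumption: options come from the batch system
PATH_TO_GMX="gmx_mpi"       # assumption: MPI-enabled binary found on PATH
CASENAME="ion_channel"
LOGFILE="ion_channel.run1"  # use a different name for every run

mdrun_command() {
    # ${VAR:+ ...} adds the wrapper options only when they are non-empty.
    echo "$WRAPPER${WRAPPER_OPTIONS:+ $WRAPPER_OPTIONS} $PATH_TO_GMX mdrun -s $CASENAME.tpr" \
         "-maxh 0.50 -resethway -noconfout -nsteps 10000 -g $LOGFILE"
}

mdrun_command
```

Printing the command first makes it easy to check the options before submitting a job.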
`WRAPPER` and `WRAPPER_OPTIONS` depend on the system, the batch system, etc.
A few common pairs are:
* Curie : `ccc_mrun` with no options; they are obtained from the batch system
* Juqueen : `runjob --np TASKS --ranks-per-node TASKSPERNODE --exp-env OMP_NUM_THREADS`
* Slurm : `srun` with no options; they are obtained from Slurm if the variables below are set.
  `#SBATCH --ntasks-per-node=TASKSPERNODE`

The best performance is usually obtained using pure MPI, i.e. `THREADSPERTASK=1`.
You can also try hybrid MPI/OpenMP combinations.
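
As an illustration of the Slurm case, the sketch below prints a batch script for a given node count, tasks per node and threads per task. The helper name `make_job_script`, the `gmx_mpi` binary name, and the use of `--nodes` and `--cpus-per-task` are assumptions added for this example; only `--ntasks-per-node` comes from the text above.

```shell
#!/bin/sh
# Hypothetical helper: print a Slurm batch script for one benchmark run.
# Arguments: NODES TASKSPERNODE THREADSPERTASK
make_job_script() {
    cat <<EOF
#!/bin/sh
#SBATCH --nodes=$1
#SBATCH --ntasks-per-node=$2
#SBATCH --cpus-per-task=$3
export OMP_NUM_THREADS=$3
srun gmx_mpi mdrun -s ion_channel.tpr -maxh 0.50 -resethway -noconfout -nsteps 10000 -g logfile
EOF
}

# Pure MPI on 4 nodes with 20 tasks each (THREADSPERTASK=1).
make_job_script 4 20 1
```

Redirecting the output to a file gives a script that can be adapted and submitted with `sbatch`.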
The execution time is reported at the end of the log file: `grep Time: logfile | awk -F ' ' '{print $3}'`
> NOTE : This is the wall time for the last half of the steps.
> For sufficiently large `nsteps`, this is half of the total wall time.
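
The extraction above can be wrapped in two small functions, one returning the reported wall time and one doubling it as a rough estimate of the total. This is a sketch only: the function names are hypothetical, and the estimate relies on the assumption stated in the note that `nsteps` is sufficiently large.

```shell
#!/bin/sh
# Print the wall time (third field of the "Time:" line) from a GROMACS log file.
wall_time() {
    grep 'Time:' "$1" | awk '{print $3}'
}

# Rough estimate of the total wall time: twice the reported value, because
# with -resethway the counters only cover the second half of the steps.
total_wall_time_estimate() {
    wall_time "$1" | awk '{printf "%.3f\n", 2 * $1}'
}

# Usage (hypothetical log name):
#   wall_time ion_channel.run1.log
#   total_wall_time_estimate ion_channel.run1.log
```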
To use GPU acceleration, add the `-gpu_id GPU_IDS` option to the `gmx mdrun` options.
The value of `GPU_IDS` depends on how many MPI tasks and GPUs are used per node.
For example, using 4 GPUs per node with 4 tasks per node, `GPU_IDS` should be `0123`.
To run on a 20-core node with 2 GPUs using pure MPI, i.e. 20 tasks per node,
`GPU_IDS` should be `00000000001111111111` (10 zeroes and 10 ones).
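
The `GPU_IDS` string follows a simple pattern: each GPU id is repeated once for every task that shares that GPU. A minimal sketch that generates it is shown below. The helper name `gpu_ids` is hypothetical, and it assumes the tasks per node divide evenly among the GPUs and that there are at most 10 GPUs per node, so that every id is a single digit.

```shell
#!/bin/sh
# Hypothetical helper: build the -gpu_id string for TASKS MPI tasks per node
# shared evenly among GPUS GPUs per node.
# Assumes TASKS is a multiple of GPUS and GPUS <= 10 (single-digit ids).
gpu_ids() {
    tasks=$1
    gpus=$2
    per_gpu=$((tasks / gpus))
    ids=""
    gpu=0
    while [ "$gpu" -lt "$gpus" ]; do
        i=0
        # Repeat the current GPU id once per task assigned to it.
        while [ "$i" -lt "$per_gpu" ]; do
            ids="$ids$gpu"
            i=$((i + 1))
        done
        gpu=$((gpu + 1))
    done
    echo "$ids"
}

gpu_ids 4 4     # prints 0123
gpu_ids 20 2    # prints 00000000001111111111
```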