Run instructions for CP2K.
2016-10-26

After building the hybrid MPI+OpenMP version of CP2K you have an executable
called cp2k.psmp. The general way to run the benchmarks is:

export OMP_NUM_THREADS=##
parallel_launcher launcher_options path_to_cp2k.psmp -i inputfile -o logfile

Where:

o The parallel_launcher is mpirun, mpiexec, or some variant such as aprun on
  Cray systems or srun when using Slurm. 

o The launcher_options include the parallel placement in terms of the total
  number of nodes, MPI ranks/tasks, tasks per node, and OpenMP threads per task
  (which should be equal to the value given to OMP_NUM_THREADS).
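
As an illustration, the general form above might look as follows when the
launcher is srun under Slurm. This is only a sketch: the node count, the cores
per node, the threads per task, and the file names H2O-64.inp and H2O-64.log
are assumed values, not taken from any benchmark. The command is echoed rather
than executed so that it can be inspected first.

```shell
#!/bin/sh
# Assumed machine and job parameters; adjust them for the system at hand.
nodes=4
cores_per_node=24
threads_per_task=4

# Derive the MPI placement so that tasks x threads fills each node.
tasks_per_node=$((cores_per_node / threads_per_task))
total_tasks=$((nodes * tasks_per_node))

# Keep OMP_NUM_THREADS consistent with the threads per task given to the launcher.
export OMP_NUM_THREADS=$threads_per_task

# Input and log file names are hypothetical placeholders.
launch_cmd="srun --nodes=$nodes --ntasks=$total_tasks --ntasks-per-node=$tasks_per_node --cpus-per-task=$threads_per_task ./cp2k.psmp -i H2O-64.inp -o H2O-64.log"

# Dry run: print the command instead of executing it.
echo "$launch_cmd"
```

Other launchers such as mpirun or aprun take the same information through
differently named options.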

You can try any combination of tasks per node and OpenMP threads per task to
investigate absolute performance and scaling on the machine of interest. 
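
One way to list the combinations worth trying is to enumerate every number of
threads per task that divides the cores per node evenly. The sketch below
assumes 24 cores per node, which is an example value only.

```shell
#!/bin/sh
# Assumed number of cores per node; replace with the value for your machine.
cores_per_node=24

combos=""
threads=1
while [ "$threads" -le "$cores_per_node" ]; do
    # Keep only thread counts that divide the node evenly.
    if [ $((cores_per_node % threads)) -eq 0 ]; then
        tasks=$((cores_per_node / threads))
        combos="$combos ${tasks}x${threads}"
        echo "tasks per node: $tasks, OpenMP threads per task: $threads"
    fi
    threads=$((threads + 1))
done
```

The first line printed corresponds to pure MPI (one thread per task) and the
last to a single MPI task per node.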

For tier-1 systems the best performance is usually obtained with pure MPI, while
for tier-0 systems the best performance is typically obtained using 1 MPI task
per node with the number of threads being equal to the number of cores per node.

More information in the form of a README and an example job script is included
in each benchmark tar file.

The run walltime is reported near the end of the logfile:
grep "CP2K    " logfile | awk -F ' ' '{print $7}'
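
To check what the command above picks out, it can be tried on a stand-in
timing line. The line below is synthetic: its numbers are invented and its
column layout is an assumption about how the CP2K timing report looks, so
compare it with a real logfile before relying on it.

```shell
#!/bin/sh
# Write a synthetic timing line (invented numbers) to a temporary file.
logfile=$(mktemp)
printf ' CP2K                                 1  1.0    0.012    0.014   50.123   50.125\n' > "$logfile"

# Same extraction as above: take the seventh whitespace-separated field.
walltime=$(grep "CP2K    " "$logfile" | awk '{print $7}')
echo "walltime: $walltime"

rm -f "$logfile"
```

Lines that merely begin with "CP2K|" are not matched, because the pattern
requires four spaces directly after "CP2K".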
