The Scalable HeterOgeneous Computing (SHOC) benchmark suite is a collection of benchmark programs that test the performance and stability of systems using computing devices with non-traditional architectures for general-purpose computing.
Its initial focus is on systems containing Graphics Processing Units (GPUs) and multi-core processors, and on the OpenCL programming standard. It can be used on clusters as well as individual hosts.
SHOC also includes an Offload branch of the benchmarks that can be used to evaluate the Intel Xeon Phi x100 family.
Documentation on configuring, building, and running the SHOC benchmark programs is contained in the SHOC user manual, in the doc subdirectory of the SHOC source code tree. The file INSTALL.txt contains a sketch of those instructions for rapid installation.
Installation should be familiar to anyone experienced with configure and make; see the config directory for some examples. If your platform requires regenerating the configure script, see build-aux/bootstrap.sh and the manual for more details.
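As a rough sketch of the usual configure-and-make flow, the commands below build from a source checkout and install into a prefix. The paths in `SHOC_SRC` and `DEST_ROOT`, the helper name `build_shoc`, and the use of `$DEST_ROOT` as the install prefix are assumptions for illustration. `--prefix` is the standard Autoconf option; `./configure --help` lists the options SHOC itself accepts.

```shell
# Hypothetical locations (assumptions, not taken from the SHOC manual).
SHOC_SRC="${SHOC_SRC:-$HOME/shoc}"
DEST_ROOT="${DEST_ROOT:-$HOME/shoc-install}"

# build_shoc is a hypothetical helper. It configures, builds, and installs
# inside a subshell so the caller's working directory is left unchanged,
# and it stops at the first failing step.
build_shoc() (
    cd "$SHOC_SRC" || exit 1
    ./configure --prefix="$DEST_ROOT" || exit 1
    make || exit 1
    make install
)

# Uncomment to run the build:
# build_shoc
```

The function is defined but not invoked, so the paths can be adjusted before anything is compiled.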
## Building SHOC for GPU accelerators
Two versions can be used for evaluating GPU accelerators: a CUDA version and an OpenCL version.
### GPGPU Building
This version was tested with the Intel Compilers 2017.1 and NVIDIA CUDA 7.5. Ensure that the wrappers mpif90 and mpicc point to the correct binaries and that $CUDA_HOME is set.
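A quick pre-flight check along these lines can confirm the environment before configuring. It is only a sketch: `check_tool` is a hypothetical helper, and it verifies only that the wrappers are on `PATH` and that `CUDA_HOME` is set, not that they point to the intended compiler or CUDA release.

```shell
# check_tool is a hypothetical helper: report where a command resolves on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 -> $(command -v "$1")"
    else
        echo "$1 not found on PATH" >&2
        return 1
    fi
}

missing=0
for tool in mpicc mpif90; do
    check_tool "$tool" || missing=1
done

# CUDA_HOME must be set for the CUDA build.
if [ -z "${CUDA_HOME:-}" ]; then
    echo "CUDA_HOME is not set" >&2
    missing=1
fi

if [ "$missing" -eq 0 ]; then
    echo "environment looks ready"
fi
```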
SHOC can be used with MPI+CUDA or MPI+OpenCL parallelism. In order to compile the CUDA version, the steps are:
Running SHOC is straightforward. Run single-device tests with a command such as:
$ $DEST_ROOT/bin/Serial/OpenCL/Sort
Run MPI-based multi-device tests with a command such as:
$ mpirun -np 2 $DEST_ROOT/bin/EP/OpenCL/Sort
Use 1 MPI rank per accelerator device.
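For example, the rank count for a multi-node run follows directly from the one-rank-per-device rule. The node and device counts below are illustrative values, and the sketch only prints the resulting command; how ranks are placed across nodes depends on your MPI launcher and is not shown.

```shell
# Illustrative values (assumptions): 4 nodes with 2 accelerators each.
NODES=4
DEVICES_PER_NODE=2

# One MPI rank per accelerator device.
RANKS=$((NODES * DEVICES_PER_NODE))

# Print the command rather than running it; $DEST_ROOT is left unexpanded.
echo "mpirun -np $RANKS \$DEST_ROOT/bin/EP/OpenCL/Sort"
```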
Alternatively, the bin folder contains a script to run the full
suite at once. The script depends on MPI programs being in your current
PATH, so be sure to set environment variables appropriately.
To run, specify the benchmark size (1-4) and whether to use the CUDA or OpenCL versions
of the benchmarks. For CUDA:
$ cd $DEST_ROOT
$ ./bin/shocdriver -s 2 -cuda
or the OpenCL version with:
$ cd $DEST_ROOT
$ ./bin/shocdriver -s 2 -opencl
To run the parallel versions of the benchmarks, supply the script with the
number of nodes and the number of devices per node. For example, for a 4
node cluster with 2 devices per node, size 1 benchmark problems, CUDA
versions, use:
$ cd $DEST_ROOT
$ ./bin/shocdriver -n 4 -d 2 -s 1 -cuda
In both cases, the script writes the benchmark results in comma-separated
value (CSV) format to results.csv.
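To get a quick look at the output, something like the following can help. It is a sketch that assumes `results.csv` is in the current directory and that its first line is a header row, neither of which is stated above; `summarise_csv` is a hypothetical helper, not part of SHOC.

```shell
# summarise_csv is a hypothetical helper: print the first line of a CSV
# file and count the remaining lines.
summarise_csv() {
    if [ ! -f "$1" ]; then
        echo "no such file: $1" >&2
        return 1
    fi
    echo "first line: $(head -n 1 "$1")"
    echo "remaining lines: $(awk 'NR > 1 { n++ } END { print n + 0 }' "$1")"
}

# Assumes results.csv was written to the current directory.
if [ -f results.csv ]; then
    summarise_csv results.csv
fi
```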
## Building SHOC for Intel Xeon Phi Knights Corner (KNC)
Besides the OpenCL-based version, SHOC also contains a branch designed specifically for Intel's Knights Corner architecture, which uses the Offload model.
### KNC Building
This version was tested with the Intel Compilers 2016.1. SHOC can be used with both Offload mode and OpenCL mode on Knights Corner accelerators.
To build the OpenCL-mode KNC version, use the following steps: