From 427bcc5bc979685a1f97d201afbea2d55e3380f7 Mon Sep 17 00:00:00 2001
From: "petros.anastasiadis"
Date: Thu, 19 Oct 2017 14:00:45 +0300
Subject: [PATCH] Readme Compile Instr

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index c76ee42..40f6996 100644
--- a/README.md
+++ b/README.md
@@ -36,4 +36,5 @@ To further scale in multiple nodes, we use a non-shared memory model tool, MPI (
 Finally, we implement our base-algorithm with CUDA in a Nvidia GPU(cuda_SingleGPU.cu + dmv_gpu.cu). We invoke 3 different kernels, starting from a simple-naive one and improving him as we go (in the second kernel we transpose the matrix to achieve coalesced memory access, and in the third one we also use the block shared memory (shmem) to utilize bandwidth better). To test our implementations we also implement a cuBLAS (Nvidia parallel BLAS routine library) version (cuBLAS_SingleGPU.cu). Then, we create a final hybrid cuBlAS-MPI version (cuBLAS_MultiGPU.cu) in order to utilize a possible multi-gpu/node architecture (MPI inter-process communication is still a big problem for the Matrix-Vector kernel, but in a more computational intensive scenario a huge scale-up is possible).
 ## Compilation/Running
-All executables can be created by running the Makefiles in the corresponding directories. There is also a global-maker in the project root directory. Every program directory contains a slurm file for execution in the ARIS system (for other systems corresponding adjustments must be made).
+All executables can be created by running the Makefiles in the corresponding directories. There is also a global-maker in the project root directory. Every program directory contains a slurm file for execution in the ARIS system (for other systems corresponding adjustments must be made). 
Compilation is performed with the Intel and CUDA compilers (icc, mpicc, nvcc), so on a system without them the Makefiles must be modified accordingly (icc -> gcc; nvcc cannot be replaced), and additional compile options might be required.
+
-- 
GitLab