# A single GPU implementation of the Matrix-Vector algorithm with:

```
-> cuBLAS (BLAS routines implemented on the GPU by NVIDIA)
     07/09/2017: Completed
     13/09/2017: Modified to use unified memory
-> cuBLAS_MultiGPU (cuBLAS implementation on multiple GPUs/nodes)
     26/09/2017: Completed
-> cuda_SingleGPU (3 CUDA kernels showing the optimization steps in writing GPU code)
     02/10/2017: Completed kernel 1
     03/10/2017: Completed kernel 2 & 3
```
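
For orientation, here is a minimal sketch of what an unoptimized first matrix-vector kernel could look like. It is not taken from the code in this repository: the names `matvec_naive` and `matvec_gpu` are hypothetical, and the row-major layout, one-thread-per-row mapping, and block size of 256 are assumptions made for illustration. It uses unified memory (`cudaMallocManaged`) so that no explicit host/device copies are needed.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Naive kernel (illustrative only): one thread computes one element of y = A * x.
// A is assumed to be stored row-major with rows x cols entries.
__global__ void matvec_naive(const float *A, const float *x, float *y,
                             int rows, int cols)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < rows) {
        float sum = 0.0f;
        for (int col = 0; col < cols; ++col)
            sum += A[row * cols + col] * x[col];
        y[row] = sum;
    }
}

// Hypothetical host helper: returns y = A * x, or an empty vector on a CUDA error.
std::vector<float> matvec_gpu(const std::vector<float> &A,
                              const std::vector<float> &x,
                              int rows, int cols)
{
    float *dA = nullptr, *dx = nullptr, *dy = nullptr;
    std::vector<float> y;

    // Unified memory: the same pointers are valid on the host and the device.
    if (cudaMallocManaged(&dA, sizeof(float) * rows * cols) != cudaSuccess ||
        cudaMallocManaged(&dx, sizeof(float) * cols) != cudaSuccess ||
        cudaMallocManaged(&dy, sizeof(float) * rows) != cudaSuccess) {
        cudaFree(dA); cudaFree(dx); cudaFree(dy);
        return y;
    }

    // Fill the managed buffers directly from the host.
    for (std::size_t i = 0; i < A.size(); ++i) dA[i] = A[i];
    for (std::size_t i = 0; i < x.size(); ++i) dx[i] = x[i];

    const int threads = 256;                           // assumed block size
    const int blocks = (rows + threads - 1) / threads; // round up to cover all rows
    matvec_naive<<<blocks, threads>>>(dA, dx, dy, rows, cols);

    // Wait for the kernel to finish before reading the result on the host.
    if (cudaDeviceSynchronize() == cudaSuccess)
        y.assign(dy, dy + rows);

    cudaFree(dA); cudaFree(dx); cudaFree(dy);
    return y;
}
```

On a machine with a CUDA-capable GPU, calling `matvec_gpu({1, 2, 3, 4, 5, 6}, {1, 1, 1}, 2, 3)` should return `{6, 15}`. In this layout, neighbouring threads read addresses that are `cols` elements apart, so the loads from `A` are not coalesced. That is the kind of inefficiency later optimization steps typically address.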