Commit f2563dea authored by Jussi Enkovaara's avatar Jussi Enkovaara
Browse files

Exercises and examples about OpenMP parallelization

parent d2e97df0
......@@ -8,9 +8,43 @@ be built and run as:
- gcc -o exe -fopenmp exercise.c
- gfortran -o exe -fopenmp exercise.c
where gcc/gfortran and -fopenmp should be replaced by proper compiler commands and options if you are not using the GNU compilers.
where gcc/gfortran and -fopenmp should be replaced by proper compiler commands
and options if you are not using the GNU compilers.
For more complex cases a Makefile is provided.
## Exercises
## Examples
\ No newline at end of file
- [Hello world](hello-world) Simplest possible OpenMP program
(C and Fortran versions). Level: **basic**
- [Parallel region and data sharing](data-sharing) The basic data sharing
primitives (C and Fortran versions). Level: **basic**
- [Work sharing for a simple loop](work-sharing) Simple parallelization of
for/do loop (C and Fortran versions). Level: **basic**
- [Vector sum and race condition](race-condition) A basic example of race
condition and various ways for resolving it (C and Fortran versions).
Level: **intemediate**
- [Using OpenMP tasks for dynamic parallelization](tasks) Utilising OpenMP
task construct for more dynamic parallelization (C and Fortran versions).
Level: **intemediate**
## Examples
- [Heat equation](heat-equation) A two dimensional heat equation solver which
is parallelized with OpenMP. (C and Fortran versions).
Level: **intermediate**
## How to contribute
Any contributions (new exercises and examples, bug fixes, improvements etc.) are
warmly welcome. In order to contribute, please follow the standard
Gitlab workflow:
1. Fork the project into your personal space
2. Create a feature branch
3. Work on your contributions
4. Push the commit(s) to your fork
5. Submit a merge request to the master branch
As a quality assurance, the merge request is reviewed by PRACE staff before it is accepted into main branch.
Copyright (C) 2018 CSC - IT Center for Science Ltd.
Licensed under the terms of the GNU General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
Code is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
Copy of the GNU General Public License can be obtained from
<http://www.gnu.org/licenses/>.
## Parallel region and data sharing ##
Take a look at the exercise skeleton in [variables.c](variables.c) (or
[variables.F90](variables.F90) for Fortran). Add an OpenMP parallel
region aroung the block where the variables `Var1` and `Var2` are
printed and manipulated. What results do you get when you define the
variables as **shared**, **private** or **firstprivate**? Explain why
do you get different results.
program exer1
implicit none
integer :: var1, var2
var1 = 1
var2 = 2
!$OMP PARALLEL PRIVATE(VAR1, VAR2)
print *, 'Region 1: var1=', var1, 'var2=', var2
var1 = var1 + 1
var2 = var2 + 1
!$OMP END PARALLEL
print *, 'After region 1: var1=', var1, 'var2=', var2
print *
!$OMP PARALLEL FIRSTPRIVATE(VAR1, VAR2)
print *, 'Region 2: var1=', var1, 'var2=', var2
var1 = var1 + 1
var2 = var2 + 1
!$OMP END PARALLEL
print *, 'After region 2: var1=', var1, 'var2=', var2
print *
!$OMP PARALLEL
print *, 'Region 3: var1=', var1, 'var2=', var2
var1 = var1 + 1
var2 = var2 + 1
!$OMP END PARALLEL
print *, 'After region 3: var1=', var1, 'var2=', var2
print *
end program exer1
#include <stdio.h>
int main(void)
{
int var1 = 1, var2 = 2;
#pragma omp parallel private(var1, var2)
{
printf("Region 1: var1=%i, var2=%i\n", var1, var2);
var1++;
var2++;
}
printf("After region 1: var1=%i, var2=%i\n\n", var1, var2);
#pragma omp parallel firstprivate(var1, var2)
{
printf("Region 2: var1=%i, var2=%i\n", var1, var2);
var1++;
var2++;
}
printf("After region 2: var1=%i, var2=%i\n\n", var1, var2);
#pragma omp parallel /* same as omp parallel shared(var1, var2) */
{
printf("Region 3: var1=%i, var2=%i\n", var1, var2);
/* Note that this introduces the data race condition! */
var1++;
var2++;
}
printf("After region 3: var1=%i, var2=%i\n\n", var1, var2);
return 0;
}
program exer1
implicit none
integer :: var1, var2
var1 = 1
var2 = 2
! TODO:
! Test different data sharing clauses here
print *, 'Region 1: var1=', var1, 'var2=', var2
var1 = var1 + 1
var2 = var2 + 1
!end here :)
print *, 'After region 1: var1=', var1, 'var2=', var2
print *
end program exer1
#include <stdio.h>
int main(void)
{
int var1 = 1, var2 = 2;
/* TODO:
* Test the effect of different data sharing clauses here
*/
{
printf("Region 1: var1=%i, var2=%i\n", var1, var2);
var1++;
var2++;
}
printf("After region 1: var1=%i, var2=%i\n\n", var1, var2);
return 0;
}
Copyright (C) 2018 CSC - IT Center for Science Ltd.
Licensed under the terms of the GNU General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
Code is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
Copy of the GNU General Public License can be obtained from
<http://www.gnu.org/licenses/>.
Two dimensional heat equation
=============================
This folder contains a code which solves two dimensional heat equation
with OpenMP parallelization. A coarse grained parallelization is used, where
the threads are launched in the start of the program and kept alive throughout
the whole execution.
Heat (or diffusion) equation is
<!-- Equation
\frac{\partial u}{\partial t} = \alpha \nabla^2 u
-->
![img](http://quicklatex.com/cache3/d2/ql_b3f6b8bdc3a8862c73c5a97862afb9d2_l3.png)
where **u(x, y, t)** is the temperature field that varies in space and time,
and α is thermal diffusivity constant. The two dimensional Laplacian can be
discretized with finite differences as
<!-- Equation
\begin{align*}
\nabla^2 u &= \frac{u(i-1,j)-2u(i,j)+u(i+1,j)}{(\Delta x)^2} \\
&+ \frac{u(i,j-1)-2u(i,j)+u(i,j+1)}{(\Delta y)^2}
\end{align*}
-->
![img](http://quicklatex.com/cache3/2d/ql_59f49ed64dbbe76704e0679b8ad7c22d_l3.png)
Given an initial condition (u(t=0) = u0) one can follow the time dependence of
the temperature field with explicit time evolution method:
<!-- Equation
u^{m+1}(i,j) = u^m(i,j) + \Delta t \alpha \nabla^2 u^m(i,j)
-->
![img](http://quicklatex.com/cache3/9e/ql_9eb7ce5f3d5eccd6cfc1ff5638bf199e_l3.png)
Note: Algorithm is stable only when
<!-- Equation
\Delta t < \frac{1}{2 \alpha} \frac{(\Delta x \Delta y)^2}{(\Delta x)^2 (\Delta y)^2}
-->
![img](http://quicklatex.com/cache3/d1/ql_0e7107049c9183d11dbb1e81174280d1_l3.png)
The two dimensional grid is decomposed along both dimensions, and the
communication of boundary data is overlapped with computation. Restart files
are written and read with MPI I/O.
Compilation instructions
------------------------
For building and running the example one needs to have the
[libpng](http://www.libpng.org/pub/png/libpng.html) library installed.
Move to proper subfolder (C or Fortran) and modify the top of the **Makefile**
according to your environment (proper compiler commands and compiler flags).
Code can be build simple with **make**
How to run
----------
The code can be run with arbitrary number of OpenMP threads. The default
initial temperature field is a disk in a 2000 x 2000 grid. Initial
temperature field can be read also from a file, the provided **bottle.dat**
illustrates what happens to a cold soda bottle in sauna.
* Running with defaults: OMP_NUM_THREADS=4 ./heat_omp
* Initial field from a file: OMP_NUM_THREADS=4 ./heat_omp bottle.dat
* Initial field from a file, given number of time steps:
OMP_NUM_THREADS=4./heat_omp bottle.dat 1000
* Default pattern with given dimensions and time steps:
OMP_NUM_THREADS=4 ./heat_omp 800 800 1000
The program produces a series of heat_XXXX.png files which show the
time development of the temperature field
ifeq ($(COMP),)
COMP=cray
endif
COMMONDIR=../../../common
ifeq ($(COMP),cray)
CC=cc
CCFLAGS=-O3 -I/appl/opt/libpng/include -I$(COMMONDIR)
LDFLAGS=-L/appl/opt/libpng/lib
LIBS=-lpng -lz
endif
ifeq ($(COMP),gnu)
CC=gcc
CCFLAGS=-fopenmp -O3 -Wall -I$(COMMONDIR)
LDFLAGS=
LIBS=-lpng
endif
EXE=heat_omp
OBJS=core.o setup.o utilities.o io.o main.o
OBJS_PNG=$(COMMONDIR)/pngwriter.o
all: $(EXE)
$(COMMONDIR)/pngwriter.o: $(COMMONDIR)/pngwriter.c $(COMMONDIR)/pngwriter.h
core.o: core.c heat.h
utilities.o: utilities.c heat.h
setup.o: setup.c heat.h
io.o: io.c heat.h
main.o: main.c heat.h
$(OBJS_PNG): C_COMPILER := $(CC)
$(OBJS): C_COMPILER := $(CC)
$(EXE): $(OBJS) $(OBJS_PNG)
$(CC) $(CCFLAGS) $(OBJS) $(OBJS_PNG) -o $@ $(LDFLAGS) $(LIBS)
%.o: %.c
$(C_COMPILER) $(CCFLAGS) -c $< -o $@
.PHONY: clean
clean:
-/bin/rm -f $(EXE) a.out *.o *.png *~
/* Main solver routines for heat equation solver */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include "heat.h"
/* Update the temperature values using five-point stencil */
void evolve(field *curr, field *prev, double a, double dt)
{
int i, j;
double dx2, dy2;
/* Determine the temperature field at next time step
* As we have fixed boundary conditions, the outermost gridpoints
* are not updated. */
dx2 = prev->dx * prev->dx;
dy2 = prev->dy * prev->dy;
#pragma omp for private(i, j)
for (i = 1; i < curr->nx + 1; i++) {
for (j = 1; j < curr->ny + 1; j++) {
curr->data[i][j] = prev->data[i][j] + a * dt *
((prev->data[i + 1][j] -
2.0 * prev->data[i][j] +
prev->data[i - 1][j]) / dx2 +
(prev->data[i][j + 1] -
2.0 * prev->data[i][j] +
prev->data[i][j - 1]) / dy2);
}
}
}
#ifndef __HEAT_H__
#define __HEAT_H__
/* Datatype for temperature field */
typedef struct {
/* nx and ny are the true dimensions of the field. The array data
* contains also ghost layers, so it will have dimensions nx+2 x ny+2 */
int nx;
int ny;
double dx;
double dy;
double **data;
} field;
/* We use here fixed grid spacing */
#define DX 0.01
#define DY 0.01
/* Function prototypes */
double **malloc_2d(int nx, int ny);
void free_2d(double **array);
void set_field_dimensions(field *temperature, int nx, int ny);
void initialize(int argc, char *argv[], field *temperature1,
field *temperature2, int *nsteps);
void generate_field(field *temperature);
void evolve(field *curr, field *prev, double a, double dt);
void write_field(field *temperature, int iter);
void read_field(field *temperature1, field *temperature2,
char *filename);
void copy_field(field *temperature1, field *temperature2);
void swap_fields(field *temperature1, field *temperature2);
void finalize(field *temperature1, field *temperature2);
#endif /* __HEAT_H__ */
/* I/O related functions for heat equation solver */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include "heat.h"
#include "pngwriter.h"
/* Output routine that prints out a picture of the temperature
* distribution. */
void write_field(field *temperature, int iter)
{
char filename[64];
/* The actual write routine takes only the actual data
* (without ghost layers) so we need array for that. */
int height, width;
double **full_data;
int i;
height = temperature->nx;
width = temperature->ny;
/* Copy the inner data */
full_data = malloc_2d(height, width);
for (i = 0; i < temperature->nx; i++)
memcpy(full_data[i], &temperature->data[i + 1][1],
temperature->ny * sizeof(double));
sprintf(filename, "%s_%04d.png", "heat", iter);
save_png(full_data[0], height, width, filename, 'c');
free_2d(full_data);
}
/* Read the initial temperature distribution from a file and
* initialize the temperature fields temperature1 and
* temperature2 to the same initial state. */
void read_field(field *temperature1, field *temperature2, char *filename)
{
FILE *fp;
int nx, ny, i, j;
int count;
fp = fopen(filename, "r");
/* Read the header */
count = fscanf(fp, "# %d %d \n", &nx, &ny);
if (count < 2) {
fprintf(stderr, "Error while reading the input file!\n");
exit(EXIT_FAILURE);
}
set_field_dimensions(temperature1, nx, ny);
set_field_dimensions(temperature2, nx, ny);
/* Allocate arrays (including ghost layers */
temperature1->data = malloc_2d(nx + 2, ny + 2);
temperature2->data = malloc_2d(nx + 2, ny + 2);
/* Read the actual data */
for (i = 1; i < nx + 1; i++) {
for (j = 1; j < ny + 1; j++) {
fscanf(fp, "%lf", &temperature1->data[i][j]);
}
}
/* Set the boundary values */
for (i = 1; i < nx + 1; i++) {
temperature1->data[i][0] = temperature1->data[i][1];
temperature1->data[i][ny + 1] = temperature1->data[i][ny];
}
for (j = 0; j < ny + 2; j++) {
temperature1->data[0][j] = temperature1->data[1][j];
temperature1->data[nx + 1][j] = temperature1->data[nx][j];
}
copy_field(temperature1, temperature2);
fclose(fp);
}
/* Heat equation solver in 2D. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include "heat.h"
int main(int argc, char **argv)
{
double a = 0.5; //!< Diffusion constant
field current, previous; //!< Current and previous temperature fields
double dt; //!< Time step
int nsteps; //!< Number of time steps
int image_interval = 1000; //!< Image output interval
int iter; //!< Iteration counter
double dx2, dy2; //!< delta x and y squared
clock_t start_clock; //!< Time stamps
#pragma omp parallel private(iter)
{
initialize(argc, argv, &current, &previous, &nsteps);
#pragma omp single
{
/* Output the initial field */
write_field(&current, 0);
/* Largest stable time step */
dx2 = current.dx * current.dx;
dy2 = current.dy * current.dy;
dt = dx2 * dy2 / (2.0 * a * (dx2 + dy2));
/* Get the start time stamp */
start_clock = clock();
}
/* Time evolve */
for (iter = 1; iter < nsteps; iter++) {
evolve(&current, &previous, a, dt);
if (iter % image_interval == 0) {
#pragma omp single
write_field(&current, iter);
}
/* Swap current field so that it will be used
as previous for next iteration step */
#pragma omp single
swap_fields(&current, &previous);
}
} /* End of parallel region */
/* Determine the CPU time used for the iteration */
printf("Iteration took %.3f seconds.\n", (double)(clock() - start_clock) /
(double)CLOCKS_PER_SEC);
printf("Reference value at 5,5: %f\n", previous.data[5][5]);
finalize(&current, &previous);
return 0;
}
/* Setup routines for heat equation solver */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include "heat.h"
#include "pngwriter.h"
#define NSTEPS 500 // Default number of iteration steps
/* Initialize the heat equation solver */
void initialize(int argc, char *argv[], field *current,
field *previous, int *nsteps)
{
/*
* Following combinations of command line arguments are possible:
* No arguments: use default field dimensions and number of time steps
* One argument: read initial field from a given file
* Two arguments: initial field from file and number of time steps
* Three arguments: field dimensions (rows,cols) and number of time steps
*/
int rows = 1024; //!< Field dimensions with default values
int cols = 1024;
char input_file[64]; //!< Name of the optional input file
int read_file = 0;
*nsteps = NSTEPS;
#pragma omp single
{
switch (argc) {
case 1:
/* Use default values */
break;
case 2:
/* Read initial field from a file */
strncpy(input_file, argv[1], 64);
read_file = 1;
break;
case 3:
/* Read initial field from a file */
strncpy(input_file, argv[1], 64);
read_file = 1;
/* Number of time steps */
*nsteps = atoi(argv[2]);
break;
case 4:
/* Field dimensions */
rows = atoi(argv[1]);
cols = atoi(argv[2]);
/* Number of time steps */
*nsteps = atoi(argv[3]);