Commit ac7624f3 authored by Thomas Ponweiser

structured_grids/cellular_automaton: added readme for wireworld C example

parent 67ce0be8
# README - Wireworld Example (C Version)
## Description
For a general description of *Wireworld* and the file format, see the 'README.md' file in the parent directory.
This code sample demonstrates:
* How to use **MPI Cartesian topologies** and associated convenience functions (i.e. `MPI_Cart_create`, `MPI_Dims_create`, `MPI_Cart_get` and `MPI_Cart_rank`)
* How to use **MPI parallel I/O** for collectively reading and writing 2-dimensional array data (i.e. `MPI_File_open`, `MPI_File_set_errhandler`, `MPI_File_set_view`, `MPI_File_read_all` and `MPI_File_write_all`).
* How to implement **halo exchange** (i.e. the exchange of *ghost cell data*) using three different approaches (each with and without overlap of communication and computation):
  1. Using **MPI Graph communicator** and **Neighborhood collective operations** (i.e. `MPI_Dist_graph_create_adjacent` and `MPI_[N/In]eighbor_alltoallw`).
  1. Using **MPI Point-to-Point** communication (i.e. `MPI_Isend`, `MPI_Irecv` and `MPI_Waitall`).
  1. Using **MPI persistent communication requests** (i.e. `MPI_Send_init`, `MPI_Recv_init`, `MPI_Startall` and `MPI_Waitall`).
* How to use the **MPI subarray datatype** (i.e. `MPI_Type_create_subarray`) for I/O and halo exchange.
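To make the role of the subarray datatype in halo exchange more concrete, the following minimal sketch computes the `sizes`, `subsizes` and `starts` arrays that a call to `MPI_Type_create_subarray` (with `ndims = 2` and C ordering) would take for one neighbor direction. It assumes a halo width of one cell and a local array of `(ny + 2) x (nx + 2)` cells; the names `region_t`, `axis_range` and `halo_region` are hypothetical and need not match what `mpitypes.c` actually does.

```c
#include <assert.h>

/* Hypothetical helper type: one rectangular region of the local array, in the
 * form expected by MPI_Type_create_subarray (ndims = 2, MPI_ORDER_C). */
typedef struct {
    int sizes[2];    /* full local array incl. halo: {rows, columns} */
    int subsizes[2]; /* extent of the region: {rows, columns} */
    int starts[2];   /* index of the region's first cell: {row, column} */
} region_t;

/* Start index and length along one axis for direction d in {-1, 0, +1}.
 * n is the number of owned cells along that axis (assumed halo width: 1).
 * send != 0 selects owned border cells, send == 0 selects ghost cells. */
void axis_range(int d, int n, int send, int *start, int *len)
{
    assert(d >= -1 && d <= 1 && n > 0);
    if (d == 0) {            /* full owned range along this axis */
        *start = 1;
        *len = n;
    } else if (d < 0) {      /* low side */
        *start = send ? 1 : 0;
        *len = 1;
    } else {                 /* high side */
        *start = send ? n : n + 1;
        *len = 1;
    }
}

/* Region exchanged with the neighbor in direction (dx, dy). */
region_t halo_region(int nx, int ny, int dx, int dy, int send)
{
    region_t r;
    r.sizes[0] = ny + 2;
    r.sizes[1] = nx + 2;
    axis_range(dy, ny, send, &r.starts[0], &r.subsizes[0]);
    axis_range(dx, nx, send, &r.starts[1], &r.subsizes[1]);
    return r;
}
```

For each of the eight neighbor directions, one send region and one receive region obtained this way could be turned into a datatype, so that the exchange itself needs no manual packing or unpacking.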
The code sample is structured as follows:
* `configuration.c`, `configuration.h`: Command line parsing and basic logging facilities.
* `io.c`, `io.h`: Collective I/O of the cellular automaton state.
* `main.c`: The main program.
  * `broadcast_configuration`: Broadcasting the parsed command line arguments.
  * `create_cart_comm`: Creation of the MPI Cartesian communicator.
* `mpitypes.c`, `mpitypes.h`: Creation of MPI datatypes.
* `simulation.c`, `simulation.h`: Demonstration of 6 approaches for implementing a Wireworld cellular automaton.
* `world.c`, `world.h`: Initialization of the cellular automaton and associated MPI objects.
  * `world_init`: Domain decomposition and buffer allocation.
  * `world_init_io_type`: Creation of the MPI subarray datatype for I/O.
  * `world_init_neighborhood`: Identification of neighboring processes, creation of MPI graph communicator.
  * `world_init_persistent_requests`: Initialization of persistent requests for halo data exchange.
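As an illustration of the domain decomposition step, here is a minimal sketch of a one-dimensional block partition, applied once per axis of the Cartesian grid. The helper name `block_range` and the choice to give the remainder cells to the lowest-ranked processes are assumptions; the scheme actually used in `world_init` may differ.

```c
#include <assert.h>

/* Hypothetical helper: split `global` cells along one axis over `nprocs`
 * processes. The first (global % nprocs) processes receive one extra cell.
 * On return, *start is the global index of the first owned cell and *count
 * the number of owned cells for the process at Cartesian coordinate `coord`. */
void block_range(int global, int nprocs, int coord, int *start, int *count)
{
    assert(global >= 0 && nprocs > 0 && coord >= 0 && coord < nprocs);
    int base = global / nprocs;   /* cells every process gets */
    int rest = global % nprocs;   /* leftover cells to distribute */
    *count = base + (coord < rest ? 1 : 0);
    *start = coord * base + (coord < rest ? coord : rest);
}
```

With a global width of 632 cells on 3 processes, this yields blocks of 211, 211 and 210 cells starting at columns 0, 211 and 422. The resulting start and count per axis are the kind of values that would feed the `starts` and `subsizes` arguments of the subarray datatype used for I/O.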
## Release Date
2016-10-24
## Version History
* 2016-10-24: Initial Release on PRACE CodeVault repository
## Contributors
* Thomas Ponweiser - [thomas.ponweiser@risc-software.at](mailto:thomas.ponweiser@risc-software.at)
## Copyright
This code is available under Apache License, Version 2.0 - see also the license file in the CodeVault root directory.
## Languages
This sample is entirely written in C.
## Parallelisation
This sample uses MPI-3 for parallelization.
## Level of the code sample complexity
Intermediate / Advanced
## Compiling
Follow the compilation instructions given in the README of the kernel samples root directory (`/hpc_kernel_samples/README.md`).
## Running
Assuming that the input file `primes.wi` is in your current working directory, to run the program you may use something similar to
```
mpirun -n [nprocs] ./5_structured_wireworld_c primes
```
either on the command line or in your batch script. Note that only the input file's basename (omitting the file extension) is passed to the program.
### Command line arguments
* `-v [0-3]`: Specify the verbosity level - 0: OFF; 1: INFO (Default); 2: DEBUG; 3: TRACE
* `--sparse`: Use MPI neighborhood collective operations, a.k.a. *sparse collective operations*, i.e. `MPI_[N/In]eighbor_alltoallw` for halo data exchange (Default).
* `--p2p`: Use MPI Point-to-Point communication for halo data exchange, i.e. `MPI_Isend`, `MPI_Irecv` and `MPI_Waitall`.
* `--persist`: Use MPI persistent requests (created with `MPI_Send_init` and `MPI_Recv_init`) for halo data exchange (using `MPI_Startall` and `MPI_Waitall`).
* `--overlap`: Overlap communication and computation, i.e. compute inner cells while doing halo data exchange (Default).
* `--no-overlap`: Do not overlap communication and computation.
* `-x [X]`, `--nprocs-x [X]`: Use `X` processes in x-direction for MPI Cartesian communicator (optional).
* `-y [Y]`, `--nprocs-y [Y]`: Use `Y` processes in y-direction for MPI Cartesian communicator (optional).
* `-i [N]`, `--iterations [N]`: Do `N` iterations, creating `N` output files with the current state of the cellular automaton; Default: 1.
* `-g [G]`, `--generations-per-iteration [G]`: Number of generations to simulate between output iterations; Default: 5000.
For large values of the option `-g`, the suffixes `k` (thousand) and `M` (million) may be used. For example, `-g 50k` specifies fifty thousand and `-g 1M` specifies one million generations per output iteration.
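The idea behind `--overlap` can be sketched without any MPI calls: cells that do not touch the ghost border can be updated while the halo exchange is still in flight, and only the outermost owned cells have to wait for it. The sketch below assumes a one-cell ghost border, a row-major `(ny + 2) x (nx + 2)` array, and a hypothetical cell encoding (`EMPTY`, `HEAD`, `TAIL`, `WIRE`); all function names are illustrative and not taken from `simulation.c`.

```c
#include <assert.h>

/* Hypothetical cell encoding; the real file format is described in the
 * README of the parent directory. */
enum { EMPTY = 0, HEAD = 1, TAIL = 2, WIRE = 3 };

/* Wireworld rule for cell (x, y): head -> tail, tail -> wire, and a wire
 * becomes a head if exactly one or two of its eight neighbors are heads. */
char next_state(const char *old, int nx, int x, int y)
{
    int w = nx + 2, heads = 0;
    char c = old[y * w + x];
    if (c == EMPTY) return EMPTY;
    if (c == HEAD)  return TAIL;
    if (c == TAIL)  return WIRE;
    for (int dy = -1; dy <= 1; dy++)
        for (int dx = -1; dx <= 1; dx++)
            if ((dx || dy) && old[(y + dy) * w + (x + dx)] == HEAD)
                heads++;
    return (heads == 1 || heads == 2) ? HEAD : WIRE;
}

/* Update the inclusive rectangle [x0, x1] x [y0, y1] of owned cells. */
void update_rect(const char *old, char *next, int nx,
                 int x0, int x1, int y0, int y1)
{
    for (int y = y0; y <= y1; y++)
        for (int x = x0; x <= x1; x++)
            next[y * (nx + 2) + x] = next_state(old, nx, x, y);
}

/* Inner cells read no ghost data: safe to compute during the exchange. */
void update_inner(const char *old, char *next, int nx, int ny)
{
    assert(nx >= 2 && ny >= 2);
    update_rect(old, next, nx, 2, nx - 1, 2, ny - 1);
}

/* Boundary cells read ghost cells: compute after the exchange completed. */
void update_boundary(const char *old, char *next, int nx, int ny)
{
    assert(nx >= 2 && ny >= 2);
    update_rect(old, next, nx, 1, nx, 1, 1);         /* top row */
    update_rect(old, next, nx, 1, nx, ny, ny);       /* bottom row */
    update_rect(old, next, nx, 1, 1, 2, ny - 1);     /* left column */
    update_rect(old, next, nx, nx, nx, 2, ny - 1);   /* right column */
}
```

In an overlapping generation step, the order would be: start the halo exchange, call `update_inner`, wait for the exchange to complete, then call `update_boundary`. Without overlap, the exchange is completed first and all owned cells are updated afterwards.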
### Example
If you run
```
mpirun -n 12 ./5_structured_wireworld_c -i 10 -g 50k -v 2 --nprocs-x 3 primes
```
the output should look similar to
```
Configuration:
* Verbosity level: DEBUG (2)
* Input file: primes.wi
* Transmission mode: Sparse collective - MPI_Dist_graph_create_adjacent / MPI_[N/In]eighbor_alltoallw
* Overlap mode: Overlapping communication and computation
* Grid of processes: 3 x 4
* Number of iterations: 10
* Generations per iteration: 50000
Reading 'primes.wi'...
Read header (8 characters).
Global size: 632 x 958
Creating Cartesian communicator...
INFO: MPI reordered ranks: NO
Creating MPI distributed graph communicator...
Running 10 iterations with 50000 generations per iteration.
Generation 50000 - written 'primes+000050000.wi'.
Generation 100000 - written 'primes+000100000.wi'.
Generation 150000 - written 'primes+000150000.wi'.
Generation 200000 - written 'primes+000200000.wi'.
Generation 250000 - written 'primes+000250000.wi'.
Generation 300000 - written 'primes+000300000.wi'.
Generation 350000 - written 'primes+000350000.wi'.
Generation 400000 - written 'primes+000400000.wi'.
Generation 450000 - written 'primes+000450000.wi'.
Generation 500000 - written 'primes+000500000.wi'.
Statistics:
* Generations per second: 2750
* Net simulation time (s): 181.787989
* Net I/O time (s): 0.034390
* Total time (s): 181.822473
Done.
```
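The output file names in the log appear to consist of the input basename, a `+`, the generation number zero-padded to nine digits, and the `.wi` extension. A minimal sketch of that naming scheme, assuming this inferred pattern is correct (the helper `output_filename` is hypothetical and not taken from `io.c`):

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper; the format string is inferred from the log lines
 * above (e.g. 'primes+000050000.wi'), not taken from the sample's source.
 * Returns the number of characters snprintf would have written. */
int output_filename(char *buf, size_t size, const char *basename, long generation)
{
    assert(buf != NULL && basename != NULL && generation >= 0);
    return snprintf(buf, size, "%s+%09ld.wi", basename, generation);
}
```

Zero-padding keeps the files of one run in generation order when they are sorted by name.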
@@ -172,9 +172,9 @@ long parse_long(const char *str)
     char* p;
     long result = strtol(str, &p, 10);
     if(*p == 'k') {
-        result <<= 10;
+        result *= 1000;
     } else if(*p == 'M') {
-        result <<= 20;
+        result *= 1000000;
     }
     return result;
 }
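The effect of this change can be checked with a stand-alone copy of the suffix logic. With the shift-based variant, `50k` evaluates to 50 * 1024 = 51200, whereas the README documents `k` as a decimal thousand. The function name `parse_count` below is hypothetical, chosen so as not to clash with the sample's `parse_long`; the sketch mirrors the hunk and, like it, does no further validation of the input string.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-alone sketch of the suffix handling with decimal multipliers.
 * Hypothetical name; mirrors the logic of the hunk above. */
long parse_count(const char *str)
{
    char *p;
    assert(str != NULL);
    long result = strtol(str, &p, 10);  /* p points at the first unparsed char */
    if (*p == 'k') {
        result *= 1000;                 /* decimal thousand, not 1024 */
    } else if (*p == 'M') {
        result *= 1000000;              /* decimal million, not 1048576 */
    }
    return result;
}
```

For example, `parse_count("50k")` gives 50000 and `parse_count("1M")` gives 1000000, matching the values described for the `-g` option.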