# README - Wireworld Example (C Version)

## Description

For a general description of *Wireworld* and the file format, see the `README.md` file in the parent directory.

This code sample demonstrates:

* How to use **MPI Cartesian topologies** and associated convenience functions (i.e. `MPI_Cart_create`, `MPI_Dims_create`, `MPI_Cart_get` and `MPI_Cart_rank`); see the sketch after the file overview below.
* How to use **MPI parallel I/O** for collectively reading and writing 2-dimensional array data (i.e. `MPI_File_open`, `MPI_File_set_errhandler`, `MPI_File_set_view`, `MPI_File_read_all` and `MPI_File_write_all`).
* How to implement **halo exchange** (i.e. the exchange of *ghost cell data*) using three different approaches (each with and without overlap of communication and computation):
   1. Using an **MPI graph communicator** and **neighborhood collective operations** (i.e. `MPI_Dist_graph_create_adjacent` and `MPI_[N/In]eighbor_alltoallw`).
   1. Using **MPI point-to-point** communication (i.e. `MPI_Isend`, `MPI_Irecv` and `MPI_Waitall`).
   1. Using **MPI persistent communication requests** (i.e. `MPI_Send_init`, `MPI_Recv_init`, `MPI_Startall` and `MPI_Waitall`).
* How to use the **MPI subarray datatype** (i.e. `MPI_Type_create_subarray`) for I/O and halo exchange.

The code sample is structured as follows:

* `configuration.c`, `configuration.h`: Command line parsing and basic logging facilities.
* `io.c`, `io.h`: Collective I/O of the cellular automaton state.
* `main.c`: The main program.
* `mpitypes.c`, `mpitypes.h`: Creation of MPI datatypes.
* `simulation.c`, `simulation.h`: Demonstration of 6 approaches for implementing a Wireworld cellular automaton.
* `world.c`, `world.h`: Initialization of the cellular automaton and associated MPI objects.
   * `create_cart_comm`: Creation of the MPI Cartesian communicator.
   * `world_init`: Domain decomposition and buffer allocation.
   * `world_init_io_type`: Creation of the MPI subarray datatype for I/O.
   * `world_init_neighborhood`: Identification of neighboring processes, creation of the MPI graph communicator and MPI datatypes for halo exchange.
   * `world_init_persistent_requests`: Initialization of persistent requests for halo data exchange.
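To give a feel for the Cartesian topology and subarray features listed above, the following minimal sketch (not taken from the sample's sources) sets up a 2D Cartesian communicator and a subarray datatype describing one process's block of the global grid. The global grid size and the assumption that it divides evenly among the processes are placeholders chosen for this example.

```c
/*
 * Minimal, illustrative sketch -- NOT code taken from this sample.
 * It creates a 2D Cartesian communicator (MPI_Dims_create / MPI_Cart_create /
 * MPI_Cart_get) and a subarray datatype (MPI_Type_create_subarray) describing
 * this process's block of the global grid, as one would use with
 * MPI_File_set_view for collective I/O. Grid sizes are placeholder values.
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int nprocs, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Let MPI pick a balanced 2D process grid (e.g. 3 x 4 for 12 ranks). */
    int dims[2] = {0, 0};
    MPI_Dims_create(nprocs, 2, dims);

    /* Non-periodic Cartesian communicator; reordering of ranks allowed. */
    int periods[2] = {0, 0};
    MPI_Comm cart_comm;
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &cart_comm);

    /* Coordinates of this process within the grid. */
    int coords[2];
    MPI_Cart_get(cart_comm, 2, dims, periods, coords);

    /* Illustrative global grid size, assumed evenly divisible by dims[]. */
    int sizes[2]    = {600, 960};
    int subsizes[2] = {sizes[0] / dims[0], sizes[1] / dims[1]};
    int starts[2]   = {coords[0] * subsizes[0], coords[1] * subsizes[1]};

    /* One cell = one char; the subarray selects this rank's block within
     * the global row-major array, e.g. as a file view for
     * MPI_File_read_all / MPI_File_write_all. */
    MPI_Datatype filetype;
    MPI_Type_create_subarray(2, sizes, subsizes, starts,
                             MPI_ORDER_C, MPI_CHAR, &filetype);
    MPI_Type_commit(&filetype);

    if (rank == 0)
        printf("Grid of processes: %d x %d\n", dims[0], dims[1]);

    MPI_Type_free(&filetype);
    MPI_Comm_free(&cart_comm);
    MPI_Finalize();
    return 0;
}
```

In the sample itself, the corresponding steps are performed by `create_cart_comm` and `world_init_io_type` (see the file overview above).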
## Release Date

2016-10-24

## Version History

* 2016-10-24: Initial Release on PRACE CodeVault repository

## Contributors

* Thomas Ponweiser - [thomas.ponweiser@risc-software.at](mailto:thomas.ponweiser@risc-software.at)

## Copyright

This code is available under the Apache License, Version 2.0 - see also the license file in the CodeVault root directory.

## Languages

This sample is entirely written in C.

## Parallelisation

This sample uses MPI-3 for parallelization.

## Level of the code sample complexity

Intermediate / Advanced

## Compiling

Follow the compilation instructions given in the main directory of the kernel samples directory (`/hpc_kernel_samples/README.md`).

## Running

Assuming that the input file `primes.wi` is in your current working directory, to run the program you may use something similar to

```
mpirun -n [nprocs] ./5_structured_wireworld_c primes.wi
```

either on the command line or in your batch script. Note that only the input file's basename (omitting the file extension) is passed to the program.

### Command line arguments

* `-v [0-3]`: Specify the verbosity level - 0: OFF; 1: INFO (Default); 2: DEBUG; 3: TRACE.
* `--sparse`: Use MPI neighborhood collective operations, a.k.a. *sparse collective operations*, i.e. `MPI_[N/In]eighbor_alltoallw`, for halo data exchange (Default).
* `--p2p`: Use MPI point-to-point communication for halo data exchange, i.e. `MPI_Isend`, `MPI_Irecv` and `MPI_Waitall`; see the sketch at the end of this README.
* `--persist`: Use MPI persistent requests (created with `MPI_Send_init` and `MPI_Recv_init`) for halo data exchange (using `MPI_Startall` and `MPI_Waitall`).
* `--overlap`: Overlap communication and computation, i.e. compute inner cells while doing halo data exchange (Default).
* `--no-overlap`: Do not overlap communication and computation.
* `-x [X]`, `--nprocs-x [X]`: Use `X` processes in x-direction for the MPI Cartesian communicator (optional).
* `-y [Y]`, `--nprocs-y [Y]`: Use `Y` processes in y-direction for the MPI Cartesian communicator (optional).
* `-i [N]`, `--iterations [N]`: Do `N` iterations, creating `N` output files with the current state of the cellular automaton; Default: 1.
* `-g [G]`, `--generations-per-iteration [G]`: Number of generations to simulate between output iterations; Default: 5000.

For large numbers as arguments to the option `-g`, the suffixes 'k' or 'M' may be used. For example, `-g 50k` specifies 50 thousand and `-g 1M` specifies one million generations per output iteration.

### Example

If you run

```
mpirun -n 12 ./5_structured_wireworld_c -i 10 -g 50k -v 2 --nprocs-x 3 primes.wi
```

the output should look similar to

```
Configuration:
 * Verbosity level: DEBUG (2)
 * Input file: primes.wi
 * Transmission mode: Sparse collective - MPI_Dist_graph_create_adjacent / MPI_[N/In]eighbor_alltoallw
 * Overlap mode: Overlapping communication and computation
 * Grid of processes: 3 x 4
 * Number of iterations: 10
 * Generations per iteration: 50000

Reading 'primes.wi'...
Read header (8 characters). Global size: 632 x 958
Creating Cartesian communicator...
INFO: MPI reordered ranks: NO
Creating MPI distributed graph communicator...
Running 10 iterations with 50000 generations per iteration.
Generation 50000 - written 'primes+000050000.wi'.
Generation 100000 - written 'primes+000100000.wi'.
Generation 150000 - written 'primes+000150000.wi'.
Generation 200000 - written 'primes+000200000.wi'.
Generation 250000 - written 'primes+000250000.wi'.
Generation 300000 - written 'primes+000300000.wi'.
Generation 350000 - written 'primes+000350000.wi'.
Generation 400000 - written 'primes+000400000.wi'.
Generation 450000 - written 'primes+000450000.wi'.
Generation 500000 - written 'primes+000500000.wi'.

Statistics:
 * Generations per second: 2750
 * Net simulation time (s): 181.787989
 * Net I/O time (s): 0.034390
 * Total time (s): 181.822473

Done.
```
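## Appendix: Point-to-point halo exchange sketch

The following minimal sketch (not taken from the sample's sources) illustrates the idea behind the `--p2p` transmission mode for the north/south part of the halo exchange: post `MPI_Irecv`/`MPI_Isend` for the ghost rows and complete them with `MPI_Waitall`. The padded buffer layout (one ghost layer around an `ny_local` x `nx_local` block of chars) and the tag values are assumptions made for this example.

```c
/*
 * Minimal, illustrative sketch -- NOT code taken from this sample.
 * Exchanges the top and bottom ghost rows with the north/south neighbors
 * in a Cartesian communicator using MPI_Irecv / MPI_Isend / MPI_Waitall.
 * Assumed buffer layout: (ny_local + 2) rows of (nx_local + 2) chars,
 * i.e. one ghost layer around the interior block.
 */
#include <mpi.h>

void exchange_north_south(char *cells, int ny_local, int nx_local,
                          MPI_Comm cart_comm)
{
    /* Neighbor ranks along dimension 0; MPI_PROC_NULL at the domain edge. */
    int north, south;
    MPI_Cart_shift(cart_comm, 0, 1, &north, &south);

    const int row = nx_local + 2;                   /* padded row length   */
    char *first_row = cells + 1 * row;              /* first interior row  */
    char *last_row  = cells + ny_local * row;       /* last interior row   */
    char *ghost_n   = cells;                        /* ghost row on top    */
    char *ghost_s   = cells + (ny_local + 1) * row; /* ghost row on bottom */

    MPI_Request reqs[4];
    MPI_Irecv(ghost_n,   row, MPI_CHAR, north, 0, cart_comm, &reqs[0]);
    MPI_Irecv(ghost_s,   row, MPI_CHAR, south, 1, cart_comm, &reqs[1]);
    MPI_Isend(last_row,  row, MPI_CHAR, south, 0, cart_comm, &reqs[2]);
    MPI_Isend(first_row, row, MPI_CHAR, north, 1, cart_comm, &reqs[3]);

    /* With --overlap, inner cells would be updated here before waiting;
     * with --no-overlap, the code simply waits for all four requests. */
    MPI_Waitall(4, reqs, MPI_STATUSES_IGNORE);
}
```

The sample's `--sparse` and `--persist` modes replace these calls with `MPI_[N/In]eighbor_alltoallw` on a distributed graph communicator and with persistent requests started via `MPI_Startall`, respectively.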