From fd657bac53ea3413a721ac2b9246a60a988c8119 Mon Sep 17 00:00:00 2001
From: Thomas Ponweiser <thomas.ponweiser@risc-software.at>
Date: Fri, 22 Apr 2016 01:37:07 +0200
Subject: [PATCH] updated READMEs

---
 read2shmem/README.md | 105 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 105 insertions(+)
 create mode 100644 read2shmem/README.md

diff --git a/read2shmem/README.md b/read2shmem/README.md
new file mode 100644
index 0000000..e670dea
--- /dev/null
+++ b/read2shmem/README.md
@@ -0,0 +1,105 @@
# README - "read2shmem": Collective read to shared memory


## Description

This example deals with a situation that occurs in many HPC applications: managing **large read-only data structures** which are **shared among all processes**. Examples of such data structures include large lookup tables (e.g. of matrices, as used in some molecular dynamics applications) or mesh data structures (when employing a parallelization strategy other than domain decomposition).

To achieve the best performance, it is common to replicate such shared read-only data structures on every MPI process. For a purely MPI-parallelized application, this approach soon becomes infeasible due to the limited memory per core. Threading (e.g. via OpenMP) is therefore often used to overcome this limitation. However, introducing a second level of parallelism adds complexity to the code and frequently brings overhead and performance penalties compared to the purely MPI-parallelized variant. **An alternative to threading** in such scenarios is the use of **shared memory regions**.

This example demonstrates:

 * how to **read a large data structure** from disk using **collective MPI I/O** and
 * how to **save memory** by using **MPI shared memory windows**.

The example code performs the following steps (a simplified sketch is given at the end of the *Running* section below):

 1. Use `MPI_Comm_split_type()` to create one *intranode* sub-communicator per node, containing all processes running on that node.
 2. Use `MPI_Comm_split()` to create an *internode* sub-communicator containing exactly one process from each node.
 3. Use `MPI_File_open()` to collectively open the input file on the *internode* communicator.
 4. Use `MPI_File_get_size()` and `MPI_Win_allocate_shared()` to allocate a shared memory window on each node, large enough to hold the file contents.
 5. Use `MPI_File_read_all()` to collectively read the file contents into the shared memory windows.

## Release Date

2016-04-22

## Version History

 * 2016-04-22: Initial release on the PRACE CodeVault repository

## Contributors

 * Thomas Ponweiser - [thomas.ponweiser@risc-software.at](mailto:thomas.ponweiser@risc-software.at)


## Copyright

This code is available under the Apache License, Version 2.0 - see also the license file in the CodeVault root directory.


## Languages

This sample is entirely written in C.


## Parallelisation

This sample uses MPI-3 for parallelisation.


## Level of the code sample complexity

Intermediate / Advanced


## Compiling

Follow the compilation instructions given in the main directory of the kernel samples directory (`/hpc_kernel_samples/README.md`).

## Running

To run the program, use something similar to:

    mpirun -n [nprocs] ./8_io_read2shmem [inputfile]

Note that `inputfile` is simply treated as a stream of integers and may therefore contain arbitrary content. For example, you can use a file `in.dat` with 512 MB of random content, generated with:

    dd if=/dev/urandom of=in.dat bs=1M count=512
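Internally, the program essentially performs the five steps listed in the *Description* above. The following is a strongly simplified sketch of these steps, not the actual sample code: error handling, the broadcast of the filename, timing and checksum output are omitted, and all variable names are illustrative.

    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int world_rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

        /* Step 1: one sub-communicator per node, containing all processes on that node. */
        MPI_Comm intranode_comm;
        MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                            MPI_INFO_NULL, &intranode_comm);
        int intranode_rank;
        MPI_Comm_rank(intranode_comm, &intranode_rank);

        /* Step 2: an internode communicator containing exactly one process per node
           (the node-local rank 0); all other processes receive MPI_COMM_NULL. */
        MPI_Comm internode_comm;
        MPI_Comm_split(MPI_COMM_WORLD,
                       intranode_rank == 0 ? 0 : MPI_UNDEFINED,
                       world_rank, &internode_comm);

        int *data = NULL;
        MPI_Win win;
        MPI_Offset file_size = 0;

        if (intranode_rank == 0) {
            /* Step 3: the node leaders collectively open the input file. */
            MPI_File fh;
            MPI_File_open(internode_comm, argv[1], MPI_MODE_RDONLY,
                          MPI_INFO_NULL, &fh);

            /* Step 4: query the file size and allocate one shared memory window
               per node; the leader's segment is large enough for the whole file. */
            MPI_File_get_size(fh, &file_size);
            MPI_Win_allocate_shared((MPI_Aint)file_size, sizeof(int),
                                    MPI_INFO_NULL, intranode_comm, &data, &win);

            /* Step 5: collectively read the file contents into the shared window.
               (The count fits into an int for moderate file sizes.) */
            MPI_File_read_all(fh, data, (int)(file_size / sizeof(int)),
                              MPI_INT, MPI_STATUS_IGNORE);
            MPI_File_close(&fh);
        } else {
            /* All other processes contribute a zero-sized segment ... */
            MPI_Win_allocate_shared(0, sizeof(int), MPI_INFO_NULL,
                                    intranode_comm, &data, &win);
            /* ... and obtain a pointer to the leader's segment. */
            MPI_Aint segment_size;
            int disp_unit;
            MPI_Win_shared_query(win, 0, &segment_size, &disp_unit, &data);
        }

        /* Synchronize so that the data written by the node leader is visible
           to all processes on the node before it is used. */
        MPI_Win_fence(0, win);

        /* ... every process on the node can now read 'data' directly ... */

        MPI_Win_free(&win);
        if (internode_comm != MPI_COMM_NULL)
            MPI_Comm_free(&internode_comm);
        MPI_Comm_free(&intranode_comm);
        MPI_Finalize();
        return 0;
    }

The key point of this approach is that only the node-local rank 0 of each node opens and reads the file, while all other processes on the node access the data directly through the shared memory window obtained via `MPI_Win_shared_query()`.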
### Example

If you run

    mpirun -n 8 ./8_io_read2shmem in.dat

the output should look similar to:

    000: Size of MPI_COMM_WORLD: 8

    Broadcasting filename...

    Creating MPI communicators...

    000: Size of intranode_comm: 8

    000: Size of internode_comm: 1

    Opening file /home/tponweis/tmp/512M...

    Creating shared memory region (512 MB)...

    Reading file...
    000: Read 134217728 integers in 0.661318 seconds (774.211703 MB/s)
    000: Checksum: 1190837419

    Cleaning up...
    004: Checksum: 1190837419
    006: Checksum: 1190837419
    002: Checksum: 1190837419
    005: Checksum: 1190837419
    007: Checksum: 1190837419
    001: Checksum: 1190837419
    003: Checksum: 1190837419

    Done.
--
GitLab