
## Test Case A

This test case benchmarks TensorFlow on small-to-medium-sized datasets using a medium-sized deep neural network (DNN). The image resolution is set to (512, 512) px. Training can be carried out on one or more nodes. The DNN has about 17 million parameters, so it should fit into the memory of most GPUs with a `batch_size` of 8 or even 16.
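
As a rough sanity check on the memory claim, the sketch below estimates how much memory the parameters alone occupy. It assumes the weights are stored as 32-bit floats (4 bytes each), which is a common default but is not stated in this README; the function name is illustrative only.

```python
# Rough weight-memory estimate for a model with ~17 million parameters.
# Assumption: parameters are stored as 32-bit floats (4 bytes each).

def weight_memory_mib(num_params: int, bytes_per_param: int = 4) -> float:
    """Return the memory needed to hold the parameters, in MiB."""
    return num_params * bytes_per_param / (1024 ** 2)


NUM_PARAMS = 17_000_000  # approximate figure quoted in this README

weights_mib = weight_memory_mib(NUM_PARAMS)
print(f"weights: {weights_mib:.1f} MiB")
```

Under this assumption the weights take well under 100 MiB. In practice the usable batch size is more likely limited by the activations, which grow with both the batch size and the 512 x 512 input resolution.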

The dataset (2 GB) can be downloaded from: https://surfdrive.surf.nl/files/index.php/s/Mzm28FQ1udG3FG7

To train on a single node, run the following command after allocating the necessary compute resources:
```
python dg_train.py -f output_bw_512.hdf5 --arch EfficientNetB4 --epochs 10 --noise 0.3 --batch-size 4
```
Replace `output_bw_512.hdf5` with the actual dataset file name and adjust the other parameters as needed. A `--batch-size` of 8 may be used if the GPU has 32 GB of memory.

To train on multiple nodes, launch the code with `mpirun` or `mpiexec`. For example, to train the DNN on 2 nodes with 4 GPUs each (8 MPI processes in total):
```
mpirun -np 8 python dg_train.py -f output_bw_512.hdf5 --arch EfficientNetB4 --epochs 10 --noise 0.3 --batch-size 4
```
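
The value passed to `-np` in the example is the number of nodes multiplied by the number of GPUs per node. The sketch below shows one way to derive the command line for other node counts. The helper `build_mpirun_command` is hypothetical and not part of `DeepGalaxy`; it only assembles the same arguments used in the example.

```python
import shlex
from typing import Sequence


def build_mpirun_command(nodes: int, gpus_per_node: int,
                         train_args: Sequence[str]) -> str:
    """Return an mpirun command line that starts one MPI process per GPU."""
    num_procs = nodes * gpus_per_node  # total number of MPI processes
    cmd = ["mpirun", "-np", str(num_procs), "python", "dg_train.py", *train_args]
    return shlex.join(cmd)  # quote arguments safely for the shell


# Same training arguments as in the example command.
args = ["-f", "output_bw_512.hdf5", "--arch", "EfficientNetB4",
        "--epochs", "10", "--noise", "0.3", "--batch-size", "4"]

command = build_mpirun_command(nodes=2, gpus_per_node=4, train_args=args)
print(command)
```

Depending on the MPI implementation and the job scheduler, additional options may be needed to place the processes on the intended nodes; those are site-specific and not covered here.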

On NVIDIA GPUs, `DeepGalaxy` can automatically bind each MPI process to a GPU, so `CUDA_VISIBLE_DEVICES` does not need to be set explicitly.
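
This README does not describe how the binding is done. As a rough illustration of the general idea (not `DeepGalaxy`'s actual code), a process can read its rank within the node from an environment variable set by the launcher and use it as a GPU index. The function names below are hypothetical, and which environment variable is present depends on the MPI implementation or scheduler in use.

```python
import os

# Environment variables that common launchers set to the rank of a process
# within its node. Only one of them is typically present.
LOCAL_RANK_VARS = (
    "OMPI_COMM_WORLD_LOCAL_RANK",  # Open MPI
    "MV2_COMM_WORLD_LOCAL_RANK",   # MVAPICH2
    "MPI_LOCALRANKID",             # Intel MPI
    "SLURM_LOCALID",               # Slurm (srun)
)


def local_rank(environ=os.environ) -> int:
    """Return the node-local rank, or 0 when no launcher variable is set."""
    for name in LOCAL_RANK_VARS:
        if name in environ:
            return int(environ[name])
    return 0  # fallback for a plain single-process run


def gpu_index_for(rank_on_node: int, gpus_per_node: int) -> int:
    """Map a node-local rank to a GPU index, wrapping around if needed."""
    return rank_on_node % gpus_per_node


# Example with a simulated environment: the 4th process on a 4-GPU node.
fake_env = {"OMPI_COMM_WORLD_LOCAL_RANK": "3"}
print(gpu_index_for(local_rank(fake_env), gpus_per_node=4))
```

With one MPI process per GPU, as in the `mpirun -np 8` example, this mapping gives each process on a node a distinct GPU.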