## Test Case B

This test case benchmarks TensorFlow on small-to-medium-sized datasets with a large deep neural network (DNN). The image resolution is (512, 512) px, and training can run on one or more nodes. The DNN has about 64 million parameters; for comparison, the popular `ResNet50` CNN has about 23 million. With a network of this size, a batch size of 2 or 4 is recommended for most GPUs.

The dataset can be downloaded at: (2GB)

If the training runs on a single node, the following command (issued after allocating the compute resources) is enough:

    python -f output_bw_512.hdf5 --arch EfficientNetB7 --epochs 10 --noise 0.3 --batch-size 4

Replace `output_bw_512.hdf5` with the actual dataset file name, and adjust the other parameters as needed. A `--batch-size` of 8 may be used if the GPU has 32 GB of memory.

If multiple nodes are used, the code must be run with `mpirun` or `mpiexec`. For example, to train the DNN on 2 nodes, each with 4 GPUs (8 MPI processes in total, one per GPU):

    mpirun -np 8 python -f output_bw_512.hdf5 --arch EfficientNetB7 --epochs 10 --noise 0.3 --batch-size 4
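
The `-np 8` in the example above is the number of nodes multiplied by the number of GPUs per node. The sketch below shows that arithmetic for other allocations. The helper `build_mpirun_command` is hypothetical and not part of `DeepGalaxy`; it assumes one MPI process per GPU and reuses the arguments exactly as they appear after `python` in the command above.

```python
import shlex


def build_mpirun_command(nodes, gpus_per_node, train_args):
    """Return an mpirun argument list that starts one MPI process per GPU.

    Hypothetical helper for illustration; not part of DeepGalaxy.
    """
    if nodes < 1 or gpus_per_node < 1:
        raise ValueError("nodes and gpus_per_node must be positive integers")
    total_processes = nodes * gpus_per_node  # e.g. 2 nodes * 4 GPUs = 8
    return ["mpirun", "-np", str(total_processes), "python", *train_args]


# Arguments copied from the multi-node example above.
train_args = [
    "-f", "output_bw_512.hdf5",
    "--arch", "EfficientNetB7",
    "--epochs", "10",
    "--noise", "0.3",
    "--batch-size", "4",
]

command = build_mpirun_command(nodes=2, gpus_per_node=4, train_args=train_args)
print(shlex.join(command))
```

For 2 nodes with 4 GPUs each, this prints the same `mpirun -np 8 ...` line as the example above. Depending on the MPI implementation and the job scheduler, extra options (such as a host file) may be needed to spread the processes across the nodes.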

If NVIDIA GPUs are used, `DeepGalaxy` can automatically bind each MPI process to a GPU, so `CUDA_VISIBLE_DEVICES` does not need to be set explicitly.
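
To illustrate what binding an MPI process to a GPU means, here is a minimal sketch of one common approach: map the process's local rank on its node to a GPU index. This is not `DeepGalaxy`'s actual implementation, which the text above does not describe. The function names are hypothetical, and the environment variables are the local-rank variables set by Open MPI, MVAPICH2, Intel MPI, and Slurm.

```python
import os

# Local-rank variables set by common launchers
# (Open MPI, MVAPICH2, Intel MPI, Slurm). Assumed for illustration.
LOCAL_RANK_VARS = (
    "OMPI_COMM_WORLD_LOCAL_RANK",
    "MV2_COMM_WORLD_LOCAL_RANK",
    "MPI_LOCALRANKID",
    "SLURM_LOCALID",
)


def local_rank(environ):
    """Return this process's rank within its node, or 0 if no launcher set it."""
    for name in LOCAL_RANK_VARS:
        if name in environ:
            return int(environ[name])
    return 0


def gpu_for_process(environ, gpus_per_node):
    """Map the local rank onto a GPU index in round-robin order."""
    return local_rank(environ) % gpus_per_node


def bind_to_gpu(gpus_per_node, environ=None):
    """Restrict the current process to a single GPU via CUDA_VISIBLE_DEVICES."""
    if environ is None:
        environ = os.environ
    gpu = gpu_for_process(environ, gpus_per_node)
    environ["CUDA_VISIBLE_DEVICES"] = str(gpu)
    return gpu
```

A binding of this kind only takes effect if it runs before the deep learning framework initializes CUDA, which is one reason it is convenient for the training code to handle it automatically.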