Commit 808cd2b1 authored by Maxwell Cai's avatar Maxwell Cai
Browse files

Update README.md

parent 01f3aaed
......@@ -19,7 +19,7 @@ There are many open-source datasets available for benchmarking TensorFlow, such
## Prerequisites Installation
The prerequsities consists of a list of python packages as shown below. It is recommended to create a python virtual environment (either with `pyenv` or `conda`). The following packages can be installed using the `pip` package management tool:
The prerequsities consists of a list of python packages as shown below. It is recommended to create a python virtual environment (either with `pyenv` or `conda`). In general, the following packages can be installed using the `pip` package management tool:
```
pip install tensorflow
pip install horovod
......@@ -27,7 +27,9 @@ pip install scikit-learn
pip install scikit-image
pip install pandas
```
Note: there is no guarantee of optimal performance when `tensorflow` is installed using `pip`. It is better if `tensorflow` is compiled from source, in which case the compiler will likely be able to take advantage of the advanced instruction sets supported by the processor (e.g., AVX512). An official build instruction can be found at https://www.tensorflow.org/install/source. Sometimes, an HPC center may have a tensorflow module optimized for their hardware, in which case the `pip install tensorflow` line can be replaced with a line like `module load <name_of_the_tensorflow_module>`.
Note: there is no guarantee of optimal performance when TensorFlow is installed using `pip`. It is better if TensorFlow is compiled from source, in which case the compiler will likely be able to take advantage of the advanced instruction sets supported by the processor (e.g., AVX512). An official build instruction can be found at https://www.tensorflow.org/install/source. Sometimes, an HPC center may have a tensorflow module optimized for their hardware, in which case the `pip install tensorflow` line can be replaced with a line like `module load <name_of_the_tensorflow_module>`.
For multi-node training, `MPI` and/or `NCCL` should be installed as well.
## How to benchmark the throughput of a HPC system
......@@ -46,9 +48,9 @@ In the `DeepGalaxy` directory, download the training dataset. Depending on the b
**Step 3**: Run the code on different numbers of workers. For example, the following command executes the code on `np = 4` workers:
```
mpirun -np 4 dg_train.py -f output_bw_512.hdf5 --epochs 20 --noise 0.1 --batch-size 4 --arch EfficientNetB4
mpirun -np 4 python dg_train.py -f output_bw_512.hdf5 --epochs 20 --noise 0.1 --batch-size 4 --arch EfficientNetB4
```
where `output_bw_512.hdf5` is the training dataset downloaded in the previous step. Please change the file name if necessary. One could also change the other parameters, such as `--epochs`, `--batch-size`, and `--arch` according to the size of the benchmark. For example, the `EfficientNetB0` deep neural network is for small HPC systems, `EfficientNetB4` is for medium-size ones, and `EfficientNetB7` is for large systems. Also, shoudl the system memory permits, increasing the `--batch-size` could improve the throughput. If the `--batch-size` parameter is too large, an out-of-memory error could occur.
`output_bw_512.hdf5` is the training dataset downloaded in the previous step. Please change the file name if necessary. One could also change the other parameters, such as `--epochs`, `--batch-size`, and `--arch` according to the size of the benchmark. For example, the `EfficientNetB0` deep neural network is for small HPC systems, `EfficientNetB4` is for medium-size ones, and `EfficientNetB7` is for large systems. Also, shoudl the system memory permits, increasing the `--batch-size` could improve the throughput. If the `--batch-size` parameter is too large, an out-of-memory error could occur.
The benchmark data of the training are written to the file `train_log.txt`.
......
Supports Markdown
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment