Commit d5c40bfe authored by Valeriu Codreanu

added v1.2 of UEABS

parent 5b54241d
# Unified European Applications Benchmark Suite, version 1.2
The Unified European Application Benchmark Suite (UEABS) is a set of 12 application codes taken from the pre-existing PRACE and DEISA application benchmark suites to form a single suite. Its objective is to provide a set of scalable, currently relevant and publicly available codes and datasets, of a size that can realistically be run on large systems, and maintained into the future. This work has been undertaken by Task 7.4 "Unified European Applications Benchmark Suite for Tier-0 and Tier-1" in the PRACE Second Implementation Phase (PRACE-2IP) project and will be updated and maintained by subsequent PRACE Implementation Phase projects.
For more details of the codes and datasets, and sample results, please see http://www.prace-ri.eu/IMG/pdf/d7.4_3ip.pdf
Release notes for version 1.2, released on 31st October 2016 as a result of PRACE-4IP activities.
Changes from version 1.1 are as follows:

- GENE: new version of code and additional new dataset.
- GPAW: new version of code and new dataset.
- GROMACS: new version of code and updated dataset.
- NAMD: new version of code and minor build and run instructions updates.
- NEMO: new version of code and replaced dataset.
Release notes for version 1.1, released on 31st May 2014 as a result of PRACE-3IP activities.
Changes from version 1.0 are as follows:
The codes composing the UEABS are:
- [GROMACS](#gromacs)
- [NAMD](#namd)
- [NEMO](#nemo)
- [QCD](#qcd)
- [Quantum Espresso](#espresso)
- [SPECFEM3D](#specfem3d)
Instructions for obtaining GPAW and its test set for PRACE benchmarking
GPAW is licensed under the GPL, so there are no licensing issues.
NOTE: This benchmark uses version 0.11 of GPAW. For instructions on installing the
latest version, please visit:
https://wiki.fysik.dtu.dk/gpaw/install.html
Software requirements
=====================
* Python
  * version 2.6-3.5 required
  * this benchmark uses version 2.7.9
* NumPy
  * this benchmark uses version 1.11.0
* ASE (Atomic Simulation Environment)
  * this benchmark uses 3.9.0
* LibXC
  * this benchmark uses version 2.0.1
* BLAS and LAPACK libraries
  * this benchmark uses Intel MKL from Intel Composer Studio 2015
* MPI library (optional, for increased performance using parallel processes)
  * this benchmark uses Intel MPI from Intel Composer Studio 2015
* FFTW (optional, for increased performance)
  * this benchmark uses Intel MKL from Intel Composer Studio 2015
* BLACS and ScaLAPACK (optional, for increased performance)
  * this benchmark uses Intel MKL from Intel Composer Studio 2015
* HDF5 (optional, library for parallel I/O and for saving files in HDF5 format)
  * this benchmark uses 1.8.14
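
As a quick sanity check (not part of the benchmark itself), a one-liner like the following can confirm that the Python stack matches the versions listed above; it assumes that 'python' on your PATH is the interpreter GPAW will be built against:

python -c "import sys, numpy, ase; print(sys.version); print(numpy.__version__); print(ase.__version__)"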
Obtaining the source code
=========================
* The specific version of GPAW used in this benchmark can be obtained from:
  https://gitlab.com/gpaw/gpaw/tags/0.11.0
* Installation instructions can be found at:
  https://wiki.fysik.dtu.dk/gpaw/install.html
* For platform specific instructions, please refer to:
  https://wiki.fysik.dtu.dk/gpaw/platforms/platforms.html
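
For orientation only, a typical sequence for fetching and installing the benchmark version might look roughly like the sketch below; the installation prefix is an example, and the platform-specific configuration (compilers, library paths, a possible customize.py) should follow the installation instructions linked above:

# fetch the 0.11.0 tag of GPAW (a tarball download from the GitLab tag page works equally well)
git clone https://gitlab.com/gpaw/gpaw.git
cd gpaw
git checkout 0.11.0
# build and install into a user prefix (example path)
python setup.py install --prefix=$HOME/opt/gpaw-0.11.0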
Support
=======
* Help regarding the benchmark can be requested from adem.tekin@be.itu.edu.tr
This benchmark set contains scaling tests for electronic structure simulation software GPAW.
More information on GPAW can be found at https://wiki.fysik.dtu.dk/gpaw
Small Scaling Test: carbone_nanotube.py
=======================================
A ground state calculation for a (6-6-10) carbon nanotube, requiring 30 SCF iterations.
The ScaLAPACK calculations are parallelized using a 4/4/64 partitioning scheme.
This system scales reasonably up to 512 cores, running to completion in under two minutes on a 2015-era x86 architecture cluster.
For scalability testing, the relevant timer in the text output 'out_nanotube_hXXX_kYYY_pZZZ' (where XXX denotes the grid spacing, YYY the Brillouin-zone sampling and ZZZ the number of cores used) is 'Total Time'.
Medium Scaling Test: C60_Pb100.py and C60_Pb100_POSCAR
======================================================
A ground state calculation for a fullerene on a Pb(100) surface, requiring ~100 SCF iterations.
In this example, the parameters of the parallelization scheme for the ScaLAPACK calculations are chosen automatically (using the keyword 'sl_auto: True').
This system scales reasonably up to 1024 cores, running to completion in under thirteen minutes on a 2015-era x86 architecture cluster.
For scalability testing, the relevant timer in the text output 'out_C60_Pb100_hXXX_kYYY_pZZZ' (where XXX denotes the grid spacing, YYY the Brillouin-zone sampling and ZZZ the number of cores used) is 'Total Time'.
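
Assuming the output files are named as described above, the timer values can be pulled directly from the text outputs, for example:

grep 'Total Time' out_nanotube_h*_k*_p*
grep 'Total Time' out_C60_Pb100_h*_k*_p*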
How to run
==========
* The benchmarks do not need any special command line options and can be run
simply as, e.g.:
mpirun -np 256 gpaw-python carbone_nanotube.py
mpirun -np 512 gpaw-python C60_Pb100.py
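
On Slurm systems the runs are typically wrapped in a batch script; the sketch below is only an illustration (node count, cores per node and the assumption that gpaw-python is in the PATH must be adapted to your site):

#!/bin/bash
#SBATCH --job-name=gpaw-C60_Pb100
#SBATCH --nodes=32               # e.g. 32 nodes x 16 cores = 512 cores
#SBATCH --ntasks-per-node=16
#SBATCH --time=00:30:00

srun gpaw-python C60_Pb100.py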
Complete build instructions: http://manual.gromacs.org/documentation/
A typical build procedure looks like:
tar -zxf gromacs-2016.tar.gz
cd gromacs-2016
mkdir build
cd build
cmake \
-DCMAKE_INSTALL_PREFIX=$HOME/Packages/gromacs/2016 \
-DBUILD_SHARED_LIBS=off \
-DBUILD_TESTING=off \
-DREGRESSIONTEST_DOWNLOAD=OFF \
-DCMAKE_C_COMPILER=`which mpicc` \
-DCMAKE_CXX_COMPILER=`which mpicxx` \
-DGMX_BUILD_OWN_FFTW=on \
-DGMX_SIMD=AVX2_256 \
-DGMX_DOUBLE=off \
-DGMX_EXTERNAL_BLAS=off \
-DGMX_EXTERNAL_LAPACK=off \
-DGMX_FFT_LIBRARY=fftw3 \
-DGMX_GPU=off \
-DGMX_MPI=on \
-DGMX_OPENMP=on \
-DGMX_X11=off \
..
make (or make -j ##)
make install
You probably need to adjust:
1. CMAKE_INSTALL_PREFIX, to point to a different installation path.
2. GMX_SIMD : you may omit this completely if your compile and compute nodes are of the same architecture (for example Haswell).
If they are different, you should specify what fits your compute nodes.
For a complete and up-to-date list of possible choices, refer to the official GROMACS build instructions.
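
As a rough check of the build, the version banner of the installed binary reports the SIMD level it was compiled for; in an MPI build the binary is typically called gmx_mpi (adjust the name if your build used a different suffix, and launch it through your MPI wrapper if your system requires that even for --version):

$HOME/Packages/gromacs/2016/bin/gmx_mpi --version | grep -i simd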
There are two data sets in the UEABS for GROMACS:
1. ion_channel, which uses PME for electrostatics, for Tier-1 systems.
2. lignocellulose-rf, which uses a reaction field for electrostatics, for Tier-0 systems. Reference : http://pubs.acs.org/doi/abs/10.1021/bm400442n
The input data file for each benchmark is the corresponding .tpr file, produced using
tools from a complete GROMACS installation and a series of ASCII data files
(atom coordinates/velocities, force field, run control).
If you happen to run the Tier-0 case on BG/Q, use lignucellulose-rf.BGQ.tpr
instead of lignocellulose-rf.tpr. It is the same as lignocellulose-rf.tpr,
but created on a BG/Q system.
The general way to run the GROMACS benchmarks is:
WRAPPER WRAPPER_OPTIONS PATH_TO_GMX mdrun -s CASENAME.tpr -maxh 0.50 -resethway -noconfout -nsteps 10000 -g logfile
CASENAME is one of ion_channel or lignocellulose-rf.
maxh : terminate after 0.99 times this time (in hours), i.e. gracefully terminate after ~30 min.
resethway : reset the timer counters halfway through the run. This means that the reported
walltime and performance refer to the last
half of the simulation steps.
noconfout : do not save output coordinates/velocities at the end.
nsteps : run this number of steps, no matter what is requested in the input file.
logfile : the output filename. If the .log extension is omitted,
it is appended automatically. Obviously, it should be different
for different runs.
WRAPPER and WRAPPER_OPTIONS depend on the system, batch system, etc.
A few common pairs are:
CRAY : aprun -n TASKS -N TASKSPERNODE -d THREADSPERTASK
Curie : ccc_mrun with no options - obtained from the batch system
Juqueen : runjob --np TASKS --ranks-per-node TASKSPERNODE --exp-env OMP_NUM_THREADS
Slurm : srun with no options, obtained from Slurm if the variables below are set:
#SBATCH --nodes=NODES
#SBATCH --ntasks-per-node=TASKSPERNODE
#SBATCH --cpus-per-task=THREADSPERTASK
The best performance is usually obtained using pure MPI, i.e. THREADSPERTASK=1.
You can also check other hybrid MPI/OpenMP combinations.
The execution time is reported at the end of the logfile: grep Time: logfile | awk -F ' ' '{print $3}'
NOTE : this is the wall time for the last half of the steps.
For sufficiently large nsteps, this is half of the total wall time.
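
For reference, a minimal Slurm job script for the Tier-1 case could look like the sketch below; the node count, cores per node and installation path are assumptions that need to be adapted:

#!/bin/bash
#SBATCH --job-name=gmx-ion_channel
#SBATCH --nodes=8
#SBATCH --ntasks-per-node=16     # pure MPI, THREADSPERTASK=1
#SBATCH --cpus-per-task=1
#SBATCH --time=00:40:00

GMX=$HOME/Packages/gromacs/2016/bin/gmx_mpi   # assumed binary name and install path
srun $GMX mdrun -s ion_channel.tpr -maxh 0.50 -resethway -noconfout -nsteps 10000 -g logfile

# extract the reported execution time (the -g option appends .log if omitted)
grep Time: logfile.log | awk -F ' ' '{print $3}'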
Build instructions for NAMD.
To run the benchmarks, the memopt build with SMP support is mandatory.
NAMD may be compiled in an experimental memory-optimized mode that uses a compressed version of the molecular structure and also supports parallel I/O.
In addition to reducing per-node memory requirements, the compressed structure greatly reduces startup times compared to reading a psf file.
To build this version, your MPI library must provide the thread support level MPI_THREAD_FUNNELED.
You need NAMD version 2.11 or newer.
1. Uncompress/untar the source.
2. cd to the NAMD source base directory (the directory name depends on how the source was obtained,
typically namd2 or NAMD_2.11_Source).
3. Untar the charm-VERSION.tar that is included. If you obtained the NAMD source via
CVS, you need to download charm separately.
4. cd to the charm-VERSION directory.
5. Configure and compile charm++:
This step is system dependent. Some examples are:
CRAY XE6 : ./build charm++ mpi-crayxe smp --with-production -O -DCMK_OPTIMIZE
CURIE : ./build charm++ mpi-linux-x86_64 smp mpicxx ifort --with-production -O -DCMK_OPTIMIZE
JUQUEEN : ./build charm++ mpi-bluegeneq smp xlc --with-production -O -DCMK_OPTIMIZE
Help : run ./build --help to see all available options.
For special notes on various systems, see http://www.ks.uiuc.edu/Research/namd/2.11/notes.html.
The syntax is : ./build charm++ ARCHITECTURE smp (compilers, optional) --with-production -O -DCMK_OPTIMIZE
You can find a list of supported architectures/compilers in charm-VERSION/src/arch.
The smp option is mandatory to build the hybrid version of NAMD.
This builds charm++.
6. cd ..
7. Configure NAMD.
This step is system dependent. Some examples are :
CRAY-XE6 ./config CRAY-XT-g++ --charm-base ./charm-6.7.0 --charm-arch mpi-crayxe-smp --with-fftw3 --fftw-prefix $CRAY_FFTW_DIR --without-tcl --with-memopt --charm-opts -verbose
CURIE ./config Linux-x86_64-icc --charm-base ./charm-6.7.0 --charm-arch mpi-linux-x86_64-ifort-smp-mpicxx --with-fftw3 --fftw-prefix PATH_TO_FFTW3_INSTALLATION --without-tcl --with-memopt --charm-opts -verbose --cxx-opts "-O3 -xAVX " --cc-opts "-O3 -xAVX" --cxx icpc --cc icc --cxx-noalias-opts "-fno-alias -ip -fno-rtti -no-vec "
Juqueen: ./config BlueGeneQ-MPI-xlC --charm-base ./charm-6.7.0 --charm-arch mpi-bluegeneq-smp-xlc --with-fftw3 --with-fftw-prefix PATH_TO_FFTW3_INSTALLATION --without-tcl --charm-opts -verbose --with-memopt
Help : run ./config --help to see all available options.
See http://www.ks.uiuc.edu/Research/namd/2.11/notes.html for special notes on various systems.
What is absolutely necessary is the --with-memopt option together with an SMP-enabled charm++ build.
It is suggested to disable Tcl support, as indicated by the --without-tcl flag, since Tcl is not necessary
to run the benchmarks.
You need to specify the FFTW3 installation directory. On systems that
use environment modules you need to load the existing fftw3 module
and probably use the provided environment variables, as in the CRAY-XE6
example above.
If the FFTW3 libraries are not installed on your system,
download and install fftw-3.3.5.tar.gz from http://www.fftw.org/.
You may adjust the compilers and compiler flags as in the CURIE example.
A typical compiler/flag adjustment is, for example,
to add -xAVX in the CURIE case and keep all the other compiler flags of the architecture the same.
Take care with, or simply avoid, using the --cxx option of the NAMD config without reason,
as this will override the compilation flags from the arch file.
When config finishes, it prompts you to change to a directory and run make.
8. cd to the reported directory and run make.
If everything is OK, you will find an executable named namd2 in this
directory.
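
As an illustration only, the steps above condensed into commands for a generic Linux x86_64 cluster with MPI and GNU compilers might look roughly like this; the tarball name, charm/NAMD arch names and the FFTW path are assumptions to be adapted to your system:

tar xzf NAMD_2.11_Source.tar.gz && cd NAMD_2.11_Source
tar xf charm-6.7.0.tar
cd charm-6.7.0
./build charm++ mpi-linux-x86_64 smp --with-production -O -DCMK_OPTIMIZE
cd ..
./config Linux-x86_64-g++ --charm-base ./charm-6.7.0 --charm-arch mpi-linux-x86_64-smp \
  --with-fftw3 --fftw-prefix PATH_TO_FFTW3_INSTALLATION --without-tcl --with-memopt \
  --charm-opts -verbose
cd Linux-x86_64-g++ && make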
Run instructions for NAMD.
ntell@grnet.gr
After building NAMD you have an executable called namd2.
The best performance and scaling of NAMD are achieved using the
hybrid MPI/multithreaded version. On a system with NC cores per node,
use 1 MPI task per node and NC threads per task;
for example, on a 20 cores/node system use 1 MPI process and
set OMP_NUM_THREADS, or any batch-system-related variable, to 20.
Set a variable, for example MYPPN, to NC-1,
for example 19 for a 20 cores/node system.
You can also try other TASKSPERNODE/THREADSPERTASK combinations.
The control file is stmv.8M.memopt.namd for Tier-1 and stmv.28M.memopt.namd
for Tier-0 systems.
The general way to run is:
WRAPPER WRAPPER_OPTIONS PATH_TO_namd2 +ppn $MYPPN stmv.8M.memopt.namd > logfile
WRAPPER and WRAPPER_OPTIONS depend on the system, batch system, etc.
A few common pairs are:
CRAY : aprun -n TASKS -N TASKSPERNODE -d THREADSPERTASK
Curie : ccc_mrun with no options - obtained from the batch system
Juqueen : runjob --np TASKS --ranks-per-node TASKSPERNODE --exp-env OMP_NUM_THREADS
Slurm : srun with no options, obtained from Slurm if the variables below are set:
#SBATCH --nodes=NODES
#SBATCH --ntasks-per-node=TASKSPERNODE
#SBATCH --cpus-per-task=THREADSPERTASK
The run walltime is reported at the end of the logfile: grep WallClock: logfile | awk -F ' ' '{print $2}'
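
As an example only, a Slurm job for the Tier-1 case on a hypothetical 20-cores-per-node system could look like the sketch below (node count and paths are assumptions):

#!/bin/bash
#SBATCH --job-name=namd-stmv8M
#SBATCH --nodes=64
#SBATCH --ntasks-per-node=1      # 1 MPI task per node
#SBATCH --cpus-per-task=20       # NC threads per task
#SBATCH --time=01:00:00

MYPPN=19                         # NC-1 worker threads per task
srun PATH_TO_namd2 +ppn $MYPPN stmv.8M.memopt.namd > logfile

# extract the reported walltime
grep WallClock: logfile | awk -F ' ' '{print $2}'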
NEMO_Build_README
Written on 2013-09-12, as a product of the NEMO benchmark in PRACE-2IP WP7.4.
Written by Soon-Heum "Jeff" Ko at Linkoping University, Sweden (sko@nsc.liu.se).
0. Before you start
- Download the two tarball files (src and input) from the PRACE benchmark site.
- Create a directory 'ORCA12_PRACE' and untar the above-mentioned files under that directory. The directory structure will then be:
----- ORCA12_PRACE
|
|------ DATA_CONFIG_ORCA12/
|
|------ FORCING/
|
|------ NEMOGCM/
|
|------ README
|
|------ Instruction_for_JuBE.tar.gz
1. Build-up of the standalone version
- You can find an easy how-to in the ORCA12_PRACE/README file, an instruction document written after the PRACE-1IP contribution. To repeat the instructions:
1) cd NEMOGCM/ARCH
2) Create an arch-COMPUTER.fcm file in NEMOGCM/ARCH corresponding to your needs (see the sketch after this list). You can refer to 'arch-ifort_linux_curie.fcm', which is tuned for the CURIE x86_64 system.
3) cd NEMOGCM/CONFIG
4) ./makenemo -n ORCA12.L75-PRACE -m COMPUTER
A subdirectory 'ORCA12.L75-PRACE' will then be created.
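
The fragment below is only a rough sketch of the kind of entries an arch-COMPUTER.fcm file contains; the key names and flags are compiler and site specific, so copy them from an existing file such as 'arch-ifort_linux_curie.fcm' rather than from this sketch:

%NCDF_INC    -I/path/to/netcdf/include
%NCDF_LIB    -L/path/to/netcdf/lib -lnetcdff -lnetcdf
%FC          mpif90
%FCFLAGS     -O3
%LD          mpif90
%LDFLAGS
%FPPFLAGS    -P -C -traditional
%AR          ar
%ARFLAGS     rs
%MK          gmake
%USER_INC    %NCDF_INC
%USER_LIB    %NCDF_LIB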
2. Build-up under the JuBE benchmark framework
- You should first download the JuBE benchmark suite and the PRACE benchmark applications from the PRACE SVN. You will then find the 'nemo' benchmark under PABS/applications. Because the old nemo benchmark set was poorly written and there have been changes to the NEMO source, we provide the benchmark setup for the current NEMO version in a separate tarball (Instruction_for_JuBE.tar.gz). You can follow the instructions given there for installing and running NEMO v3.4 in the JuBE benchmark suite.
NEMO_Run_README
Written on 2013-09-12, as a product of the NEMO benchmark in PRACE-2IP WP7.4.
Written by Soon-Heum "Jeff" Ko at Linkoping University, Sweden (sko@nsc.liu.se).
0. Before you start
- Follow the instructions in 'NEMO_Build_README.txt' so that you have the directory structure specified below, along with the compiled binary:
----- ORCA12_PRACE
|
|------ DATA_CONFIG_ORCA12/
|
|------ FORCING/
|
|------ NEMOGCM/
|
|------ README
|
|------ Instruction_for_JuBE.tar.gz
1. Running the standalone version
- After compilation, you will have an 'ORCA12.L75-PRACE' directory created under NEMOGCM/CONFIG.
1) cd ORCA12.L75-PRACE/EXP00
2) Link to the datasets:
$ ln -s ../../../../DATA_CONFIG_ORCA12/* .
$ ln -s ../../../../FORCING/* .
3) Locate the 'namelist' and 'namelist_ice' files in this directory and edit them.
4) Run it. It does not need any special command line arguments, so you can simply type 'mpirun opa' (see the batch sketch after this list).
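
In a batch environment the same run is usually launched through the site's MPI wrapper; a minimal sketch (the task count is an example and should be consistent with the domain decomposition set in the namelist):

cd ORCA12.L75-PRACE/EXP00
mpirun -np 512 ./opa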
2. Running under the JuBE benchmark framework
- You can prepare your own XML file that covers everything from compilation to running in one go. The file 'ORCA_PRACE_CURIE.xml' inside Instruction_for_JuBE.tar.gz can be used as an example. One remark for CURIE users: you must specify your project ID and which type of queue (standard, large, ...) you are going to use. That information can be found with the 'ccc_myproject' command.
NEMO 3.6, GYRE configuration
============================
juha.lento@csc.fi, 2016-05-16
Build and test documentation for NEMO 3.6 in the GYRE
configuration. The example commands were tested on CSC's Cray XC40,
`sisu.csc.fi`.
Download NEMO and XIOS sources
------------------------------
### Register
http://www.nemo-ocean.eu
### Check out NEMO sources
```
svn --username USERNAME --password PASSWORD --no-auth-cache co http://forge.ipsl.jussieu.fr/nemo/svn/branches/2015/nemo_v3_6_STABLE/NEMOGCM
...
Checked out revision 6542.
```
### Check out XIOS2 sources
http://www.nemo-ocean.eu/Using-NEMO/User-Guides/Basics/XIOS-IO-server-installation-and-use
```
svn co -r819 http://forge.ipsl.jussieu.fr/ioserver/svn/XIOS/trunk xios-2.0
```
Build XIOS
----------
### Build environment
XIOS requires NetCDF4.
```
module load cray-hdf5-parallel cray-netcdf-hdf5parallel
```
### Build command
http://forge.ipsl.jussieu.fr/ioserver/wiki/documentation
```
cd xios-2.0
./make_xios --job 8 --arch XC30_Cray
```
Note: the build may need to be rerun without `--job 8`; the test suite appears to be broken, but the library itself gets built.
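
To confirm that the library was produced despite the test-suite problems (path assumed relative to the `xios-2.0` directory):

```
ls lib/libxios.a
```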
Build NEMO 3.6 in GYRE configuration
------------------------------------
### Get a bash helper for editing configuration files
```
source <(curl -s https://raw.githubusercontent.com/jlento/nemo/master/fixfcm.bash)
```
...or if you have a buggy bash 3.2...
```
wget https://raw.githubusercontent.com/jlento/nemo/master/fixfcm.bash; source fixfcm.bash
```
### Edit (create) configuration files
```
cd ../NEMOGCM/CONFIG
fixfcm <