Victor Cameo Ponz / ueabs, commit 03db1b42
Authored Sep 29, 2017 by Victor
NEW mom for telcon on 29 of september
Parent: d271aeaf
doc/sphinx/mom_telcon/2017-09-12.rst
0 → 100644
TelCon 12 of September
======================
Apologies:
----------
- Jacob Finkenrath (CyI)
- Ricard Borrell (BSW)
- Volker Weinberg (LRZ)
- Dimitris Dellis (GRNET)
- Valeriu Codreanu (SurfSARA)
- Charles Moulinec (STFC)
Present:
--------
- Andrew Emerson (CINECA)
- Arno Proeme (EPCC)
- Dimitris Dellis (GRNET)
- Luigi Iapichino (LRZ)
- Mariusz Uchronski (WCNS/PSNC)
- Martti Louhivuori (CSC)
- Sebastian Lührs (JUELICH)
- Victor Cameo Ponz (CINES)
Minutes of meeting:
-------------------
Update on machine availability
******************************
- Atos/KNL available, energy software stack not ready yet
- E4/GPU should be available by mid-September. No news since the beginning of September.
- Maxeler/FPGA: correction to the last update: the machine should be available by
  mid-October, NOT mid-September. This seriously compromises running on this machine and
  confirms that we should focus on the two others.
Login procedures
****************
- Atos/KNL: WIP, almost all BCOs have sent their applications and accounts are opened (about 48h delay).
- E4/GPU: I'm waiting for input from Carlo Cavazzoni.
- Maxeler/FPGA: BCOs will have to register through an online procedure and then send the
  public part of an SSH key to Dirk Pleiter.
Basics for energy measurement on BULL/Atos KNL machine
******************************************************
There are 2 kinds of energy metrics that can be collected on the Atos machine.

1. Bull Energy Optimiser is an admin-oriented tool that provides energy metrics at switch and node
   level. This tool won't be operated by end users; metrics will be available through a
   wrapper not defined yet (slurm accounting, or terminal command). It will give an overview metric of a job.
2. HDEEVIZ framework: provides metrics at a finer level, i.e. DRAM, CPU and IO. To use it we'll just have to load
   a module corresponding to the MPI library used to compile and add a command at the beginning of the mpirun
   line. Access to the metrics will then be through a Grafana web interface.
Figures that will be included in the final deliverable (performances & energy related)
**************************************************************************************
- Performance relative to the peak performance allocated to the run, i.e. theoretical FLOPS for the allocated
  number of GPUs and/or CPUs
- Overall power consumed by each test case at a high level, i.e. at node and switch level. This is the best metric
  we can imagine having in common on all the clusters.
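
As an illustration of the first figure, the ratio of measured performance to the theoretical peak allocated to a run could be computed as in the sketch below. The function name and the numbers are hypothetical placeholders, not values from any of the machines discussed.

```python
def fraction_of_peak(measured_gflops, peak_gflops_per_device, n_devices):
    """Return measured performance as a fraction of the allocated theoretical peak."""
    if n_devices <= 0 or peak_gflops_per_device <= 0:
        raise ValueError("device count and per-device peak must be positive")
    # Theoretical peak of the whole allocation (all GPUs and/or CPUs of the run).
    allocated_peak = peak_gflops_per_device * n_devices
    return measured_gflops / allocated_peak


# Hypothetical example: 4 devices at 3000 GFLOP/s peak each, 2400 GFLOP/s measured.
print(f"{fraction_of_peak(2400.0, 3000.0, 4):.1%}")  # prints 20.0%
```

The per-device peak would have to be taken from the vendor specification of each cluster, so that the figure stays comparable across KNL, GPU and FPGA runs.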
Questions, concerns and report from BCOs
****************************************
Everything about login opening, deadlines, expected work during the next period, others:

+------------------------+-------------------------------+
| ALYA                   | KNL form sent, waiting for    |
|                        | effort confirmation           |
+------------------------+-------------------------------+
| Code_Saturne           | Connection should be OK       |
+------------------------+-------------------------------+
| CP2K                   | KNL access OK, WIP building   |
|                        | and running                   |
+------------------------+-------------------------------+
| GADGET                 | Waiting for KNL access        |
+------------------------+-------------------------------+
| GPAW                   | KNL access OK                 |
+------------------------+-------------------------------+
| GROMACS                | KNL access OK, compiled,      |
|                        | beginning runs                |
+------------------------+-------------------------------+
| NAMD                   | KNL access OK, compiled,      |
|                        | beginning runs                |
+------------------------+-------------------------------+
| NEMO                   | Focusing on CP2K for now      |
+------------------------+-------------------------------+
| PFARM                  | KNL connection broken         |
+------------------------+-------------------------------+
| QCD                    | Made a first test with QCD    |
|                        | part 1; could compile and     |
|                        | run it on KNL.                |
+------------------------+-------------------------------+
| Quantum Espresso       | GPU access OK: compiled with  |
|                        | CUDA Fortran. KNL access OK   |
|                        | and started compilation       |
+------------------------+-------------------------------+
| SHOC                   | Waiting for GPU access.       |
|                        | Planning FPGA/KNL port of 2/3 |
|                        | important kernels if time     |
|                        | allows                        |
+------------------------+-------------------------------+
| Specfem3D_Globe        | Access to KNL OK. Focusing on |
|                        | lead                          |
+------------------------+-------------------------------+

**General questions**
- KNL interconnect (the same on PCP and Frioul): InfiniBand EDR 4x
AoB - Date of next meeting
**************************
- The next telcon will be at the end of October
doc/sphinx/mom_telcon/2017-09-29.rst
0 → 100644
TelCon 28 of September
======================
On the primary PRACE number:
+49 30 2541080 (Passcode: 97919065#)
Apologies:
----------
- Andrew Emerson (CINECA)
- Arno Proeme (EPCC)
- Charles Moulinec (STFC)
- Luigi Iapichino (LRZ)
- Volker Weinberg (LRZ)
Present:
--------
- Martti Louhivuori (CSC)
- Mariusz Uchronski (WCNS/PSNC)
- Jacob Finkenrath (CyI)
- Valeriu Codreanu (SurfSARA)
- Dimitris Dellis (GRNET)
- Victor Cameo Ponz (CINES)
Minutes of meeting:
-------------------
Update on machine availability
******************************
- E4/GPU: the racks are physically at CINECA, opening is planned around 13 October
- No change for now for the FPGA cluster: available mid-October
Login procedures
****************
- E4/GPU: Still no news from Carlo Cavazzoni, but Victor Cameo Ponz should see
  him in person next week
Basics for energy measurement on BULL/Atos KNL machine
******************************************************
Update on BEO, which is now available:

- Generate an RSA key and send the public part to svp@cines.fr.
- You will be able to call BEO through SSH and get basic job energy reports.
- Please consult the more detailed documentation on the machine itself:
  ``/opt/software/frioul/documentation/beo_usage.txt``
- You will also find complete user guides for HDEEVIZ (visualisation part not available yet) and
  BEO in the same directory
Figures that will be included in the final deliverable (performances & energy related)
**************************************************************************************
- It could be interesting, if possible and meaningful, to run the same simulation on
  different sets of nodes to get some kind of scalability picture for the power figures.
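
One way to present such a comparison is energy to solution per node count, relative to the smallest run. The sketch below assumes that a mean power draw and a runtime are available for each run; the function names and the input layout are hypothetical and do not reflect the output format of BEO or HDEEVIZ.

```python
def energy_kwh(mean_power_watts, runtime_seconds):
    """Energy to solution in kWh from mean power (W) and runtime (s)."""
    return mean_power_watts * runtime_seconds / 3.6e6


def energy_scaling(runs):
    """Compare energy to solution across node counts.

    runs: list of (nodes, runtime_seconds, mean_power_watts) tuples.
    Returns a list of (nodes, energy_kwh, ratio_to_smallest_run),
    ordered by node count.
    """
    ordered = sorted(runs)
    # The run on the fewest nodes serves as the baseline.
    base = energy_kwh(ordered[0][2], ordered[0][1])
    return [(n, energy_kwh(p, t), energy_kwh(p, t) / base) for n, t, p in ordered]


# Hypothetical measurements: doubling the nodes halves the runtime here.
for nodes, kwh, ratio in energy_scaling([(1, 3600, 1000), (2, 1800, 2000)]):
    print(nodes, kwh, ratio)
```

A ratio above 1 means the larger run costs more energy for the same simulation, which is the kind of trend this figure would expose.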
Questions, concerns and report from BCOs
****************************************
Everything about login opening, deadlines, expected work during the next period, others:

+------------------------+-----------------------------------------------------------+
| Code name              | 4IP-extension BCO                                         |
+========================+===========================================================+
| Code_Saturne           | Ran and analysed results on KNL/Frioul; this work is not  |
|                        | lost since files are shared. Should run directly on PCP   |
|                        | since there should be no differences. Runs will begin on  |
|                        | PCP within 10 days.                                       |
+------------------------+-----------------------------------------------------------+
| CP2K                   | KNL: built essential CP2K libraries, now building         |
|                        | CP2K itself                                               |
+------------------------+-----------------------------------------------------------+
| GADGET                 | No updates                                                |
+------------------------+-----------------------------------------------------------+
| GPAW                   | Login OK, ready to start                                  |
+------------------------+-----------------------------------------------------------+
| GROMACS                | First runs have been done. Since 13 September could       |
+------------------------+ not submit, but this is now over and the machine is       +
| NAMD                   | available again                                           |
+------------------------+-----------------------------------------------------------+
| NEMO                   | KNL: building NetCDF and XIOS before building NEMO itself |
+------------------------+-----------------------------------------------------------+
| PFARM                  | WIP on code compilation                                   |
+------------------------+-----------------------------------------------------------+
| QCD                    | Part 1 WIP                                                |
+------------------------+-----------------------------------------------------------+
| Quantum Espresso       | Nothing recent                                            |
+------------------------+-----------------------------------------------------------+
| SHOC                   | Will soon detail the kernels to be ported                 |
+------------------------+-----------------------------------------------------------+
| Specfem3D_Globe        | WIP on compilation. Previously used sources modified by   |
|                        | Intel, but since they were not released publicly they     |
|                        | won't be used anymore (or only as a point of comparison)  |
+------------------------+-----------------------------------------------------------+

**General questions**
AoB - Date of next meeting
****************************
- 12 of October, 11:00 CEST, primary PRACE number
doc/sphinx/mom_telcon/index.rst
0 → 100644
.. UEABS for accelerators documentation master file, created by
   sphinx-quickstart on Wed Jun 7 19:01:00 2017.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Minutes of meeting for 4IP extension TelCon
===========================================

.. toctree::
   :maxdepth: 1
   :glob:

   mom_telcon*