TelCon 12 September 2017
========================

Apologies:
----------
 - Jacob Finkenrath (CyI)
 - Ricard Borrell (BSW)
 - Volker Weinberg (LRZ)
 - Dimitris Dellis (GRNET)
 - Valeriu Codreanu (SurfSARA)
 - Charles Moulinec (STFC)

Present:
--------
 - Andrew Emerson (CINECA)
 - Arno Proeme (EPCC)
 - Dimitris Dellis (GRNET)
 - Luigi Iapichino (LRZ)
 - Mariusz Uchronski (WCNS/PSNC)
 - Martti Louhivuori (CSC)
 - Sebastian Lührs (JUELICH)
 - Victor Cameo Ponz (CINES)


Minutes of meeting:
-------------------

Update on machine availability
******************************
 - Atos/KNL available, energy software stack not ready yet
 - E4/GPU should be available by mid-September. No news since the beginning of September.
 - Maxeler/FPGA: correction to the last update: the machine should be available by
   mid-October, NOT mid-September. This severely limits the runs possible on this machine
   and confirms that we should focus on the other two.

Login procedures
****************
 - Atos/KNL: WIP. Almost all BCOs have sent their applications and accounts are being opened (about 48h delay).
 - E4/GPU: I'm waiting for input from Carlo Cavazzoni.
 - Maxeler/FPGA: BCOs will have to register through an online procedure and then send the
   public part of an SSH key to Dirk Pleiter.


Basics for energy measurement on BULL/Atos KNL machine
******************************************************

There are two kinds of energy metrics that can be collected on the Atos machine.

1. Bull Energy Optimiser: an admin-oriented tool that provides energy metrics at switch and node
   level. This tool won't be operated by end users; metrics will be available through a
   wrapper that is not defined yet (Slurm accounting or a terminal command). It will give an
   overview metric of a job.

2. HDEEVIZ framework: provides metrics at a finer level, i.e. DRAM, CPU and IO. To use it, we will
   just have to load a module corresponding to the MPI library used to compile and add a command
   at the beginning of the mpirun line. Access to the metrics will then be through a Grafana
   web interface.

Figures that will be included in the final deliverable (performance & energy related)
**************************************************************************************

 - Performance relative to the peak performance allocated to the run, i.e. theoretical FLOPS for
   the allocated number of GPUs and/or CPUs.
 - Overall power consumed by each test case at a high level, i.e. at node and switch level. This
   is the best metric we can expect to have in common across all the clusters.
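
As an illustration only, the two figures above could be computed along the following lines.
This is a minimal sketch: the function names, the units and the example numbers are hypothetical
and were not agreed in the meeting. Converting mean power into energy is an extra step assumed
here for convenience, not something the minutes require.

```python
def fraction_of_peak(achieved_gflops, peak_gflops_per_device, n_devices):
    """Achieved performance as a fraction of the theoretical peak
    of the devices (GPUs and/or CPUs) allocated to the run."""
    allocated_peak = peak_gflops_per_device * n_devices
    if allocated_peak <= 0:
        raise ValueError("allocated peak performance must be positive")
    return achieved_gflops / allocated_peak


def energy_kwh(mean_power_watts, runtime_seconds):
    """Energy used by a job, from its mean node+switch power and runtime.
    1 kWh = 3.6e6 J."""
    return mean_power_watts * runtime_seconds / 3.6e6


# Hypothetical numbers, for illustration only.
ratio = fraction_of_peak(achieved_gflops=1200, peak_gflops_per_device=3000, n_devices=4)
energy = energy_kwh(mean_power_watts=500, runtime_seconds=7200)
print(f"fraction of peak: {ratio:.2f}, energy: {energy:.2f} kWh")
```

Reporting the ratio rather than raw FLOPS keeps runs on different device counts comparable.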

Questions, concerns and report from BCOs
****************************************
Everything about login opening, deadlines, expected work during the next period, others:

   +------------------------+-------------------------------+
   | ALYA                   | KNL form sent, waiting for    |
   |                        | effort confirmation           |
   +------------------------+-------------------------------+
   | Code_Saturne           | Connection should be OK       |
   +------------------------+-------------------------------+
   | CP2K                   |  KNL access OK, WIP building  |
   |                        |  and running                  |
   +------------------------+-------------------------------+
   | GADGET                 |  waiting for KNL access       |
   +------------------------+-------------------------------+
   | GPAW                   | KNL access OK                 |
   +------------------------+-------------------------------+
   | GROMACS                | KNL access OK, compiled,      |
   |                        | beginning runs                |
   +------------------------+-------------------------------+
   | NAMD                   | KNL access OK, compiled,      |
   |                        | beginning runs                |
   +------------------------+-------------------------------+
   | NEMO                   | Focusing on CP2K for now      |
   +------------------------+-------------------------------+
   | PFARM                  | KNL connection broken         |
   +------------------------+-------------------------------+
   | QCD                    | Made a first test with QCD    |
   |                        | part 1; could compile and run |
   |                        | it on KNL.                    |
   +------------------------+-------------------------------+
   | Quantum Espresso       | GPU access OK: compiled with  |
   |                        | CUDA Fortran. KNL access OK   |
   |                        | and started compilation       |
   +------------------------+-------------------------------+
   | SHOC                   | Waiting for GPU access.       |
   |                        | Planning FPGA/KNL port of 2/3 |
   |                        | important kernels if time     |
   |                        | allows                        |
   +------------------------+-------------------------------+
   | Specfem3D_Globe        | Access to KNL OK. Focusing on |
   |                        | lead                          |
   +------------------------+-------------------------------+

**General questions**
 - KNL interconnect (the same on PCP and Frioul): InfiniBand EDR 4x


AoB - Date of next meeting
**************************
 - The next telcon will be at the end of October.