Xeon Phi
^^^^^^^^

This machine was designed by `Atos/Bull`_ and is hosted at CINES_ in Montpellier, France. It is made of 76 Bull Sequana X1210 blades, each including 3 Xeon Phi KNL nodes. It has a total theoretical peak performance of 465 Tflop/s with an estimated power consumption of 42 kW.

    To access the machine, BCOs should fill in the `GENCI login opening form`_.
    Use the following information to fill in the project-related fields:

     - project outside DARI
     - name of the person in charge of the project: Victor Cameo Ponz
     - phone number: +33 (0)4 67 14 14 03
     - project code: praceknl
     - scientific machine demanded: PCP KNL cluster

    Then send it back to `Victor Cameo Ponz`_.

Compute technology
""""""""""""""""""
The hardware features the following nodes:

 * 168 nodes with

   * 1x Intel Xeon Phi 7250 processor (KNL), 68 cores clocked at 1.4 GHz with 4-way SMT.
   * 96 GB memory, 6x 16 GB DDR4 DIMMs

 * internode communications integrated using InfiniBand EDR
 * 100% hot-water-cooled nodes
 * Half of the configuration features liquid-cooled Power Supply Units (PSUs), which makes this part of the machine 100% liquid cooled.

Each compute node has a theoretical peak performance of 2.765 TFlop/s (double precision) and a power consumption of less than 250 W.
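
As a quick sanity check, the per-node figures can be multiplied out to recover the machine-level numbers quoted in the introduction. The following is a minimal sketch that uses only values stated on this page; the variable names are illustrative and not part of any tool on the machine.

.. code-block:: shell

    # Per-node figures taken from this page (variable names are illustrative)
    nodes=168
    node_peak_tflops=2.765   # double precision, per node
    node_power_w=250         # upper bound, per node

    # Aggregate peak performance in Tflop/s (awk handles the floating point)
    total_peak=$(awk -v n="$nodes" -v p="$node_peak_tflops" 'BEGIN { printf "%.1f", n * p }')

    # Aggregate power upper bound in kW (integer arithmetic is enough here)
    total_power_kw=$(( nodes * node_power_w / 1000 ))

    echo "peak: ${total_peak} Tflop/s, power: ${total_power_kw} kW"

This gives 464.5 Tflop/s and 42 kW, consistent with the rounded 465 Tflop/s and 42 kW quoted for the whole machine.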

Energy sampling technology
""""""""""""""""""""""""""

Power measurements at node level are sampled at 1 kHz at the converters and at 100 Hz at CPU/DRAM. They are provided through an HDEEM FPGA on each node.
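
To give a feel for the data volume these rates imply, the sketch below counts the samples taken on one node, per sampled channel, over the length of a job. The 30-minute duration is only an illustrative assumption; the rates are the ones quoted above, and the variable names are hypothetical.

.. code-block:: shell

    duration_s=$(( 30 * 60 ))      # assumed job length: 30 minutes
    converter_rate_hz=1000         # 1 kHz at the converters
    cpu_dram_rate_hz=100           # 100 Hz at CPU/DRAM

    # Number of samples per channel over the whole job
    converter_samples=$(( duration_s * converter_rate_hz ))
    cpu_dram_samples=$(( duration_s * cpu_dram_rate_hz ))

    echo "converter samples per node: ${converter_samples}"
    echo "CPU/DRAM samples per node:  ${cpu_dram_samples}"

Multiply by the number of nodes in the job to estimate the total number of samples.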

`Atos/Bull`_ provides access to energy data through two frameworks, namely HDEEM VIZualization (HDEEVIZ) and Bull Energy Optimizer (BEO).

.. note::

    Specific setup documentation and instructions are available on the machine: :code:`ls /opt/software/frioul/documentation/`.


HDEEVIZ
-------

Components
 - SLURM synchronisation + initialisation
 - HDEEM writing results to local storage
 - Grafana: Graphical user interface

Here's an example of usage in a submission script:

.. code-block:: shell

    #!/bin/bash
    #SBATCH -N 2
    #SBATCH --time=00:30:00
    #SBATCH -J Specfem3D_Globe
    #SBATCH -n 89

    # Load the compiler/MPI stack and the matching HDEEVIZ module
    module load intel/17.2 intelmpi/2018.0.061
    module load hdeeviz/hdeeviz_intelmpi_2018.0.061

    # Prefix the MPI launch with hdeeviz
    hdeeviz mpirun -n 89 $PWD/bin/xspecfem3D

The generated data can then be accessed through the Grafana web interface:

.. image:: /pcp_systems/graphana.png

BEO
---

BEO is a system-administrator-oriented tool that provides energy metrics at switch and node level. At user level, the most useful feature is the :code:`get_job_energy slurm<job_id<optional: .jobstep>>` command. It produces the following output:

.. literalinclude:: /pcp_systems/output_beo_report_energy
   :emphasize-lines: 1

.. _GENCI login opening form: https://www-dcc.extra.cea.fr/CCFR/
.. _cines-login-form-odt: https://www.cines.fr/wp-content/uploads/2014/01/opening_renewal_login_2017.odt
.. _cines-login-form-rtf: https://www.cines.fr/wp-content/uploads/2014/01/opening_renewal_login_2017.rtf
.. _Atos/Bull: https://bull.com/
.. _CINES: https://www.cines.fr/
.. _Victor Cameo Ponz: cameo+4ip-extension@cines.fr