Commit da71c993 authored by Victor

IMPROVE machine section

parent 77d935db
.. _atos_knl:
Xeon Phi
^^^^^^^^
......
.. _e4_gpu:
Power8 + GPU
^^^^^^^^^^^^
This machine has been designed by `E4 computer engineering`_ and is hosted at CINECA_ in Bologna, Italy.
D.A.V.I.D.E has been designed by `E4 computer engineering`_ and is hosted at CINECA_ in Bologna, Italy. It has a total theoretical peak performance of 990 TFlops and an estimated power consumption of less than 90 kW. A more detailed description can be found on the `E4 dedicated webpage`_.
.. note:: In order to access the machine BCO should register on the `CINECA user datatabase`_ and ask `Victor Cameo Ponz`_ to be added to the 4IP-extension project.
.. note:: In order to access the machine BCO should send an email to `Victor Cameo Ponz`_ so that.
Compute technology
""""""""""""""""""
The hardware consists of fat nodes with the following design:
* 45 nodes with x2 IBM POWER8 processors and x4 NVIDIA P100 GPU
* 45 nodes with
* x2 IBM POWER8+ processors, i.e. 2x8 cores with 8-way Simultaneous Multi-Threading (SMT8)
* x4 NVIDIA P100 GPU with 16 GB High Bandwidth Memory 2 (HBM2)
* intranode communications integrated using NVLink
* extranode communications integrated using InfiniBand EDR interconnect
* CPU and GPU liquid cooling based on CoolIT_ solution
* extranode communications integrated using InfiniBand EDR interconnect in a fat-tree topology with no oversubscription
* CPU and GPU direct hot water (~27°C) cooling, removing 75-80% of the total heat
* remaining heat is air-cooled
Each compute node has a theoretical peak performance of 22 TFLOPS (double precision) and a power consumption of less than 2 kW.
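As a quick sanity check, the per-node figures above can be multiplied out to recover the system-level numbers. The sketch below only uses values quoted in this section; the variable names are illustrative, and the thread count assumes that SMT 8 means eight hardware threads per core.

```python
# Figures quoted in this section (names are illustrative only).
NODES = 45
NODE_PEAK_TFLOPS = 22      # double precision, per node
NODE_POWER_KW = 2          # upper bound, per node
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 8
SMT_WAYS = 8               # assumption: SMT 8 = 8 hardware threads per core
GPUS_PER_NODE = 4
HBM2_PER_GPU_GB = 16

system_peak_tflops = NODES * NODE_PEAK_TFLOPS
system_power_kw = NODES * NODE_POWER_KW
cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
threads_per_node = cores_per_node * SMT_WAYS
hbm2_per_node_gb = GPUS_PER_NODE * HBM2_PER_GPU_GB

print(f"System peak: {system_peak_tflops} TFLOPS")
print(f"System power bound: {system_power_kw} kW")
print(f"Per node: {cores_per_node} cores, {threads_per_node} hardware threads, "
      f"{hbm2_per_node_gb} GB HBM2")
```

The products agree with the 990 TFlops and 90 kW totals given in the machine description.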
Energy sampling technology
""""""""""""""""""""""""""
Information is collected from processors, memory, GPUs and fans using the Analog-to-Digital Converter in the embedded SoC. Sampling runs at up to 800 kHz and is lowered to 50 kHz on the power-measuring sensor outputs.
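Going from 800 kHz to 50 kHz corresponds to a reduction factor of 16. This section does not say how the rate is lowered; the sketch below assumes plain block averaging, which is one common approach, purely to illustrate the ratio. The function name and the sample values are hypothetical.

```python
RAW_RATE_HZ = 800_000
OUTPUT_RATE_HZ = 50_000
FACTOR = RAW_RATE_HZ // OUTPUT_RATE_HZ  # 16 raw samples per output sample


def decimate_by_mean(samples, factor):
    """Average consecutive blocks of `factor` samples.

    Assumption: block averaging stands in for whatever the real
    acquisition chain does. A trailing partial block is dropped.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    usable = len(samples) - len(samples) % factor
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


# Hypothetical power readings in watts: two blocks of 16 raw samples.
raw = [500.0] * 16 + [520.0] * 16
reduced = decimate_by_mean(raw, FACTOR)
print(FACTOR, reduced)
```

Block averaging keeps the mean power of each window, so an energy total integrated from the reduced stream matches one integrated from the raw stream over the same complete blocks.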
The technology has been developed in collaboration with the University of Bologna, which developed the :code:`get_job_energy <job_id>` program. Usage is straightforward and has the following verbose output:
.. literalinclude:: /output_get_job_energy
   :emphasize-lines: 1
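The report can also be post-processed. The sketch below is a hypothetical helper, not part of ``get_job_energy``: it parses the ``- label: value`` lines of the verbose output and checks that the reported total energy equals mean power times job duration. The embedded sample reuses three lines from the report for job 12389; the regular expression and function name are assumptions made for illustration.

```python
import math
import re

# Three lines copied from the verbose report for job 12389.
SAMPLE = """\
Job 12389
- Duration (seconds): 421.0
- Mean power (W): 536.402900943
- Total energy (J): 225825.621297
"""

# "- <label>: <number>" where the label itself contains no colon,
# so lines such as "- Start time: ...; End time: ..." are skipped.
FIELD_RE = re.compile(r"^\s*-\s*([^:]+):\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_fields(text):
    """Return {label: value}, keeping the first occurrence of each label."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match and match.group(1) not in fields:
            fields[match.group(1)] = float(match.group(2))
    return fields


fields = parse_fields(SAMPLE)
expected_energy = fields["Mean power (W)"] * fields["Duration (seconds)"]
consistent = math.isclose(expected_energy, fields["Total energy (J)"], rel_tol=1e-6)
print("energy consistent with mean power x duration:", consistent)
```

Because labels such as ``Mean power (W)`` appear in several sections of the full report, this helper keeps only the first occurrence; a fuller parser would need to track the section headers as well.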
.. _E4 computer engineering: https://www.e4company.com
.. _E4 dedicated webpage: https://www.e4company.com/en/?id=press&section=1&page=&new=davide_supercomputer
.. _CINECA: http://hpc.cineca.it/
.. _CINECA user datatabase: https://userdb.hpc.cineca.it/
.. _CoolIT: https://www.coolitsystems.com/
.. _Victor Cameo Ponz: cameo+4ip-extension@cines.fr
.. _maxeler_fpga:
FPGA
^^^^
......
$ get_job_energy 12389
Job 12389
- Duration (seconds): 421.0
- Used Node(s): davide20
- Requested CPUs: 16
- Start time: 2017-12-05 17:33:47; End time: 2017-12-05 17:40:48
(Negative values indicate problems in the job info collection - check back in half an hour)
<===============================================================>
Total nodes power consumption "at the plug". Integral of the
power consumed by each node sampled at 800KHz. BBB Measures
Cumulative (all nodes)
- Mean power (W): 536.402900943
- Total energy (J): 225825.621297
<--------------------------------------------------------------->
Node Average
- Mean node power (W): 536.402900943
- Total node energy (J): 225825.621297
<===============================================================>
AMESTER Power Measures of main components. Integral of the
power consumed by each component sampled at 4KHz :
Cumulative (all nodes)
- Mean power (W): 513.785714286
- Total energy (J): 216303.785714
- Mean FANs power (W): 27.0
- Total FANs energy (J): 11367.0
- Mean GPUs power (W): 107.047619048
- Total GPUs energy (J): 45067.0476192
- Mean CPU_0 processors power (W): 78.9761904762
- Total CPU_0 processors energy (J): 33248.9761905
- Mean CPU_1 processors power (W): 118.023809524
- Total CPU_1 processors energy (J): 49688.0238096
- Mean CPU_0 memories power (W): 137.0
- Total CPU_0 memories energy (J): 57677.0
- Mean CPU_1 memories power (W): 137.023809524
- Total CPU_1 memories energy (J): 57687.0238096
- Mean CPU_0 VCS0s VR power (W): 65.2380952381
- Total CPU_0 VCS0s VR energy (J): 27465.2380952
- Mean CPU_1 VCS0s VR power (W): 62.6666666667
- Total CPU_1 VCS0s VR energy (J): 26382.6666667
- Mean CPU_0 VDD0s VR power (W): 13.5952380952
- Total CPU_0 VDD0s VR energy (J): 5723.59523808
- Mean CPU_1 VDD0s VR power (W): 55.3333333333
- Total CPU_1 VDD0s VR energy (J): 23295.3333333
<--------------------------------------------------------------->
Node Average
- Mean node power (W): 513.785714286
- Total node energy (J): 216303.785714
- Mean FAN power (W): 27.0
- Total FAN energy (J): 11367.0
- Mean GPU power (W): 107.047619048
- Total GPU energy (J): 45067.0476192
- Mean CPU_0 processors power (W): 78.9761904762
- Total CPU_0 processors energy (J): 33248.9761905
- Mean CPU_1 processors power (W): 118.023809524
- Total CPU_1 processors energy (J): 49688.0238096
- Mean CPU_0 memories power (W): 137.0
- Total CPU_0 memories energy (J): 57677.0
- Mean CPU_1 memories power (W): 137.023809524
- Total CPU_1 memories energy (J): 57687.0238096
- Mean CPU_0 VCS0 VR power (W): 65.2380952381
- Total CPU_0 VCS0 VR energy (J): 27465.2380952
- Mean CPU_1 VCS0 VR power (W): 62.6666666667
- Total CPU_1 VCS0 VR energy (J): 26382.6666667
- Mean CPU_0 VDD0 VR power (W): 13.5952380952
- Total CPU_0 VDD0 VR energy (J): 5723.59523808
- Mean CPU_1 VDD0 VR power (W): 55.3333333333
- Total CPU_1 VDD0 VR energy (J): 23295.3333333
PCP systems
***********
.. _e4_gpu:
.. include:: /e4_gpu.rst
.. _atos_knl:
.. include:: /atos_knl.rst
.. _maxeler_fpga:
.. include:: /maxeler_fpga.rst