HEPiX Benchmarking Working Group

Europe/Zurich
31/S-023 (CERN)

Manfred Alef (Karlsruhe Institute of Technology (KIT)), Domenico Giordano (CERN), Michele Michelotto

Meeting Date: 7/4/2017
Attendees: D. Abdurachmanov, M. Alef, O. Awile, J-M. Barbet, A. Brosa, J. Flix, D. Giordano, V. Innocente, S. Jarp, M. Michelotto, F. Pantaleo, A. Perez-Calero, M. Reis, M. Rovere, A. Sciaba, M. Schulz

1) News (Domenico) 

  • See notes there. No further comments.

2) DB12 cpp (Domenico)

  • DB12 implemented in C++
    • The perf profile shows that the most heavily used components are the math libraries
    • Avoids measurements being affected by the version of the Python interpreter

    • Its performance is mostly unaffected by changes of CPU model (IB, HW, BW) and OS

      • DB12 python, by contrast, is affected

    • DB12 written in C++ is a factor of 10 faster than DB12.py

  • The relative scale factors of DB12 (python and C++) and KV across several CPU models were shown

    • The KV_speed trend seems to go in the opposite direction. To be verified.

  • Antonio suggests including DB12.cpp in the CMS pilot reports, to compare its results with those of DB12 python

    • D.G.: the C++ version was not written to replace the python version, but only to better understand the behaviour of DB12.py. Experiments are of course free to use it

    • It seems that the repository (gitlab) is not accessible

      • Update: now both the gitlab.cern.ch and github.com repositories are accessible
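
The relative scale factors mentioned above can be sketched as each CPU model's score divided by the score of a chosen reference model, computed separately per benchmark. This is only a minimal illustration under stated assumptions: the function name, the model labels and all numbers are hypothetical placeholders, not the values shown at the meeting.

```python
def relative_scale_factors(scores, reference):
    """Return each CPU model's score divided by the reference model's score."""
    ref = scores[reference]
    if ref <= 0:
        raise ValueError("reference score must be positive")
    return {model: value / ref for model, value in scores.items()}


# Placeholder numbers for illustration only; they are NOT measurements.
placeholder_scores = {
    "db12_py": {"model_A": 10.0, "model_B": 12.0},
    "db12_cpp": {"model_A": 100.0, "model_B": 105.0},
}

# One set of scale factors per benchmark, all relative to the same model.
factors = {
    name: relative_scale_factors(per_model, reference="model_A")
    for name, per_model in placeholder_scores.items()
}
```

Normalising every benchmark to the same reference model makes the trends comparable even when the absolute scores differ by a large factor, as they do between the python and C++ versions.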

3) Dissecting Benchmarks with perf (Vincenzo)

  • Effects of benchmarks that put more stress on the front-end or on the back-end of a CPU

    • HEP applications stall because of memory stalls or divisions (we do many)

    • HS06 uses a lot of memory, much more than Geant4 (and probably python), so if Haswell is much better than IvyB in the front-end component, you get an improvement that HS06 will not show

    • AVX (vector instructions) is hardware specific. An important side effect is that the VM has to know the real CPU model in order to profit from it

  • Differences were seen between running the benchmark suite in the same order and permuting the sequence

    • for instance, this can affect the L3 cache differently
    • NB: Vincenzo is not proposing the adoption of scimark, but highlights the fact that even this simple benchmark suite can give different results if the sequence is run in a different order
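
One way to check for such an ordering effect is to time the same set of kernels once for every permutation of the sequence and compare the per-kernel timings. The sketch below assumes hypothetical toy kernels and helper names. It is not scimark and not the setup used in the talk, and pure-Python kernels like these may well show no measurable cache effect.

```python
import itertools
import time


def time_suite(kernels, order):
    """Run the kernels in the given order; return per-kernel wall time in seconds."""
    timings = {}
    for name in order:
        start = time.perf_counter()
        kernels[name]()
        timings[name] = time.perf_counter() - start
    return timings


def compare_orders(kernels):
    """Time the suite once for every permutation of the kernel names."""
    return {
        order: time_suite(kernels, order)
        for order in itertools.permutations(sorted(kernels))
    }


# Hypothetical toy kernels; they stand in for real benchmark components.
def stream_kernel():
    data = list(range(200_000))
    return sum(data)


def division_kernel():
    total = 0.0
    for i in range(1, 100_000):
        total += 1.0 / i
    return total


toy_kernels = {"stream": stream_kernel, "division": division_kernel}
results = compare_orders(toy_kernels)
```

Each key of `results` is one ordering (a tuple of kernel names) and each value maps a kernel to its elapsed time in that ordering, so the same kernel can be compared across orderings. Repeating each ordering several times would be needed before drawing any conclusion.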

4) CMS report (Pepe)

  • Work is ongoing; updates will follow in the coming weeks. Pilots are running in production ES, and the CPU model is included in the report.

 
