HEPiX Benchmarking Working Group

Europe/Zurich
NB: No physical room for this meeting
Manfred Alef (Karlsruhe Institute of Technology (KIT)), Domenico Giordano (CERN), Michele Michelotto
    • 14:00 14:10
      News 10m
      Speakers: Domenico Giordano (CERN), Manfred Alef (Karlsruhe Institute of Technology (KIT)), Michele Michelotto
      • WLCG & HSF Workshop 2018 https://indico.cern.ch/event/658060/
        • 26-29 March 2018 Napoli, Italy
           
      • Spectre/Meltdown updates (from Vincent Brillault - IT security team)
        • Intel Microcode updates
          • Intel identified the root cause of the reboot issue
            • Only for Broadwell & Haswell
            • No news for Ivy Bridge, Sandy Bridge, Skylake and Kaby Lake…
          • Intel new recommendations:
            • stop deployment of current versions
              • as they may introduce higher than expected reboots and other unpredictable system behavior
          • [Industry] test new microcode version 
        • Spectre v2: IBRS vs Retpoline 
          • IBRS/IBPB (Intel proposal):
            • Requires new microcode (new MSR capabilities)
            • Merged by Red Hat in their latest kernel
          • “retpoline” (Google proposal):
            • Software-based mitigation for Spectre v2
            • New compiler feature + kernel patch (+software)
            • Issues with Skylake (improvements pending)
            • Preferred by Linux upstream (already merged)
          • Unclear what Red Hat will do…
        • All in all: there is no news of a stable solution yet; we need to wait
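
For sites that want to record which Spectre v2 mitigation a worker node is actually running while the situation settles, the following is a minimal sketch rather than a recommended tool. It assumes a Linux kernel recent enough (4.15 or later, or a vendor backport) to expose the status file under /sys/devices/system/cpu/vulnerabilities/. The function names and the coarse labels are illustrative and were not part of the meeting, and the exact wording of the status line differs between kernel versions and vendors.

```python
from pathlib import Path

# sysfs file where the kernel reports its Spectre v2 status
# (assumption: the running kernel provides it; older kernels do not).
SPECTRE_V2_STATUS = Path("/sys/devices/system/cpu/vulnerabilities/spectre_v2")


def classify_spectre_v2(status: str) -> str:
    """Map a kernel status line to a coarse label (labels are illustrative)."""
    text = status.strip().lower()
    if text.startswith("not affected"):
        return "not affected"
    if text.startswith("vulnerable"):
        # Checked before "retpoline": the kernel may name a partial
        # retpoline build while still reporting the system as vulnerable.
        return "unmitigated"
    if "retpoline" in text:
        # Checked before "ibrs": retpoline lines can also mention IBRS options.
        return "retpoline"
    if "ibrs" in text:
        return "ibrs"
    return "unknown"


def read_spectre_v2_status(path: Path = SPECTRE_V2_STATUS) -> str:
    """Return the raw status line, or an empty string if the file is unavailable."""
    try:
        return path.read_text()
    except OSError:
        return ""


if __name__ == "__main__":
    raw = read_spectre_v2_status()
    print(classify_spectre_v2(raw), "|", raw.strip())
```

Storing the raw status line next to each benchmark result keeps the vendor-specific detail that the coarse label discards.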
    • 14:10 14:30
      Meltdown-Spectre: updates on performance measurements 20m
      • Feedback from ATLAS 10m
        Speakers: Alaettin Serhan Mete (University of California Irvine (US)), James Catmore (University of Oslo (NO))
      • Feedback from CMS 10m
        Speaker: Jose Flix Molina (Centro de Investigaciones Energéticas Medioambientales y Tecno)
    • 14:30 14:50
      pre-GDB on Benchmarking: planning 20m
      Speaker: Domenico Giordano (CERN)

      pre-GDB:

      • Currently planned for April 2018 (2018-04-10)
        • Summary talk to GDB the day after (2018-04-11)

      List of proposed subjects, to be discussed today, in order to prepare the agenda

      Comments and proposals for other subjects or sub-topics are welcome.  

      Identify contributors to cover each topic.

      1. Identification and features of HEP reference workloads
        1. work strictly related to the System performance modelling WG 
           
      2. Studies of SPEC2017
        1. several presentations, potentially from independent studies
        2. comparison with HEP job mixes.
           
      3. Effects of Meltdown/Spectre
        1. as measured on dedicated hardware as well as on production clusters, a few months after the patches were deployed
           
      4. Performance evaluation on bare-metal servers vs cloud environments (on premises and/or commercial clouds)
         
      5. Benchmarking of GPUs (GPU adoption is gaining momentum!)
        1. workloads that experiments plan to run on GPUs (ML, tracking, etc.)
        2. benchmarks for GPUs (and their correlation with 5.1) 
           
      6. Benchmarking in HPC environments
        1. when used opportunistically for WLCG
        2. when running dedicated HPC workloads