Thematic CERN School of Computing - spring 2021

Name: Thematic CERN School of Computing - spring 2021
Start: 2021-06-14T09:00:00+02:00
End: 2021-06-18T17:00:00+02:00
Location: Online event

14 Jun 2021, 09:00 → 18 Jun 2021, 17:00 Europe/Zurich

Sebastian Lopienski (CERN), Joelma Tolomeo (CERN), Jarek Polok (CERN)

Description

The 8th Thematic CERN School of Computing (tCSC spring 2021) will take place on June 14-18, 2021 in an online format.

The theme of the School is "Scientific Software for Heterogeneous Architectures" - see the academic programme for more details.

The School is aimed at postgraduate students (i.e. holding at least a Bachelor's degree or equivalent), engineers and scientists with a few years' experience in particle physics, in computing, or in related fields. We welcome applications from all countries and nationalities.

Due to the ongoing Covid-19 pandemic, this School is organised in an online format. Nevertheless, we aim to create the usual rich, interactive learning experience for which the CERN School of Computing (CSC) has been known for years.

Important Dates

late March - applications open
Friday April 30 (midnight CEST) - deadline for application
Tuesday May 18 - you will be informed about the outcome of the selection by email
Monday June 14 to Friday June 18 - the school

CERN School of Computing

Computing.School@cern.ch

Surveys

Quiz for Track 2, Lecture 2 "Modern programming languages for HEP" by Sebastien Ponce

Quiz for Track 3, Lecture 1 "Scientific computing on heterogeneous architectures" by Dorothea vom Bruch

Quiz for Track 3, Lecture 2 "Programming for GPUs" by Dorothea vom Bruch

Monday 14 June
- 1
  Opening Session
  
  Speakers: Frederic Hemmer (CERN), Sebastian Lopienski (CERN)
  
  2021-06-14 - tCSC 2021.pdf
  
  2021.06 tCSCspring welcome.pdf
  
  Recording
  - a) Word from the CERN IT Department Head
    
    Speaker: Frederic Hemmer (CERN)
    
    Recording
  - b) Introduction to CERN School of Computing
    
    Speaker: Sebastian Lopienski (CERN)
    
    Recording
- 2
  
  Preparing for the HL-LHC computational challenge
  
  In this talk we will introduce some basic concepts related to HEP data processing and analysis workflows, seeing them in action in the context of LHC experiments. We’ll also talk about the evolution of the LHC accelerator and experiments. We’ll characterise at a high level the consequences of those upgrades for HEP data processing software, in particular in the context of an evolving hardware and computing infrastructure.
  
  Speaker: Danilo Piparo (CERN)
  
  Recording
  
  tCSC2021_Intro.pdf
- 10:30
  
  Coffee break
- 3
  Introduction to efficient computing
  Technologies and Platforms - lecture 1
  - The evolution of computing hardware and what it means in practice
  - The seven dimensions of performance
  - Controlling and benchmarking your computer and software
  - Software that scales with the hardware
  - Advanced performance tuning in hardware
  Speaker: Andrzej Nowak
  
  A.Nowak-01 Introduction to Efficient Computing-tCSC2021.pdf
  
  Mattermost discussion channel
  
  Recording
- 4
  Self-presentation: 1 minute per person
  School participants, lecturers and organizers
  (in alphabetical order):
  - Alfonsi Alice
  - Bachmayer Marie
  - Baptista de Souza Leite Juan
  - Barbetti Matteo
  - Barlou Maria
  - Brunner David
  - Bury Florian
  - Campora Daniel
  - Carrere Matthieu
  - Choi Wonqook
  - Chug Neha
  - Connor Patrick
  - Cristella Leonardo
  - De Simoni Micol
  - Fargier Sylvain
  - Favoni Matteo
  - Ferencek Dinko
  - Galli Massimiliano
  - Garcia Chavez Tonatiuh
  - Gilman Alexander Leon
  - Hedia Sassia
  - Lasaosa Garcia Clara
  - Leon Coello Moises David
  - Lopienski Sebastian
  Continued in the afternoon session (right after the lunch break):
  - Manfreda Alberto
  - Mania Georgiana
  - Martikainen Laura
  - Mishra Saswat
  - Mostafa Jalal
  - Ouvrard Xavier Eric
  - Padulano Vincenzo
  - Piparo Danilo
  - Polok Jarek
  - Ponce Sebastien
  - Popescu Andrei
  - Pournaghi Atousa
  - Rafanoharana Dimbiniaina
  - Reid Tres
  - Shchedrolosiev Mykyta
  - Sobol Bartosz
  - Storetvedt Maksim Melnik
  - Sunneborn Gudnadottir Olga
  - Tolomeo Joelma
  - Triantafyllou Natalia
  - Vage Liv Helen
  - Vnuchenkot Anna
  - vom Bruch Dorothea
  Self-presentation-One slide compilation-tCSC-Spring-2021_14.06.2021 compact.pdf
- 5
  
  Self-presentation: 1 minute per person
  
  (continued from the session before lunch: https://indico.cern.ch/event/1017080/contributions/4291663/)
- 14:30
  
  Coffee break
- 6
  
  Group assignment for Track 1: Technologies and Platforms
  
  The goal of this exercise is to provoke you into thinking about some of the key choices in computing.
  
  The scenario
  Modern scientific experiments are massive producers of data. Imagine that you’re a computing manager for one such experiment, which produces 100 terabits of raw data per second and has no computing infrastructure yet. Your task is to use your current knowledge to conceptualize data processing for your experiment and, in the process, to uncover important choices to make.
  
  The challenge
  Focus on key aspects of compute and software, and less so on networks, accelerators or data flows. What kind of considerations, tradeoffs and assumptions would you have to take into account?
  
  What kind of equipment would you use, where would you put it and why? What kind of software would you run? What do you think would be the rough purchase and maintenance cost and effort? Can you identify gaps in your current knowledge that you would need to fill in?
  
  What we expect
  You're not expected to have all the answers! In many cases, just listing the important questions can already be helpful. Seasoned professionals can spend as long as 10 years of their careers making such a plan for a single experiment.
  
  Try to answer the challenge in conceptual terms, and using rough estimates. When faced with unknowns, you can make assumptions – make sure to clearly specify when that’s the case. It’s best if you present your solution on the basis of a 1-slide diagram illustrating key concepts and components, but it’s not a requirement.
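
  As an illustration of the kind of rough estimate meant here, the sketch below converts the scenario's 100 terabits per second into terabytes per second and exabytes per day. It assumes decimal (SI) units and continuous, uninterrupted running; both are assumptions made for illustration, not part of the assignment. The function names are hypothetical.

```python
# Back-of-envelope data volume for the scenario's raw data rate.
# Assumptions (not from the assignment): SI units, detector runs 24 hours a day.

RAW_RATE_TBIT_PER_S = 100        # raw data rate stated in the scenario
BITS_PER_BYTE = 8
SECONDS_PER_DAY = 24 * 60 * 60   # 86 400 seconds


def terabytes_per_second(rate_tbit_per_s):
    """Convert a rate in terabits per second to terabytes per second."""
    return rate_tbit_per_s / BITS_PER_BYTE


def exabytes_per_day(rate_tbit_per_s):
    """Volume accumulated in one day of continuous running, in exabytes."""
    tb_per_day = terabytes_per_second(rate_tbit_per_s) * SECONDS_PER_DAY
    return tb_per_day / 1_000_000  # 1 EB = 10^6 TB


print(terabytes_per_second(RAW_RATE_TBIT_PER_S))  # 12.5
print(exabytes_per_day(RAW_RATE_TBIT_PER_S))      # 1.08
```

  Under these assumptions the experiment would produce roughly one exabyte of raw data per day, which makes the question of what to keep and what to discard a natural starting point for the discussion.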
  
  Group assignment activity.pdf
  
  Mattermost discussion channel
Tuesday 15 June
- 7
  Writing parallel software
  Parallel and Optimised Scientific Software - lecture 1
  - Amdahl's and Gustafson's laws
  - Asynchronous execution
  - Finding concurrency, task vs. data parallelism
  - Using threading in C++ and Python, comparison with multi-process
  - Resource protection and thread safety
  - Locks, thread local storage, atomic operations
  Speaker: Danilo Piparo (CERN)
  
  Mattermost discussion channel
  
  Recording
  
  writingParallelSoftware.pdf
- 8
  Modern programming languages for HEP
  Parallel and Optimised Scientific Software - lecture 2
  - Why Python and C++ ?
  - Recent evolutions: C++ 11/14/17
  - Modern features of C++ related to performance
  - Templating versus inheritance, pros and cons of virtual inheritance
  - Python 3, and switching from Python 2
  Speaker: Sebastien Ponce (CERN)
  
  Course and exercise sources
  
  Mattermost discussion channel
  
  ModernProgramming-handout.pdf
  
  ModernProgramming-pres.pdf
  
  Recording
- 11:00
  
  Coffee break
- 9
  Optimizing existing large codebase
  Parallel and Optimised Scientific Software - lecture 3
  - Measuring performance, tools and key indicators
  - Improving memory handling
  - The nightmare of thread safety
  - Code modernization and low level optimizations
  - Data structures for efficient computation in modern C++
  Speaker: Sebastien Ponce (CERN)
  
  Course and exercise sources
  
  LargeCodeOptimization-handout.pdf
  
  LargeCodeOptimization-pres.pdf
  
  Mattermost discussion channel
  
  Recording
- 10
  School photo
  We will be taking a "group photo" of the school - a picture of the participants connected to the Zoom room. This group photo, containing a lot of small but recognizable pictures of individual participants, will afterwards be published on the school website (and possibly, in other CERN publications).
  - If you want to be part of the group photo, please enable your camera when we take the photo (technically, screenshots of the Zoom gallery view).
  - If you prefer not to be included in this group photo, please just keep your camera off.
  In any case, your name will not appear in the final edited group photo.
- 11
  
  Parallel and optimised scientific software - exercise introduction
  
  Optimisation of an existing, production grade large codebase
  
  Speaker: Sebastien Ponce (CERN)
  
  exercise.pdf
  
  Mattermost discussion channel
  
  Recording
- 12
  
  Parallel and optimised scientific software - exercise
  
  Optimisation of an existing, production grade large codebase
  
  Speakers: Sebastien Ponce (CERN), Arthur Hennequin (CNRS)
  
  exercise.pdf
  
  Mattermost discussion channel
- 13
  
  Special evening talk: Future of the Universe and of Humanity
  
  Speaker: Ivica Puljak (University of Split)
  
  Mattermost discussion channel
  
  Recording
Wednesday 16 June
- 14
  Data-oriented design
  Technologies and Platforms - lecture 3
  - Hardware vectorization in detail – theory vs. practice
  - Software design for vectorization and smooth data flow
  - How can compilers and other tools help?
  Speaker: Andrzej Nowak
  
  A.Nowak-03 Data Oriented Design-tCSC2021.pdf
  
  Mattermost discussion channel
  
  Recording
- 15
  Practical vectorization
  Parallel and Optimised Scientific Software - lecture 4
  - Measuring vectorization level
  - What to expect from vectorization
  - Preparing code for vectorization
  - Vectorizing techniques in C++: intrinsics, libraries, autovectorization
  Speaker: Sebastien Ponce (CERN)
  
  Course and exercise sources
  
  Mattermost discussion channel
  
  PracticalVectorization-handout.pdf
  
  PracticalVectorization-pres.pdf
  
  Recording
- 11:00
  
  Coffee break
- 16
  Scientific computing on heterogeneous architectures
  Programming for Heterogeneous Architectures - lecture 1
  - Introduction to heterogeneous architectures and the performance challenge
  - From general to specialized: Hardware accelerators and applications
  - Type of workloads ideal for different accelerators
  - Trade-offs between multi-core and many-core architectures
  - Implications of heterogeneous hardware on the design and architecture of scientific software
  - Embarrassingly parallel scientific applications in HPC and CERN
  Speaker: Dorothea vom Bruch (CPPM/CNRS)
  
  Mattermost discussion channel
  
  Recording
  
  vom_Bruch_scientific_computing_heterogeneous_architectures_v3.pdf
- 17
  
  Group assignment for Track 2: Parallel and Optimised Scientific Software
  
  Topic for group 1: Large software systems are more and more difficult to maintain over the years. In addition, programming languages evolve and the relevant expertise is lost (e.g. about programming in Fortran). When is the right moment to restart a large software project from scratch? Is it possible at all?
  
  Topic for group 2: How to ensure long-term data preservation? Today, we can read the writings of Newton, and redo his computations. However, in 300 years, will someone be able to rerun today's software? How to make it happen? Is it feasible at all?
  
  Topic for group 3: How to ensure good test coverage of a large code base? How to test software that will run on thousands of machines concurrently?
  
  Topic for group 4: A lot of poor-quality code and bugs are introduced in physics software because non-expert software developers lack knowledge of programming languages. How can we better spread computer science knowledge and best practices in large scientific collaborations?
  
  Topic for group 5: Every now and then, new hardware or software appears, often with very promising prospects. However, the risk is that it disappears within a few years (think of object-oriented databases, Google glasses etc.). How to benefit from the latest technologies without jeopardising a multi-decade project?
  
  Topic for group 6: What's the impact of hardware evolution and choices on software and programming languages? Is it realistic to have hardware-agnostic programming languages?
  
  Topic for group 7: According to Donald Knuth, "Premature optimization is the root of all evil". Have you ever had similar experiences? How to decide when is a good moment to optimise, and what to optimise?
  
  Mattermost discussion channel
- 15:30
  
  Coffee break
- 18
  Student lightning talks
  
  Speakers: Bartosz Marek Sobol (Jagiellonian University Krakow), David Brunner (DESY), Florian Bury (Catholic University of Louvain), Georgiana Mania (DESY), Micol De Simoni (Sapienza University of Rome)
  
  1 - DeSimoni.pdf
  
  2 - Sobol_tCSC_track_recon_short.pdf
  
  3 - FlorianBury_tCSC.pdf
  
  4 - Mania_tCSC2021.pdf
  
  5 - Brunner_tCSC21_PyTorchCppApi.pdf
  
  Mattermost discussion channel
  
  Recording - Bartosz Marek Sobol - tCSC2021 - Track reconstruction on heterogeneous architectures with SYCL
  
  Recording - Brunner David - tCSC21_PyTorchCppApi.pdf
  
  Recording - Florian Bury - tCSC2021 - Matrix Element Regression with Deep Neural Networks
  
  Recording - Georgiana Mania - tCSC2021 - Exploring Heterogeneous Architectures
  
  Recording - Micol De Simoni - tCSC2021 - FRED: a fast Monte Carlo code on GPU
  
  Student lightning talks.pdf
  - a) FRED: a fast Monte Carlo code on GPU for Treatment Planning Software
    
    In this presentation, I would like to talk about FRED, the fast MC code that was the focus of my PhD and is now the focus of my postdoc. FRED is a fast MC that runs on GPU and has been developed for medical applications. I will briefly describe the framework in which FRED was developed, explain why we need a fast MC in medical applications, and give some information about its state of the art (performance, what we can track, next goals).
    
    Speaker: Micol De Simoni (Sapienza University of Rome)
    
    Recording
  - b) Track reconstruction on heterogeneous architectures with SYCL
    
    With modern physics experiments comes the need to process more and more data. In this talk, I briefly present my latest work on online data processing in the DAQ system of the PANDA experiment at FAIR / GSI, Darmstadt, Germany, using heterogeneous computing platforms and the SYCL programming model, as well as the research’s challenges and goals.
    
    Speaker: Bartosz Sobol (Jagiellonian University Krakow)
    
    Recording
  - c) Matrix Element Regression with Deep Neural Networks -- breaking the CPU barrier
    
    The Matrix Element Method (MEM) is a powerful method to extract information from measured events at collider experiments. Compared to multivariate techniques built on large sets of experimental data, the MEM does not rely on an examples-based learning phase but directly exploits our knowledge of the physics processes. This comes at a price, both in terms of complexity and computing time, since the required multi-dimensional integral of a rapidly varying function needs to be evaluated for every event and physics process considered. This can be mitigated by optimizing the integration, as is done in the MoMEMta package, but the computing time remains a concern, and often makes the use of the MEM in full-scale analysis impractical or impossible. We investigate in this paper the use of a Deep Neural Network (DNN) built by regression of the MEM integral as an ansatz for analysis, especially in the search for new physics.
    
    Speaker: Florian Bury (Catholic University of Louvain)
    
    Recording
  - d) Exploring Heterogeneous Architectures in Track Reconstruction Software
    
    Track reconstruction algorithms are computationally intensive due to their combinatorial nature and pose a great challenge for the HL-LHC era. The estimated compute time will not fit the budget unless the code becomes more efficient and highly parallel. Exploring heterogeneous architectures is at the core of this change, and current R&D efforts (mostly focused on CUDA) show promising results, but the goal hasn't been reached yet. The talk introduces the problem(s) and some notable published results, and connects them to my PhD research topic.
    
    Speaker: Georgiana Mania (DESY)
    
    Recording
  - e) PyTorch C++ API
    
    The most popular language for machine learning is Python. This presentation shows an alternative interface, written in C++, provided by PyTorch. It discusses the similarities and differences between the C++ and Python APIs of PyTorch, and their pros and cons for use in physics analysis.
    
    Speaker: David Brunner (DESY)
    
    Recording
- 19
  
  Parallel and optimised scientific software - exercise debriefing
  
  Optimisation of an existing, production grade large codebase
  
  Speaker: Sebastien Ponce (CERN)
  
  Mattermost discussion channel
  
  Recording
Thursday 17 June
- 20
  Hardware evolution and heterogeneity
  Technologies and Platforms - lecture 2
  - Accelerators, co-processors, heterogeneity
  - Memory architectures, hardware caching and NUMA
  - Compute devices: CPU, GPU, FPGA, ASIC etc.
  - The role of compilers
  Speaker: Andrzej Nowak
  
  A.Nowak-02 Hardware Evolution and Heterogeneity-tCSC2021.pdf
  
  Mattermost discussion channel
  
  Recording
- 21
  Programming for GPUs
  Programming for Heterogeneous Architectures - lecture 2
  - From SIMD to SPMD, a programming model transition
  - Thread and memory organization
  - Basic building blocks of a GPU program
  - Control flow, synchronization, atomics
  Speaker: Dorothea vom Bruch (CPPM/CNRS)
  
  Mattermost discussion channel
  
  Recording
  
  vom_Bruch_programming_for_GPUs_v2.pdf
- 11:00
  
  Coffee break
- 22
  Performant programming for GPUs
  Programming for Heterogeneous Architectures - lecture 3
  - Data locality, coalesced memory accesses, tiled data processing
  - GPU streams, pipelined memory transfers
  - Under the hood: branchless, warps, masked execution
  - Debugging and profiling a GPU application
  Speaker: Daniel Campora (University of Maastricht)
  
  dcampora_performant_programming_for_gpus.pdf
  
  Mattermost discussion channel
  
  Recording
- 23
  
  Programming for heterogeneous architectures - exercise introduction
  
  Speakers: Daniel Campora (University of Maastricht), Dorothea vom Bruch (CPPM/CNRS)
  
  GPU_exercise_introduction.pdf
  
  Mattermost discussion channel
- 24
  
  Programming for heterogeneous architectures - exercise
  
  Speakers: Daniel Campora (University of Maastricht), Dorothea vom Bruch (CPPM/CNRS)
  
  Mattermost discussion channel
Friday 18 June
- 25
  
  Group assignment for Track 3: Programming for heterogeneous architectures
  
  The scenario
  
  Imagine you have to process 400 terabits of raw data per second at a future HEP experiment. Assume the data arrives in a data center, with the information from all sub-detectors already combined for every event. There is no strict latency requirement, i.e. you can use deep buffers inside the servers to store the data until a decision is taken.
  
  The physics you are interested in requires the most complete knowledge of the event possible: ideally track reconstruction, particle identification, calorimeter reconstruction, particle building, maybe jet reconstruction, and possibly other objects you would like the trigger to reconstruct to make your analysis more sensitive.
  
  The task
  
  Design a trigger system which reduces the rate by at least a factor of 1000!
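
  To make the target concrete, the sketch below computes the data rate left after one or more selection stages, starting from the scenario's 400 terabits per second. The two-stage split into factors of 40 and 25 is a purely hypothetical example chosen so that the product is 1000; it is not a recommended design, and the function name is an assumption.

```python
# Data rate remaining after successive trigger stages.
# The stage factors used below are hypothetical illustrations only.

INPUT_RATE_TBIT_PER_S = 400   # raw data rate stated in the scenario
REQUIRED_REDUCTION = 1000     # minimum overall reduction asked for in the task


def rates_after_stages(input_rate_tbit_per_s, reduction_factors):
    """Return the rate (in Tbit/s) leaving each stage, in order."""
    rates = []
    rate = input_rate_tbit_per_s
    for factor in reduction_factors:
        rate = rate / factor
        rates.append(rate)
    return rates


# Single-step selection: everything must be decided at the full input rate.
print(rates_after_stages(INPUT_RATE_TBIT_PER_S, [REQUIRED_REDUCTION]))  # [0.4]

# Two-step selection (hypothetical split): the second stage only sees 10 Tbit/s.
print(rates_after_stages(INPUT_RATE_TBIT_PER_S, [40, 25]))  # [10.0, 0.4]
```

  Either way, the output rate must not exceed 0.4 terabits per second; what changes between the two cases is how much data the later, more expensive reconstruction steps have to handle.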
  
  Each group will choose and discuss one of the three following topics:
  
  Topic 1
  
  Can you achieve the data reduction in a single step? What are the advantages / disadvantages of multiple selection steps?
  
  Which computing architecture(s) would you choose for your data center? Would you offload parts of the workload to accelerators? If yes, which ones?
  
  Topic 2
  
  What would be the dataflow of your DAQ system? How would you model the data processing and relations between the different parts of your system?
  
  How would you ensure the pipelining between accelerators and the servers? Are there any bottlenecks to consider? How would you address them?
  
  Topic 3
  
  How does the detector layout influence the data flow (e.g. would you rather have a homogeneous detector where all particles pass through the same detectors, or one with different sub-detectors in different regions of phase space)? Do you have recommendations for the detector design that would allow for a more performant DAQ system, depending on the architectures you choose for your system?
  
  Group assignment instructions
  
  Mattermost discussion channel
- 10:30
  
  Coffee break
- 26
  Design patterns and best practices
  Programming for Heterogeneous Architectures - lecture 4
  - Good practices: single precision, floating point rounding, avoid register spilling, prefer single source
  - Other standards: SYCL, HIP, OpenCL
  - Middleware libraries and cross-architecture compatibility
  - Reusable parallel design patterns with real-life applications
  Speaker: Daniel Campora (University of Maastricht)
  
  dcampora_design_patterns_and_best_practices.pdf
  
  Mattermost discussion channel
  
  Recording
- 27
  
  Programming for heterogeneous architectures - exercise debriefing
  
  Speakers: Daniel Campora (University of Maastricht), Dorothea vom Bruch (CPPM/CNRS)
  
  Mattermost discussion channel
- 28
  Summary and future technologies overview
  Technologies and Platforms - lecture 4
  - Teaching program summary and wrap-up
  - Next-generation memory technologies and interconnect
  - Future computing evolution
  Speaker: Andrzej Nowak
  
  A.Nowak-04 Future Directions and Wrap-up-tCSC2021.pdf
  
  Recording
- 15:00
  
  Coffee break
- 29
  
  Exam
  
  Exam.pdf
- 30
  
  Closing Session
  
  Speaker: Sebastian Lopienski (CERN)
  
  2021.06 tCSCspring closing.pdf
  
  2021.06 tCSCspring closing.pptx
