inverted CERN School of Computing 2016

Europe/Zurich
31/3-004 - IT Amphitheatre (CERN)

Alberto Pace (CERN), Catharine Noble (CERN), Nikos Kasioumis (CERN), Sebastian Lopienski (CERN)
Description

CSC2016

"Where students turn into teachers"

Involving former CSC participants to deliver advanced education

Webcast
There is a live webcast for this event
    • 09:10 09:30
      Introduction to the inverted CSC 31/3-004 - IT Amphitheatre

      Convener: Alberto Pace (CERN)
    • 09:30 10:30
      Continuous Integration: how can it help? 1h 31/3-004 - IT Amphitheatre

      Software becomes more complex as the project size and the number of developers grow. As these two factors increase, so does the potential for errors in the code. Precious time and money can be wasted tracking down bugs that could easily have been avoided had a good workflow been in place.

      Continuous Integration (CI) is one strategy that can dramatically improve software quality. This lecture takes an in-depth look at what CI is and explores its fundamental concepts. It covers various scenarios of how CI can be incorporated into different types of projects. There are many CI software packages on the market, and it is not always easy to choose the one best suited to your project. Some main points to keep in mind when beginning to implement CI in your project will also be discussed.

      Speaker: Joshua Wyatt Smith (Georg-August-Universitaet Goettingen (DE))
    • 10:30 11:00
      Coffee 30m 31/3-009 - IT Amphitheatre Coffee Area

    • 11:00 12:00
      Continuous Delivery and Quality Monitoring 1h 31/3-004 - IT Amphitheatre

      We are all involved in some software or physics projects. As a rule of thumb, projects start out really simple: a couple of scripts and classes and a few external dependencies. At this stage, delivering a release to our clients is simple. We can compile the project locally and deliver the compiled sources, for example by e-mail. Unfortunately, in most cases the growth of a project is inevitable, and our simple approaches to building, testing and delivering applications are no longer sufficient. We start to spend more time on these ‘administrative’ procedures than on real development. As the project grows, our productivity declines and we become less responsive to requests from our clients.

      In this lecture I will present common delivery patterns and tools that facilitate these processes. After introducing Continuous Delivery, I will switch topic and try to answer the question of how much we should invest in quality and how to do it efficiently. In my experience, software quality is often regarded as a force that slows development down. I would like to counter this false belief and convince people that software quality can accelerate development within our projects.
      Speaker: Kamil Henryk Krol (CERN)
    • 12:00 13:30
      Lunch 1h 30m Restaurant II

    • 13:30 13:45
      A word from the IT Department Head 15m 31/3-004 - IT Amphitheatre

      Speaker: Frederic Hemmer (CERN)
    • 13:45 14:45
      I - Template Metaprogramming for Massively Parallel Scientific Computing - Expression Templates 1h 31/3-004 - IT Amphitheatre

      Large-scale scientific computing raises questions on different levels, ranging from the formulation of the problems to the choice of the best algorithms and their implementation for a specific platform. There are similarities between these topics that can be exploited by modern-style C++ template metaprogramming techniques to produce readable, maintainable and generic code. Traditional low-level code tends to be fast but platform-dependent, and it obfuscates the meaning of the algorithm. An object-oriented approach, on the other hand, is nice to read but may come with an inherent performance penalty.

      These lectures aim to present the basics of the Expression Template (ET) idiom, which allows us to keep the object-oriented approach without sacrificing performance. In particular, we will show how to enhance ET to include SIMD vectorization. We will then introduce techniques for abstracting iteration, and introduce thread-level parallelism for use in heavy data-centric loads. We will show how to apply these methods in a way that keeps the "front end" code very readable.

      LECTURE 1: In this lecture, we will have a quick look at the basics of Template Metaprogramming: how the compiler handles "types", and how these types map more or less naturally to physical quantities. We will then introduce the idea of using Expression Templates (ET) as a means to bridge the gap between the low-level, high-performance approach and the object-oriented, readable but often severely under-performing approach. We will study the structure of ETs and show the basic steps needed to build them.
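The core idea behind expression templates, deferring arithmetic until the whole expression is known, can be illustrated with a minimal sketch. The lectures concern C++ compile-time templates; what follows is only a run-time analogue written in Python, and every class and method name in it (`Expr`, `Vec`, `BinaryOp`, `at`, `assign`) is hypothetical rather than taken from the lecture material.

```python
# Run-time analogue of expression templates: operators build a lightweight
# expression tree, and elements are computed only when the result is assigned.
# All names here are hypothetical illustrations.

class Expr:
    """Base class: arithmetic operators return deferred nodes, not results."""
    def __add__(self, other):
        return BinaryOp(self, other, lambda a, b: a + b)

    def __mul__(self, other):
        return BinaryOp(self, other, lambda a, b: a * b)


class Vec(Expr):
    """Leaf node that owns actual data."""
    def __init__(self, data):
        self.data = list(data)

    def __len__(self):
        return len(self.data)

    def at(self, i):
        return self.data[i]

    def assign(self, expr):
        # Single fused loop: each element of the whole expression is computed
        # once, with no temporary vectors for the sub-expressions.
        self.data = [expr.at(i) for i in range(len(expr))]
        return self


class BinaryOp(Expr):
    """Interior node: remembers its operands and the element-wise operation."""
    def __init__(self, lhs, rhs, op):
        if len(lhs) != len(rhs):
            raise ValueError("size mismatch")
        self.lhs, self.rhs, self.op = lhs, rhs, op

    def __len__(self):
        return len(self.lhs)

    def at(self, i):
        return self.op(self.lhs.at(i), self.rhs.at(i))


a = Vec([1.0, 2.0, 3.0])
b = Vec([4.0, 5.0, 6.0])
c = Vec([0.5, 0.5, 0.5])

expr = a + b * c                      # builds a tree; no arithmetic has happened yet
result = Vec([0.0] * 3).assign(expr)  # one pass over the elements
```

In the C++ idiom the tree is encoded in the types themselves, so the compiler can inline the element-wise computation; the Python version only shows the structure (leaf nodes, operator nodes, evaluation on assignment).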
      Speaker: Jiří Vyskočil (Czech Technical University in Prague)
    • 14:45 15:45
      I - Multivariate Classification and Machine Learning in HEP 1h 31/3-004 - IT Amphitheatre

      Traditional multivariate methods for classification (Stochastic Gradient Boosted Decision Trees and Multi-Layer Perceptrons) are explained in theory and practice using examples from HEP. General aspects of multivariate classification are discussed, in particular different regularisation techniques. Afterwards, data-driven techniques are introduced and compared to MC-based methods.
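As a rough illustration of stochastic gradient boosting, the sketch below boosts one-threshold decision trees ("stumps") on a single feature. It assumes a squared-error loss, a toy data set and hypothetical function names (`fit_stump`, `predict`, `train`); it is not the implementation used in the lecture. Shrinkage (`learning_rate`) and row subsampling (`subsample`) are two common regularisation knobs.

```python
import random

def fit_stump(xs, residuals):
    """Least-squares decision stump: one threshold, one constant per side."""
    best = None
    order = sorted(set(xs))
    # Candidate thresholds lie halfway between neighbouring feature values.
    thresholds = [(lo + hi) / 2 for lo, hi in zip(order, order[1:])]
    for t in thresholds:
        left = [r for x, r in zip(xs, residuals) if x < t]
        right = [r for x, r in zip(xs, residuals) if x >= t]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    _, t, lmean, rmean = best
    return t, lmean, rmean

def predict(model, x):
    """Sum the shrunken outputs of all stumps; the sign gives the class."""
    return sum(lr * (l if x < t else r) for lr, t, l, r in model)

def train(xs, ys, rounds=50, learning_rate=0.5, subsample=0.5, seed=1):
    rng = random.Random(seed)
    model, losses = [], []
    n = len(xs)
    k = max(2, int(round(subsample * n)))
    for _ in range(rounds):
        scores = [predict(model, x) for x in xs]
        # For squared-error loss the negative gradient is the plain residual.
        residuals = [y - s for y, s in zip(ys, scores)]
        losses.append(sum(r * r for r in residuals) / n)
        idx = rng.sample(range(n), k)  # the "stochastic" part: fit on a random subset
        t, l, r = fit_stump([xs[i] for i in idx], [residuals[i] for i in idx])
        model.append((learning_rate, t, l, r))
    return model, losses

# Toy data: "signal" (+1) sits in the middle of the feature range,
# so no single stump can separate it from "background" (-1).
xs = [0.05 + 0.1 * i for i in range(10)]
ys = [1 if 3 <= i <= 6 else -1 for i in range(10)]
model, losses = train(xs, ys)
```

With `subsample=1.0` every stump is fitted on the full sample and the recorded training loss cannot increase from one round to the next; with smaller values each round sees only part of the data, which adds randomness that usually helps against overtraining.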
      Speaker: Thomas Keck (KIT)
    • 15:45 16:15
      Coffee 30m 31/3-009 - IT Amphitheatre Coffee Area

    • 16:15 17:15
      Formal Verification - Robust and Efficient Code: Introduction to Formal Verification 1h 31/3-004 - IT Amphitheatre

      LECTURE 1: We will establish two general approaches to Formal Verification (FV), model checking and theorem proving, and where they are applicable. We will explore the latter in more detail and take a brief look at the underlying theory, predicate logic. We will see how this family of logic systems can be used to prove abstract properties of our programs and why this is useful. Practical examples will be presented and explained.

      This talk aims to introduce the concepts of Formal Verification and how they can be used to the benefit of the programmer to produce robust and efficient code. We will look into the subject at two levels: an overview of what FV can concretely bring programmers, and the nitty-gritty details of theorem proving, one of the methods used for FV.

      In general, FV means "proving that certain properties hold for a given system using formal mathematics". This definition can certainly feel daunting; however, as we will learn, we can reap benefits from the paradigm without digging too deep into the subject.

      Examples where FV can help include proving that your code cannot raise division-by-zero exceptions, producing optimised byte code where the optimisations are proven to be safe, and helping to reason about concurrent systems.
      Speaker: Kim Albertsson (CERN)
    • 09:00 10:00
      I - Event reconstruction in Modern Particle Physics 1h 31/3-004 - IT Amphitheatre

      Particle physics experiments have always been at the forefront of big data: the upgrade of the LHCb experiment will lead to data rates greater than 10 Tb per second! This is key to the success of high-energy physics, where large data samples, sophisticated triggers and robust simulations have led to observing and understanding extremely rare events, including the Higgs boson. Physicists continually revisit computing and electronics decisions to balance the quality and quantity of physics results against computing effort and available budgets. By drawing on examples of modern particle physics experiments, these lectures will consider the various approaches to tackling such large particle physics data problems:

        • Data reduction at the hardware level, including triggers.
        • Principles and optimizations of reconstruction algorithms.
        • Parallelized reconstruction, including sub-processing vs. multithreading.

      These topics will be introduced in order, beginning with the raw data output from these large experiments, passing through the different stages of data reconstruction and reduction, and leading to examples of physics results. The lecture series will end with an outlook to future technologies and the associated physics.
      Speaker: Daniel Martin Saunders (University of Bristol (GB))
    • 10:00 11:00
      I - Detector Simulation for the LHC and beyond: how to match computing resources and physics requirements 1h 31/3-004 - IT Amphitheatre

      Detector simulation at the LHC is one of the most computing-intensive activities. In these lectures we will show how physics requirements were met for the LHC experiments and extrapolate to future experiments (FCC-hh case). At the LHC, detectors are complex, very precise and ambitious: this calls for modern modelling tools for geometry and response. Events are busy and characterised by an unprecedented energy scale, with hundreds of particles to be traced and high-energy showers to be accurately simulated. Furthermore, high luminosities imply many events in a bunch crossing and many bunch crossings to be considered at the same time. In addition, backgrounds not directly correlated to bunch crossings must also be taken into account. Solutions chosen for ATLAS (a mixture of detailed simulation and fast simulation/parameterisation) will be described, and CPU and memory figures will be given. An extrapolation to the FCC-hh case will be attempted, taking the calorimeter simulation as an example.
      Speaker: Valentina Cairo (Universita della Calabria (IT))
    • 11:00 11:30
      Coffee break 30m
    • 11:30 12:30
      II - Template Metaprogramming for Massively Parallel Scientific Computing - Vectorization with Expression Templates 1h 31/3-004 - IT Amphitheatre

      Large-scale scientific computing raises questions on different levels, ranging from the formulation of the problems to the choice of the best algorithms and their implementation for a specific platform. There are similarities between these topics that can be exploited by modern-style C++ template metaprogramming techniques to produce readable, maintainable and generic code. Traditional low-level code tends to be fast but platform-dependent, and it obfuscates the meaning of the algorithm. An object-oriented approach, on the other hand, is nice to read but may come with an inherent performance penalty.

      These lectures aim to present the basics of the Expression Template (ET) idiom, which allows us to keep the object-oriented approach without sacrificing performance. In particular, we will show how to enhance ET to include SIMD vectorization. We will then introduce techniques for abstracting iteration, and introduce thread-level parallelism for use in heavy data-centric loads. We will show how to apply these methods in a way that keeps the "front end" code very readable.

      LECTURE 2: In this lecture, we will have a closer look at the opportunities for implementing SIMD vectorisation through the Expression Template idiom. We will see how it can create a layer of separation between the algorithm and the low-level implementation. We will use the C++ template mechanisms to arrange our program so that the algorithm itself does not need to explicitly specify SIMD-related types, alignment, or operations. We will also explore how the layout of our data structures in memory affects SIMD performance in different workloads, and introduce methods which improve performance in specific cases.
      Speaker: Jiří Vyskočil (Czech Technical University in Prague)
    • 12:30 14:00
      Lunch 1h 30m Restaurant II

    • 14:00 15:00
      II - Multivariate Classification and Machine Learning in HEP 1h 31/3-004 - IT Amphitheatre

      A summary of the history of deep learning is given and the difference from traditional artificial neural networks is discussed. Advanced methods like convolutional neural networks, recurrent neural networks and unsupervised training are introduced. Interesting examples from this emerging field outside HEP are presented. Possible applications in HEP are discussed.
      Speaker: Thomas Keck (KIT)
    • 15:00 16:00
      Formal Verification - Robust and Efficient Code: Why Formal Verification 1h 31/3-004 - IT Amphitheatre

      LECTURE 2: In this lecture we will expand on the concepts of the previous lecture and establish formal methods in a broader context, ignoring implementation detail. We will investigate how and where these methods are used today, and where they might be used tomorrow. As concrete examples we will study how Formal Verification (FV) can benefit static analysis, and CompCert, a verified C compiler.

      This talk aims to introduce the concepts of Formal Verification and how they can be used to the benefit of the programmer to produce robust and efficient code. We will look into the subject at two levels: an overview of what FV can concretely bring programmers, and the nitty-gritty details of theorem proving, one of the methods used for FV.

      In general, FV means "proving that certain properties hold for a given system using formal mathematics". This definition can certainly feel daunting; however, as we will learn, we can reap benefits from the paradigm without digging too deep into the subject.

      Examples where FV can help include proving that your code cannot raise division-by-zero exceptions, producing optimised byte code where the optimisations are proven to be safe, and helping to reason about concurrent systems.

      Speaker: Kim Albertsson (CERN)
    • 16:00 16:30
      Coffee break 30m
    • 16:30 17:30
      Accelerating C++ applications in Medical Physics 1h 31/3-004 - IT Amphitheatre

      Recent developments in multithreading tools for C++, such as OpenMP and TBB, take advantage of the multicore architecture of modern processors and have enabled the creation and improvement of powerful software for scientific research. This talk will focus on the development of such software for simulation, data acquisition and image reconstruction in Positron Emission Tomography, one of the most powerful tools for cancer detection.
      Speaker: Pedro Manuel Mendes Correia (University of Aveiro (PT))
    • 09:00 10:00
      II - Event reconstruction in Modern Particle Physics 1h 31/3-004 - IT Amphitheatre

      Particle physics experiments have always been at the forefront of big data: the upgrade of the LHCb experiment will lead to data rates greater than 10 Tb per second! This is key to the success of high-energy physics, where large data samples, sophisticated triggers and robust simulations have led to observing and understanding extremely rare events, including the Higgs boson. Physicists continually revisit computing and electronics decisions to balance the quality and quantity of physics results against computing effort and available budgets. By drawing on examples of modern particle physics experiments, these lectures will consider the various approaches to tackling such large particle physics data problems:

        • Data reduction at the hardware level, including triggers.
        • Principles and optimizations of reconstruction algorithms.
        • Parallelized reconstruction, including sub-processing vs. multithreading.

      These topics will be introduced in order, beginning with the raw data output from these large experiments, passing through the different stages of data reconstruction and reduction, and leading to examples of physics results. The lecture series will end with an outlook to future technologies and the associated physics.
      Speaker: Daniel Martin Saunders (University of Bristol (GB))
    • 10:00 11:00
      II - Detector Simulation for the LHC and beyond: how to match computing resources and physics requirements 1h 31/3-004 - IT Amphitheatre

      Detector simulation at the LHC is one of the most computing-intensive activities. In these lectures we will show how physics requirements were met for the LHC experiments and extrapolate to future experiments (FCC-hh case). At the LHC, detectors are complex, very precise and ambitious: this calls for modern modelling tools for geometry and response. Events are busy and characterised by an unprecedented energy scale, with hundreds of particles to be traced and high-energy showers to be accurately simulated. Furthermore, high luminosities imply many events in a bunch crossing and many bunch crossings to be considered at the same time. In addition, backgrounds not directly correlated to bunch crossings must also be taken into account. Solutions chosen for ATLAS (a mixture of detailed simulation and fast simulation/parameterisation) will be described, and CPU and memory figures will be given. An extrapolation to the FCC-hh case will be attempted, taking the calorimeter simulation as an example.
      Speaker: Valentina Cairo (Universita della Calabria (IT))
    • 11:00 11:30
      Coffee break 30m 31/3-004 - IT Amphitheatre

    • 11:30 12:30
      III - Template Metaprogramming for Massively Parallel Scientific Computing - Templates for Iteration; Thread-level Parallelism 1h 31/3-004 - IT Amphitheatre

      Large-scale scientific computing raises questions on different levels, ranging from the formulation of the problems to the choice of the best algorithms and their implementation for a specific platform. There are similarities between these topics that can be exploited by modern-style C++ template metaprogramming techniques to produce readable, maintainable and generic code. Traditional low-level code tends to be fast but platform-dependent, and it obfuscates the meaning of the algorithm. An object-oriented approach, on the other hand, is nice to read but may come with an inherent performance penalty.

      These lectures aim to present the basics of the Expression Template (ET) idiom, which allows us to keep the object-oriented approach without sacrificing performance. In particular, we will show how to enhance ET to include SIMD vectorization. We will then introduce techniques for abstracting iteration, and introduce thread-level parallelism for use in heavy data-centric loads. We will show how to apply these methods in a way that keeps the "front end" code very readable.

      LECTURE 3: In this lecture, we will look into a specific technique to parallelize a large data-centric workload iterating over a multi-dimensional array. We will show how to separate iteration and computation, and how the "front-end" algorithm can then be made independent of the dimensionality, coordinate system, or order of numerical approximation. We will show how this separation further helps to implement thread-level parallelism in the "back end" and explore some common cases of data dependency. We will finally take a look at an example code combining the ideas of all three lectures.
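The separation of iteration from computation can be sketched in a few lines. The example below is a Python illustration under stated assumptions, not the C++ template technique of the lecture: the grid is stored as a dictionary keyed by index tuples, and the names `neighbour_sum`, `apply_kernel` and `make_grid` are hypothetical. The kernel never mentions the number of dimensions; the driver owns the loop and the threads.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def neighbour_sum(grid, index):
    """Front-end kernel: sum the nearest neighbours of one cell along every
    axis. It works unchanged for any number of dimensions."""
    total = 0.0
    for axis in range(len(index)):
        for step in (-1, 1):
            shifted = list(index)
            shifted[axis] += step
            total += grid.get(tuple(shifted), 0.0)  # cells outside the grid count as 0
    return total

def apply_kernel(grid, shape, kernel, workers=4):
    """Back end: owns the iteration and the threading. Every output cell depends
    only on the read-only input grid, so cells can be processed in any order."""
    indices = list(product(*(range(n) for n in shape)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        values = list(pool.map(lambda idx: kernel(grid, idx), indices))
    return dict(zip(indices, values))

def make_grid(shape, value=1.0):
    """Build a grid of the given shape with every cell set to `value`."""
    return {idx: value for idx in product(*(range(n) for n in shape))}

# The same kernel runs on a 1-D line and on a 2-D plane.
line = apply_kernel(make_grid((4,)), (4,), neighbour_sum)
plane = apply_kernel(make_grid((3, 3)), (3, 3), neighbour_sum)
```

Because the kernel writes to a separate output, there is no data dependency between cells; an in-place update would introduce one and would need a different schedule. In CPython, threads running pure Python code do not execute in parallel, so this sketch shows the structure of the approach rather than a speed-up.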
      Speaker: Jiří Vyskočil (Czech Technical University in Prague)
    • 12:30 14:00
      Lunch 1h 30m 31/3-004 - IT Amphitheatre

    • 14:00 15:00
      Shared memory and message passing revisited in the many-core era 1h 31/3-004 - IT Amphitheatre

      In the 70s, Edsger Dijkstra, Per Brinch Hansen and C.A.R. Hoare introduced the fundamental concepts of concurrent computing. It was clear that concrete communication mechanisms were required in order to achieve effective concurrency. Whether you are developing a multithreaded program running on a single node or a distributed system spanning hundreds of thousands of cores, the communication mechanism for your system must be chosen carefully because of the inherent programmability, performance and scalability trade-offs. With the emergence of many-core computing architectures, many earlier assumptions may no longer hold. In this talk we will try to provide insight into the characteristics of these communication models by giving some basic theoretical background and then focusing on concrete practical examples based on indicative use-case scenarios. The case studies in this presentation cover popular programming models, operating systems and concurrency frameworks in the context of many-core processors.
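The two communication models can be contrasted on a toy task, summing numbers split across workers. This is a minimal sketch using Python's standard `threading` and `queue` modules; the function names and the chunking are assumptions made for illustration, not material from the talk.

```python
import queue
import threading

def sum_shared_memory(chunks):
    """Workers update one shared variable; a lock serialises the updates."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)
        with lock:  # protects the read-modify-write on the shared total
            total += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total

def sum_message_passing(chunks):
    """Workers touch no shared state; each sends its partial result as a message."""
    mailbox = queue.Queue()

    def worker(chunk):
        mailbox.put(sum(chunk))

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    # The receiver blocks until one message per worker has arrived.
    total = sum(mailbox.get() for _ in chunks)
    for t in threads:
        t.join()
    return total

chunks = [range(0, 250), range(250, 500), range(500, 750), range(750, 1000)]
shared_total = sum_shared_memory(chunks)
message_total = sum_message_passing(chunks)
```

Both functions return the same value. The shared-memory version needs explicit synchronisation around the shared variable, while the message-passing version moves all coordination into the queue, which is the kind of programmability trade-off the abstract refers to.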
      Speaker: Aram Santogidis (CERN)
    • 15:00 16:00
      Volatile Environments with Virtualisation Technologies 1h 31/3-004 - IT Amphitheatre

      Sometimes our job, or simply our interest in learning something new, requires us to install a lot of different software just to get a specific program running on our operating system. In the best case, conflicts between different library or language versions might merely prevent the program from running; in the worst case, the operating system gradually fills up with junk and becomes slow and insecure. In this course we will explore two relatively new but very well established virtualisation technologies, Vagrant and Docker, and how these tools can help us keep a tidy, exportable and transferable development environment on our home or work computer.
      Speaker: Anastasios Andronidis (University of Ioannina (GR))
    • 16:00 16:15
      Closing remarks 15m 31/3-004 - IT Amphitheatre

      Speaker: Alberto Pace (CERN)