CERN School of Computing on IT Services 2024

Europe/Zurich
11 Av. des Sablonnières, 01210 Ferney-Voltaire, France
Alberto Pace (CERN), Andrzej Nowicki (CERN), Kristina Gunne (CERN)
Description

The 1st CERN School of Computing on IT Services will take place on November 4-8, 2024 in Ferney-Voltaire, France. The school will be hosted at the Appart City hotel.

The school on IT Services aims to empower CERN members of personnel (including Students, Fellows, Origin, Quests, Staff, and Users) to get the most out of the computing services delivered by the CERN IT Department to the physics community.

The school is recommended for anyone who uses the CERN IT services to deliver information, analyze data, automate tasks or work on engineering projects. We expect participation both from active users who would like to become more proficient and from newcomers to CERN who would like to discover the ecosystem of computing services available to all CERN users. The school is not open to persons without a CERN status.

The school will focus on solutions for everyday problems and it will address several use cases with hands-on sessions and demonstrations such as:

  • Software development and hosting, including software maintenance strategies
  • Reproducible data analysis and reusable containers for scientific workflows
  • Using the Data Center infrastructure for Machine Learning applications
  • Information and data management, including information publishing through the creation and maintenance of web sites

Several IT services are covered, including Git, Jira, MS Forms, Docker, Continuous Integration, CERN SSO (Single Sign-On), tokens, Authorization API, e-groups / roles, Database on Demand Service, SWAN, REANA, WordPress, WebEOS, SharePoint, EOS, CERNBox, Ceph, …

In addition, the school aims to strengthen the participants' personal networking with members of the IT department and to encourage an exchange of ideas on the future evolution of its services, as the lecturers are top technology experts or service managers from the IT department.

This is the first edition of this new school, which is why the number of participants will be limited to approximately 30 students. The programme offers 26 hours of lectures and hands-on exercises, as well as student presentation sessions. The school is non-residential; lunches and one dinner will be provided and are included in the fee.

 

Important dates 2024

  • August 14 - application opens
  • September 18 - application closes
  • October 2 - invitations sent to selected students
  • October 16 - participation fee deadline

 

    • 08:30
      Welcome coffee
    • 1
      Welcome address from the IT department head
      Speaker: Enrica Maria Porcari (CERN)
    • 2
    • 3
      Opening Lecture: The need for IT Service in accelerator and particle physics

      Starting from accelerator and particle physics, we will look at what the experiments and the accelerator community need in terms of IT services.

      Speaker: Sebastien Ponce (CERN)
    • 11:00
      Break
    • 4
      Student self presentation
    • 12:30
      Lunch
    • 5
      Storage (part 1 of 2)

      This two-part lecture series provides an overview of the various storage
      services at CERN. We will look into the motivation behind our large scale
      storage systems, cover some fundamentals and the design principles of the
      storage systems we've developed and use. We will also look into some practical
      use cases covering the ecosystem of the many storage systems we run in the IT
      department. This should serve as a basis for choosing the correct storage
      services for the applications you develop, along with practical
      considerations for using storage effectively.

      Speaker: Abhishek Lekshmanan (CERN)
    • 6
      Creation and maintenance of a website

      Overview of the Web Services Portal and how it can be used for website creation and management at CERN. Website hosting and management services (Drupal/WordPress, WebEOS, GitLab Pages, etc.) will be highlighted and their specific use cases outlined. The use of Matomo for web analytics will be demonstrated.

      Speaker: Vasvi Sharma
    • 15:30
      Break
    • 7
      Storage (part 2 of 2)
      Speaker: Abhishek Lekshmanan (CERN)
    • 8
      Database Services (part 1 of 4) - Introduction to DBoD
      • Short intro about databases
      • Presentation of the DBoD service
      • How to create a database in DBoD
      • How to connect to my DBoD
      Speaker: Andrzej Nowicki (CERN)
    • 9
      Modern Application Development & Deployment (Part 1 of 2)

      “I need to develop an application X for user community Y, which will need to be run and maintained over time”

      1. Application development
        a. This part of the session will explore the different types of applications and how to leverage the GitLab version control system and its CI for modern application deployment.
        It is organized as a workshop and includes hands-on experience covering best practices for developing containerized applications and strategies for deploying them.

      Participants will begin by exploring various application types and learning how to leverage GitLab's version control and CI pipelines for efficient deployment. Through hands-on exercises, attendees will develop a simple application, while learning key concepts such as:
      - Best practices for developing containerized applications.
      - Writing Dockerfiles for application deployment.
      - Setting up continuous integration (CI) workflows to automate testing and to build and publish Docker images.

      Speaker: Francisco Borges Aurindo Barros (CERN)
    • 10
      Modern Application Development & Deployment (part 2 of 2)

      “I need to develop an application X for user community Y, which will need to be run and maintained over time”

      1. Application development
        a. This part of the session will explore the different types of applications and how to leverage the GitLab version control system and its CI for modern application deployment.
        It is organized as a workshop and includes hands-on experience covering best practices for developing containerized applications and strategies for deploying them.
      Speaker: Francisco Borges Aurindo Barros (CERN)
    • 11:00
      Break
    • 11
      Project Management and documentation

      In this session we will explore the solutions for project management and effective software development.

      We'll start by demonstrating how to plan and track project progress using Jira, available at https://its.cern.ch.

      Next, we'll highlight how GitLab Pages can be utilized to store and share information, whether for internal use or a broader audience, showcasing instances of technical documentation delivered to end-users.

      Finally, we'll explore Confluence for documentation storage, providing a walkthrough of its features and real-world examples of its use.

      Speaker: Francisco Borges Aurindo Barros (CERN)
    • 12:30
      Lunch
    • 12
      Core compute services (part 1 of 4)

      An in-depth set of use cases in which IT services are heavily used for physics, analysis and engineering applications.

      In part 1 of the series we will show use cases for OpenStack, Linux and virtual-machine-based configuration management.

      Speaker: Giacomo Tenaglia (CERN)
    • 14:30
      Transport to UN
    • 15:30
      Social Activity - Visit to the United Nations in Geneva
    • 18:30
      Social dinner
    • 13
      Database Services (part 2 of 4) - DBoD maintenance exercises
      • Typical tasks to be performed as a DBoD owner
        Exercises on:
      • Cloning mechanism
      • Upgrades
      • TLS certificates
      Speaker: Andrzej Nowicki (CERN)
    • 14
      Core compute services (part 2 of 4)

      An in-depth set of use cases in which IT services are heavily used for physics, analysis and engineering applications.

      In part 2 of the series we will explore HTCondor, the high-throughput compute platform used for batch computing.

      Speaker: Ben Jones (CERN)
    • 11:00
      Break
    • 15
      Database Services (part 3 of 4) - DBoD maintenance exercises
      • Typical tasks to be performed as a DBoD owner
        Exercises on:
      • Cloning mechanism
      • Upgrades
      • TLS certificates
      Speaker: Andrzej Nowicki (CERN)
    • 12:30
      Lunch
    • 16
      Application security

      Short introduction to best practices for secure development, testing and deployment

      • Three golden rules for system security
      • Software security, typical vulnerability types
      • How security analysis tools can help
      • Introduction to penetration testing
      • Deployment security best practices
      Speaker: Sebastian Lopienski (CERN)
    • 17
      Data Analysis Techniques using SWAN and REANA (part 1 of 3)

      In this first session, we will give an overview of the SWAN service. This will include the following points:
      - Interface: classic and JupyterLab
      - Creation of projects, notebooks and terminals
      - Integration with CVMFS for software provisioning
      - Integration with EOS for storage and CERNBox for sharing
      - Use of GPUs
      - Connection to Spark clusters

      We will also give a live demo that participants can follow along to get familiar with the basic features of SWAN.

      Speakers: Diogo Castro (CERN), Enric Tejedor Saavedra (CERN), Pedro Miguel Esteves Maximino
    • 15:30
      Break
    • 18
      Deploying applications (part 1 of 2)

      In this lecture, we will explain the difference between IaaS, PaaS and SaaS.
      Then, we will learn how to deploy custom and off-the-shelf applications to OKD PaaS.

      Speaker: Alberto Pimpo
    • 19
      Deploying applications (part 2 of 2)

      Exercises on how to deploy custom and off-the-shelf applications to OKD PaaS.

      Speaker: Alberto Pimpo
    • 20
      Data Analysis Techniques using SWAN and REANA (part 2 of 3)

      In the second session of this series, we shall present the REANA reusable and reproducible analysis platform. REANA allows researchers to structure their data analyses by means of declarative workflow languages (CWL, Snakemake, Yadage) and run containerised data analysis pipelines on remote compute clouds (Kubernetes, HTCondor, Slurm).

      In the first part of this session, we shall discuss the notions of computational reproducibility and reusability, underlining the importance of encapsulating the original computing environments by means of containers and documenting the steps necessary to arrive at results. We shall provide a brief introduction to declarative workflow languages and discuss their pros and cons when compared to imperative analysis code programming.

      In the second part of this session, the participants will familiarise themselves with the REANA platform by means of running a simple analysis example. We shall use the https://reana.cern.ch instance at CERN to run a RooFit demo example.

      Speakers: Marco Donadoni (CERN), Tibor Simko (CERN)
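
The contrast between declarative and imperative analysis code mentioned above can be illustrated with a minimal sketch in plain Python. It does not use REANA or any of its workflow languages; the `WORKFLOW` dictionary, the step names and the `run_workflow` helper are hypothetical and serve only to show the idea that a declarative description lists steps and their dependencies, while an engine derives the execution order.

```python
from graphlib import TopologicalSorter

# Declarative description (hypothetical format): each step names the steps it
# needs; no execution order is written down anywhere.
WORKFLOW = {
    "total": {"needs": ["square"], "run": lambda results: sum(results["square"])},
    "square": {"needs": ["generate"], "run": lambda results: [x * x for x in results["generate"]]},
    "generate": {"needs": [], "run": lambda results: list(range(1, 6))},
}


def run_workflow(workflow):
    """Derive the execution order from the declared dependencies, then run each step."""
    graph = {name: set(step["needs"]) for name, step in workflow.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        results[name] = workflow[name]["run"](results)
    return results


def imperative_analysis():
    """The same computation written imperatively: the order is fixed by the code itself."""
    data = list(range(1, 6))
    squared = [x * x for x in data]
    return sum(squared)


if __name__ == "__main__":
    print(run_workflow(WORKFLOW)["total"], imperative_analysis())
```

Because the declarative form exposes the dependency graph, an engine is free to schedule, cache or distribute steps, at the cost of an extra layer between the analyst and the code that actually runs.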
    • 21
      Services for Machine Learning applications (part 1 of 3)

      This session will introduce the different phases of the ML lifecycle and how IT services can help in each of them. In particular, it will:

      • Give an overview of ML and its use cases, and of containerization and how it helps to define single units of computation, isolate custom software environments, and ensure sustainability for reproducible results
      • Demo how cloud native environments (Kubernetes and its ecosystem) can help manage those units of computation and scale them out to large amounts of resources
      • Provide an example of how to scale out using both on-premises and public cloud resources, and when this might be useful and cost effective
      Speaker: Ricardo Rocha (CERN)
    • 11:00
      Break
    • 22
      Core compute services (part 3 of 4)

      An in-depth set of use cases in which IT services are heavily used for physics, analysis and engineering applications.

      In part 3 of the series we will explore Slurm, the technology underlying the HPC platform at CERN.

      Speaker: Nils Høimyr (CERN)
    • 12:30
      Lunch
    • 23
      Services for Machine Learning applications (part 2 of 3)

      This session will focus on the infrastructure and low level tools required to efficiently deploy machine learning applications. In particular, it will cover:

      • The different data types and how they can impact ML workloads, as well as support in different types of hardware and software libraries
      • Key differences between CPUs and GPUs and how they impact ML workloads (training and serving)
      • The available techniques in IT services for GPU sharing and partitioning. In particular, it will cover how applications can build on the existing Kubernetes service to simplify these operations
      • Hands-on exercises on using GPUs for different types of workloads
      Speaker: Diana Gaponcic (IT-PW-PI)
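
One way in which data types affect ML workloads is memory footprint. The sketch below is a back-of-the-envelope estimate only: it counts model weights and ignores activations, gradients and optimizer state. The function names and the example model size are illustrative assumptions, not figures from the lecture.

```python
# Storage size in bytes of one element for some common numeric types.
BYTES_PER_ELEMENT = {"float64": 8, "float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def parameter_memory_bytes(num_parameters, dtype):
    """Memory needed to hold the model weights alone, in bytes."""
    return num_parameters * BYTES_PER_ELEMENT[dtype]


def fits_in_memory(num_parameters, dtype, device_memory_bytes):
    """True if the weights alone fit in the given amount of device memory."""
    return parameter_memory_bytes(num_parameters, dtype) <= device_memory_bytes


if __name__ == "__main__":
    device = 16 * 1024**3  # a hypothetical accelerator with 16 GiB of memory
    for dtype in ("float32", "float16", "int8"):
        size_gb = parameter_memory_bytes(7_000_000_000, dtype) / 1e9
        print(dtype, size_gb, fits_in_memory(7_000_000_000, dtype, device))
```

Halving the element size halves the weight footprint, which is often what decides whether a model fits on a single device or on a partition of a shared one.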
    • 24
      Services for Machine Learning applications (part 3 of 3)

      This session will focus on available ML techniques for distributed training of models, hyperparameter optimization and model serving. In particular, starting from a well-known use case, it will demonstrate:

      • How to go from a script, to a Docker image training on a single node, to a distributed training setup with multiple nodes
      • How to do hyperparameter optimization, which kinds of optimizers are available, how to monitor the workloads and how to publish the models
      • How to serve models in production, at scale, with a simple HTTP entrypoint or by embedding the model in an application
      Speaker: Raulian-Ionut Chiorescu
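
As a rough illustration of what hyperparameter optimization does, here is a minimal grid-search sketch in plain Python. The `validation_loss` function is a toy stand-in for a real training run, and the grid values are arbitrary assumptions; the tools shown in the session are not represented here.

```python
import itertools


def validation_loss(learning_rate, batch_size):
    """Toy stand-in for a training run: a smooth bowl with its minimum at (0.01, 64)."""
    return (learning_rate - 0.01) ** 2 * 1e4 + ((batch_size - 64) / 64) ** 2


def grid_search(objective, grid):
    """Evaluate every combination in the grid and return (best_params, best_loss)."""
    names = list(grid)
    best_params, best_loss = None, float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        loss = objective(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


# Hypothetical search space for illustration.
GRID = {"learning_rate": [0.001, 0.01, 0.1], "batch_size": [32, 64, 128]}

if __name__ == "__main__":
    print(grid_search(validation_loss, GRID))
```

Each grid point is independent of the others, which is why this kind of search parallelizes naturally across many nodes; smarter optimizers trade that independence for fewer evaluations.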
    • 15:30
      Break
    • 25
      Authentication and authorization

      (Part of "Software development and hosting" track)

      In this class, we will see how to:

      • Enable authentication with CERN SSO
      • Define role-based authorization for our applications
      • Get tokens, and use them to access APIs
      Speaker: Hannah Short (CERN)
    • 26
      Authentication and authorization (Exercises)

      (Part of "Software development and hosting" track)

      In this class, we will see how to:

      • Enable authentication with CERN SSO
      • Define role-based authorization for our applications
      • Get tokens, and use them to access APIs
      Speaker: Hannah Short (CERN)
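
The "get tokens, and use them to access APIs" step can be sketched with the standard OAuth 2.0 client-credentials pattern. This is a generic illustration, not the CERN SSO procedure: the URLs, client identifier and secret below are hypothetical placeholders, and the requests are only built, never sent.

```python
import base64
import urllib.parse
import urllib.request

# Hypothetical endpoints for illustration only; real values come from the
# documentation of the identity provider and of the API being called.
TOKEN_URL = "https://sso.example.org/token"
API_URL = "https://api.example.org/v1/items"


def build_token_request(token_url, client_id, client_secret):
    """Build (but do not send) an OAuth 2.0 client-credentials token request."""
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode()
    credentials = f"{client_id}:{client_secret}".encode()
    request = urllib.request.Request(token_url, data=body, method="POST")
    request.add_header("Authorization", "Basic " + base64.b64encode(credentials).decode())
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request


def build_api_request(api_url, access_token):
    """Build (but do not send) an API request carrying the token as a bearer credential."""
    request = urllib.request.Request(api_url, method="GET")
    request.add_header("Authorization", f"Bearer {access_token}")
    return request


if __name__ == "__main__":
    token_request = build_token_request(TOKEN_URL, "my-app", "not-a-real-secret")
    print(token_request.get_method(), token_request.full_url)
```

In a real application the first request would be sent to the token endpoint, the `access_token` field read from the JSON response, and that value passed to the second request.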
    • 27
      Lightning talks
      ID  Name                       Title of my talk
      1   Nayana Bangaru             Simulating the response of a silicon detector
      2   Gábor Bíró                 Computational Challenges in Image Reconstruction for Proton Computed Tomography
      3   Elena De la Fuente Garcia  A new Open-Source 3D Time-Domain Electromagnetic Solver for Beam-Coupling Impedance Calculation
      4   Jesse Geens                Solid: an open standard for structuring data, digital identities, and applications on the Web.
      5   Hannes Jakob Hansen        How to Manage Your ML Model Artifacts?
      6   João Ramiro                How we use Airflow
      7   Jonathan Samuel            Improving Education within Computer Science
      Speakers: Elena De La Fuente Garcia (Universidad Politecnica de Madrid (ES)), Gabor Biro (HUN-REN Wigner Research Centre for Physics (HU)), Hannes Jakob Hansen, Jesse Geens, Joao Ramiro, Jonathan Samuel (CERN - IT-CD-DPP), Nayana Bangaru (Universita di Napoli Federico II (IT))
    • 28
      Data Analysis Techniques using SWAN and REANA (part 3 of 3)

      In the third session of this series, we will propose short exercises using SWAN and REANA to cover more data analysis examples and use cases. The session will be split in two parts, one for each tool, where participants will be able to work on the exercises and get assistance from the lecturers.

      Speakers: Diogo Castro (CERN), Enric Tejedor Saavedra (CERN), Marco Donadoni (CERN), Pedro Miguel Esteves Maximino, Tibor Simko (CERN)
    • 11:00
      Break and school photo
    • 29
      Core compute services (part 4 of 4)

      An in-depth set of use cases in which IT services are heavily used for physics, analysis and engineering applications.

      Speaker: Giacomo Tenaglia (CERN)
    • 12:30
      Lunch
    • 30
      Database Services (part 4 of 4) - Oracle Database
      • Introduction of the Oracle Database service
      • Resource portal as a way to manage Oracle database users - needed e-groups
      • Other tooling provided by the team - Session Manager
      • How to connect to the Oracle database
      • Where to get the client?
      • What is the tnsnames.ora file?
      Speaker: Andrzej Nowicki (CERN)
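
For readers unfamiliar with it, a tnsnames.ora file maps a short alias (a net service name) to the connect descriptor of an Oracle database. The sketch below renders one such entry in the usual layout; the alias, host, port and service name are made-up placeholders, not values of any CERN database.

```python
def tnsnames_entry(alias, host, port, service_name):
    """Render one alias definition in the usual tnsnames.ora layout."""
    return (
        f"{alias} =\n"
        f"  (DESCRIPTION =\n"
        f"    (ADDRESS = (PROTOCOL = TCP)(HOST = {host})(PORT = {port}))\n"
        f"    (CONNECT_DATA = (SERVICE_NAME = {service_name}))\n"
        f"  )\n"
    )


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(tnsnames_entry("MYDB", "db.example.org", 1521, "mydb.example.org"))
```

A client that finds this file can then connect using the alias alone instead of spelling out host, port and service name.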
    • 14:30
      Self Assessment
    • 16:00
      Break
    • 31