CERN openlab Summer Student Lightning Talks (2/2)

Europe/Zurich
31/3-004 - IT Amphitheatre (CERN)

Description

On Tuesday 15th and Wednesday 16th of August, the CERN openlab 2023 summer students will present their work at two dedicated public Lightning Talk sessions.

In a 5-minute presentation, each student will introduce the audience to their project, explain the technical challenges they have faced and describe the results of what they have been working on for the past two months.

It will be a great opportunity for the students to showcase the progress they have made so far and for the audience to be informed about various information-technology projects, the solutions that the students have come up with and the potential future challenges they have identified.

Please note:

  • Only the students need to register for the event.
  • There are 15 places available on the Tuesday and 15 places on the Wednesday.
  • The event will be accessible via webcast for an external audience (please invite your university professors and other students).
Webcast
There is a live webcast for this event
    • 15:00 - 15:05
      Welcome 5m
      Speaker: Miguel Marquina (CERN)
    • 15:05 - 15:12
      Accelerator Control using Gaussian Process Model Predictive Control Based Reinforcement Learning 7m

      Abstract: This project aims to develop a model-based Reinforcement Learning (RL) algorithm, based on the GP-MPC (Gaussian Process Model Predictive Control) approach, for controlling accelerator systems with machine learning. By exploiting the ability of Gaussian Processes to learn from limited data and to quantify uncertainty, the algorithm addresses limitations related to beam instrumentation and the large number of training iterations otherwise required. The final outcome will be the integration of the modified algorithm into CERN's Generic Optimisation Framework, facilitating easy deployment in operational scenarios and enabling intelligent accelerator control.
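
      As a rough illustration of the GP-MPC idea (not the project's actual implementation, which targets CERN's Generic Optimisation Framework), the sketch below fits a Gaussian Process to a toy one-dimensional system and uses it inside a simple sampling-based MPC loop; the dynamics, cost and all parameters are illustrative:

        # Toy sketch of GP-based model predictive control on a 1-D system.
        # The controller only sees (state, action) -> next-state transitions.
        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF

        rng = np.random.default_rng(0)

        def true_dynamics(x, u):
            return 0.9 * x + 0.5 * u + 0.01 * rng.normal()

        # Collect a small random dataset (GPs cope well with little data).
        X_train, y_train, x = [], [], 1.0
        for _ in range(30):
            u = rng.uniform(-1, 1)
            x_next = true_dynamics(x, u)
            X_train.append([x, u])
            y_train.append(x_next)
            x = x_next

        gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-4)
        gp.fit(np.array(X_train), np.array(y_train))

        def mpc_action(x0, horizon=5, n_candidates=200):
            """Return the first action of the sampled sequence with lowest predicted cost."""
            best_u, best_cost = 0.0, np.inf
            for _ in range(n_candidates):
                u_seq = rng.uniform(-1, 1, size=horizon)
                x_pred, cost = x0, 0.0
                for u in u_seq:
                    x_pred = gp.predict(np.array([[x_pred, u]]))[0]
                    cost += x_pred ** 2 + 0.1 * u ** 2  # drive state to zero, penalise effort
                if cost < best_cost:
                    best_u, best_cost = u_seq[0], cost
            return best_u

        print("suggested action from state x=1.0:", mpc_action(1.0))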

      Speaker: Su Aye
    • 15:12 - 15:19
      Counterdiabatic drive and quantum optimal control for spin annealing schedules 7m

      Adiabatic protocols are a critical subroutine in many quantum technologies. Accelerating these processes is challenging, not only because of suboptimal hardware implementations but also for fundamental reasons. Indeed, a class of adiabatic theorems bounds the maximum fidelity that a particular protocol can achieve. Furthermore, phase transitions and narrow-gap systems are inherently difficult to address.

      In this work, we integrate counterdiabatic driving with quantum optimal control techniques (COLD), as proposed in PRX Quantum 4, 010312. Using symbolic calculus, we extend the formulation of the adiabatic gauge potential ansatz to arbitrary spin Hamiltonians. Finally, we put COLD to the test on frustrated systems and 2D topologies.
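
      For context, the central object of the method is the counterdiabatic Hamiltonian built from the adiabatic gauge potential; schematically, in units with \hbar = 1 (this is the standard formulation, not a result specific to this project):

        H_{\mathrm{CD}}(t) = H\bigl(\lambda(t)\bigr) + \dot{\lambda}(t)\,\mathcal{A}_\lambda,
        \qquad
        \bigl[\partial_\lambda H + i\,[\mathcal{A}_\lambda, H],\; H\bigr] = 0,

      where \lambda(t) is the annealing schedule and \mathcal{A}_\lambda is the adiabatic gauge potential that the variational ansatz approximates.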

      Speaker: Francesco Pio Barone
    • 15:19 - 15:26
      Shelter tents mapping by machine learning and satellite imagery 7m

      In order to provide urgent, life-saving resources to forcibly displaced people, a rapid estimate of their number is essential. Rather than measuring this number directly, a popular approach is to estimate the number of shelter tents from satellite imagery. Traditionally, however, this has required manual labeling, which is labor- and time-consuming. In this project, we propose a solution: using machine learning techniques and satellite imagery to estimate the number of shelter tents automatically.
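
      As an illustration of the counting step only, the sketch below assumes a torchvision detection model; the pretrained COCO weights and the score threshold are placeholders for a model actually fine-tuned on tent annotations, and the image path is hypothetical:

        # Illustrative sketch: count tent-like detections in a satellite image tile.
        import torch
        import torchvision
        from torchvision.transforms.functional import to_tensor
        from PIL import Image

        # Stand-in for a detector fine-tuned on tent annotations.
        model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
        model.eval()

        def count_tents(image_path, score_threshold=0.5):
            image = to_tensor(Image.open(image_path).convert("RGB"))
            with torch.no_grad():
                predictions = model([image])[0]
            # Keep only confident detections; in a real pipeline the class index
            # would correspond to the "tent" label of the fine-tuned model.
            keep = predictions["scores"] > score_threshold
            return int(keep.sum())

        # Example usage (path is hypothetical):
        # print(count_tents("camp_tile_001.png"))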

      Speaker: Hsiu-Chi Cheng
    • 15:26 - 15:33
      Load Testing and Benchmarking EOS Open Storage 7m

      This project focuses on creating a modular load-testing and benchmarking tool for file systems, specifically the EOS storage system. EOS stores data from CERN experiments as well as personal user data. It relies on protocols such as FUSE and XRootd, which this project aims to test and analyze for errors or weaknesses. The goal was to create an automated workflow generator and load-testing tool capable of assessing performance, testing the infrastructure and comparing metrics after changes to the software or hardware configuration.
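
      One simple load pattern such a tool can generate is a burst of parallel file writes; the sketch below is a minimal stand-in, assuming a FUSE-mounted target path (the directory, file size and worker count are placeholders, not the project's actual workflows):

        # Illustrative load test: write many files in parallel and report throughput.
        import os
        import time
        from concurrent.futures import ThreadPoolExecutor

        TARGET_DIR = "/tmp/eos_mount_placeholder"   # an /eos/... FUSE mount in practice
        FILE_SIZE = 4 * 1024 * 1024                 # 4 MiB per file
        N_FILES = 32
        N_WORKERS = 8

        def write_one(index):
            path = os.path.join(TARGET_DIR, f"loadtest_{index}.bin")
            with open(path, "wb") as f:
                f.write(os.urandom(FILE_SIZE))
                f.flush()
                os.fsync(f.fileno())
            return FILE_SIZE

        if __name__ == "__main__":
            os.makedirs(TARGET_DIR, exist_ok=True)
            start = time.monotonic()
            with ThreadPoolExecutor(max_workers=N_WORKERS) as pool:
                written = sum(pool.map(write_one, range(N_FILES)))
            elapsed = time.monotonic() - start
            print(f"wrote {written / 1e6:.1f} MB in {elapsed:.2f} s "
                  f"({written / 1e6 / elapsed:.1f} MB/s)")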

      Speaker: Andrej Cop
    • 15:33 - 15:40
      Saving Database Copies in the OCI Immutable Storage 7m

      Databases managed by IT-DA are crucial for CERN's operations. Although many mechanisms for preventing data loss are already in place, our data might still be vulnerable to ransomware attacks. This project proposes saving database copies outside of a potential attacker's reach, using a cloud feature called immutable buckets.
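
      A minimal sketch of the upload step, assuming the database dump already exists locally, a bucket configured with a retention (immutability) rule, and an S3-compatible endpoint; the endpoint, bucket name, credentials and file path below are all placeholders:

        # Illustrative upload of a database backup to an object-storage bucket whose
        # retention rule makes objects immutable for a fixed period.
        import datetime
        import boto3

        s3 = boto3.client(
            "s3",
            endpoint_url="https://objectstorage.example-region.oraclecloud.com",  # placeholder
            aws_access_key_id="REPLACE_ME",
            aws_secret_access_key="REPLACE_ME",
        )

        def upload_backup(dump_path, bucket="db-backups-immutable"):
            # Timestamped key so successive copies never overwrite each other.
            stamp = datetime.datetime.utcnow().strftime("%Y%m%dT%H%M%SZ")
            key = f"backups/{stamp}/{dump_path.rsplit('/', 1)[-1]}"
            s3.upload_file(dump_path, bucket, key)
            return key

        # Example usage (file is hypothetical):
        # print(upload_backup("/backups/mydb_full.dmp"))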

      Speaker: Mr Andrei-Dorian Duma
    • 15:40 - 15:47
      Automating dependencies' updates for the Drupal Distribution repository 7m

      Keeping a project's dependencies up to date has a direct impact on efficiency and security, hence the need to automate this recurring process.
      The Renovate tool helps solve this by automatically creating Merge Requests with the most recent updates (minor and major) for every dependency.

      Speaker: Monica Jaqueline Iniguez Moncada
    • 15:47 - 15:54
      Exploring Cybersecurity Frontiers: Challenges regarding 2FA, Incident Response, and Web Scanning 7m

      In my CERN openlab 2023 summer tenure, I undertook three cybersecurity projects. Firstly, I addressed the challenge of integrating two-factor authentication (2FA) standards—FIDO2 and OTP—across CERN systems. Despite intensive efforts, the dissonance between these protocols posed insurmountable obstacles to unification. Secondly, I engaged in translating and dissecting IRC chat logs and Telegram conversations of Romanian hacker collectives implicated in the MICI-BICA incident. My role involved decoding strategies, exposing potential threat vectors, and uncovering their tactics to safeguard CERN and affiliated institutions. Lastly, I am currently developing a Python tool for technology detection of 17,000+ CERN websites. This task entails migrating the tool to Python 3, automating core functions, and integrating with the new Single Sign-On. The future plan involves optimizing the tool's capabilities using Go and implementing a versatile vulnerability scanner named Nuclei, leveraging a YAML-based DSL.
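
      As a rough illustration of the technology-detection part only (the actual tool, its signature database and the Single Sign-On integration are not shown), a header/markup fingerprint check might look like the sketch below, with a toy signature list:

        # Illustrative technology fingerprinting from response headers and HTML.
        import requests

        SIGNATURES = {
            "Drupal": lambda r: "drupal" in r.headers.get("X-Generator", "").lower()
                                or "drupal" in r.text.lower(),
            "nginx": lambda r: "nginx" in r.headers.get("Server", "").lower(),
            "Apache": lambda r: "apache" in r.headers.get("Server", "").lower(),
        }

        def detect_technologies(url):
            response = requests.get(url, timeout=10)
            return [name for name, matches in SIGNATURES.items() if matches(response)]

        # Example usage (URL is a placeholder):
        # print(detect_technologies("https://example.cern.ch"))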

      Speaker: Mihai Licu
    • 15:54 - 16:01
      ML for Fast Simulation 7m

      High-energy physics (HEP) experiments rely on Monte Carlo (MC) simulation to test hypotheses and understand data distributions. To meet the demand for fast, large-scale simulated samples, novel Fast Simulation techniques have emerged, including neural-network models. This project explores the application of a large-scale transformer-based model, leveraging recent advances in foundation models (e.g., GPT-3, DALL-E 2), to fast simulation in HEP. My contribution involves preprocessing and loss-function design to enhance the efficiency and accuracy of the proposed transformer-based model, and comparing its results with existing methodologies.

      Speaker: Zeeshan Memon
    • 16:01 - 16:08
      Evaluating the integration of distributed tracing signals into the CERN Monitoring Infrastructure 7m

      In order to monitor a system, we need telemetry – data about the system's behavior, emitted from the system itself. Telemetry data comes in three forms – logs, metrics and traces. Logs and metrics are already used by the current CERN Monitoring infrastructure. Logs provide information on individual events and include some local context, giving debugging/diagnostic information by describing the immediate surroundings of the event. Metrics, on the other hand, give system/service-level information through the aggregation of measurements across time. Using only logs and metrics leaves a discontinuity of information – we can observe individual events and we can observe the system state across time, but reasoning about the causality between the two is left to the developer/engineer. Traces aspire to bridge this gap: they link each individual event to the tree of invocation dependencies, from the user request that started the chain to all of the side effects it had in the system. The goal of this project is to create a proof-of-concept deployment of a distributed tracing infrastructure, evaluate how it could be integrated with the existing monitoring infrastructure, and assess what these additions would unlock in terms of the detail and quality of information the monitoring team can provide its clients about the behavior of their systems.
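
      As an illustration of what an instrumented service emits, the sketch below uses the OpenTelemetry Python SDK with a console exporter standing in for the tracing backend; the abstract does not prescribe a specific library, so treat this purely as an example:

        # Illustrative span emission with OpenTelemetry; the console exporter stands
        # in for the collector/backend of the monitoring infrastructure.
        from opentelemetry import trace
        from opentelemetry.sdk.trace import TracerProvider
        from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

        provider = TracerProvider()
        provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
        trace.set_tracer_provider(provider)

        tracer = trace.get_tracer("demo-service")

        def handle_request(user_id):
            # The parent span covers the whole request; child spans record the
            # operations it triggers, preserving causality between them.
            with tracer.start_as_current_span("handle_request") as span:
                span.set_attribute("user.id", user_id)
                with tracer.start_as_current_span("query_database"):
                    pass  # database call would go here
                with tracer.start_as_current_span("publish_result"):
                    pass  # downstream side effect would go here

        handle_request("12345")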

      Speaker: Maša Nešić
    • 16:08 - 16:15
      FPGA-Accelerated Neural Network Inference for Ultra-Low-Latency Recalibration and Classification of Physics Objects at 40 MHz within CMS 7m

      In the realm of data processing and physics analysis at the Large Hadron Collider (LHC), deep learning-based algorithms hold a notable advantage over traditional physics-based counterparts. This study explores cutting-edge methodologies for low-latency neural network inference on Field Programmable Gate Array (FPGA) devices. Specifically, we concentrate on the recalibration and classification of physics objects at a demanding rate of 40 MHz within the CMS framework. The primary objective of this work is to develop an ultra-low-latency neural network model, strategically combining techniques such as quantization-aware training, knowledge distillation, transfer learning and pruning schedules to achieve exceptionally low latency without compromising reconstruction performance.
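
      As an illustration of one ingredient from the list above, a minimal knowledge-distillation loss in PyTorch is sketched below; the temperature and weighting are arbitrary, and quantization-aware training, pruning and the FPGA deployment itself are not covered here:

        # Illustrative knowledge-distillation loss: the student is trained to match
        # both the hard labels and the teacher's softened output distribution.
        import torch
        import torch.nn.functional as F

        def distillation_loss(student_logits, teacher_logits, labels,
                              temperature=4.0, alpha=0.5):
            # Hard-label term: ordinary cross-entropy against the true labels.
            hard = F.cross_entropy(student_logits, labels)
            # Soft-label term: KL divergence between softened distributions.
            soft = F.kl_div(
                F.log_softmax(student_logits / temperature, dim=-1),
                F.softmax(teacher_logits / temperature, dim=-1),
                reduction="batchmean",
            ) * temperature ** 2
            return alpha * hard + (1.0 - alpha) * soft

        # Toy check with random tensors.
        student = torch.randn(8, 5)
        teacher = torch.randn(8, 5)
        labels = torch.randint(0, 5, (8,))
        print(distillation_loss(student, teacher, labels))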

      Speaker: Diptarko Choudhury (National Institute of Science Education and Research)
    • 16:15 - 16:22
      Inference of ML models on Intel GPU with SYCL and Intel OneAPI using SOFIE 7m

      TMVA provides a fast inference system that takes an ONNX model as input and produces compilation-ready, standalone C++ code as output, which can be compiled and executed on CPU architectures. The idea of this project is to extend this capability by generating, from the TMVA SOFIE model representation, code that can also run on Intel GPUs using both SYCL and the Intel oneAPI libraries. This will allow for a more efficient evaluation of these models on Intel accelerator hardware.
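
      For reference, the existing CPU-oriented workflow looks roughly like the following sketch, based on the standard SOFIE tutorials (the model file is a placeholder, and the SYCL/oneAPI code generation added by this project is not shown):

        # Sketch of the existing SOFIE workflow: parse an ONNX model and emit
        # standalone C++ inference code (CPU path).
        import ROOT

        parser = ROOT.TMVA.Experimental.SOFIE.RModelParser_ONNX()
        model = parser.Parse("model.onnx")      # placeholder ONNX file
        model.Generate()                        # build the C++ inference code in memory
        model.OutputGenerated("model.hxx")      # write the standalone header to disk
        # The generated header can then be compiled and called from C++ to run
        # inference without the original training framework.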

      Speaker: Ioanna Maria Panagou
    • 16:22 - 16:29
      Quantum-Powered Time Series Forecasting in Finance: Replication, Reliability, and Architectural Exploration 7m

      Machine learning has enabled computers to learn from data and improve their performance on tasks, revolutionizing various industries by automating processes, uncovering insights, and enhancing decision-making.

      NISQ (Noisy Intermediate-Scale Quantum) refers to the current stage of quantum computing, characterized by quantum devices with a moderate number of qubits and significant noise. In the NISQ era, parametrized quantum circuits (PQC) are employed in machine learning as flexible quantum algorithms that can be executed on current noisy quantum devices. By introducing tunable parameters, these circuits can adapt to the limitations of NISQ devices, enabling quantum-enhanced solutions to classification and generative tasks while paving the way for practical applications of quantum machine learning.

      Time series arise from the abundance of sequential data in various fields, and time series analysis uncovers patterns, trends, and dependencies within the temporal data, enabling informed decision-making and predictions for a wide range of applications.

      In this project, we aim to replicate the findings of a paper that utilized parametrized quantum circuits (PQCs) to forecast time series data in the context of finance. By faithfully reproducing the original experiments, we seek to establish the reliability and applicability of PQCs for financial time series predictions.

      Building upon these results, we further investigate the impact of hyperparameters and explore novel circuit architectures to optimize the performance of quantum machine learning in time series forecasting.
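
      As an illustration of the kind of parametrized quantum circuit involved, the sketch below angle-encodes a short window of past values and reads out an expectation value with PennyLane on a simulator; the layout and data encoding are illustrative, not the architecture of the replicated paper:

        # Illustrative PQC: encode a window of the time series, apply trainable
        # rotation/entangling layers, and use an expectation value as the forecast.
        import pennylane as qml
        from pennylane import numpy as np

        n_qubits = 3          # window length = number of qubits (illustrative)
        n_layers = 2
        dev = qml.device("default.qubit", wires=n_qubits)

        @qml.qnode(dev)
        def forecast_circuit(window, weights):
            # Data encoding: one past observation per qubit.
            for i in range(n_qubits):
                qml.RY(np.pi * window[i], wires=i)
            # Trainable layers: single-qubit rotations followed by a ring of CNOTs.
            for layer in range(n_layers):
                for i in range(n_qubits):
                    qml.Rot(*weights[layer, i], wires=i)
                for i in range(n_qubits):
                    qml.CNOT(wires=[i, (i + 1) % n_qubits])
            return qml.expval(qml.PauliZ(0))

        weights = np.random.uniform(0, 2 * np.pi, size=(n_layers, n_qubits, 3))
        window = np.array([0.1, 0.4, 0.3])   # toy normalised window of past values
        print("predicted next value (rescaled):", forecast_circuit(window, weights))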

      Speaker: Andrew Charles Spiro
    • 16:29 - 16:36
      "Allen" oneAPI code on FPGA 7m
      Speaker: Eleni Xochelli
    • 16:36 - 16:43
      Power efficiency of HEP applications on CPU & GPU 7m

      The LHC generates an immense volume of data by colliding protons or heavy ions at extremely high energies, resulting in a multitude of particle interactions. These interactions are crucial for experiments conducted at the LHC, such as ATLAS, CMS, ALICE, and LHCb, and require recording, processing, and analysis. To address this data challenge, the WLCG collaborates globally, offering computing resources and services to support these experiments. To understand which architecture is most suitable for certain kinds of jobs/workloads, we need to benchmark the workloads for the various experiments at CERN. My assignment involved benchmarking these workloads, especially MadGraph, on both CPU and GPU. The goal was to study the relationship between energy consumption and performance on different architectures.
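
      As an illustration of the CPU-side measurement, the sketch below reads the Intel RAPL package counter around a workload on a Linux host; the command is a placeholder for a real MadGraph job, and the GPU side is not covered:

        # Illustrative energy measurement on CPU: read the RAPL package counter
        # before and after running a workload, then report average power.
        # Requires Linux with /sys/class/powercap/intel-rapl exposed (and read access).
        import subprocess
        import time

        RAPL_FILE = "/sys/class/powercap/intel-rapl:0/energy_uj"

        def read_energy_uj():
            with open(RAPL_FILE) as f:
                return int(f.read().strip())

        def measure(command):
            e0, t0 = read_energy_uj(), time.monotonic()
            subprocess.run(command, check=True)
            e1, t1 = read_energy_uj(), time.monotonic()
            # The counter wraps around; a production tool would handle the overflow.
            joules = (e1 - e0) / 1e6
            seconds = t1 - t0
            return joules, seconds, joules / seconds

        # Example usage with a placeholder workload instead of a real MadGraph job:
        joules, seconds, watts = measure(["sleep", "2"])
        print(f"{joules:.1f} J over {seconds:.1f} s (avg {watts:.1f} W)")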

      Speaker: Keshvi Tuteja
    • 16:40 - 17:00
      Closing remarks 20m
      Speaker: Miguel Marquina (CERN)