Biomedical data poses multiple hard challenges that break conventional machine learning assumptions. In this talk, I will highlight the need to transcend our prevalent machine learning paradigm and methods to enable them to become the driving force of new scientific discoveries. I will present machine learning methods that have the ability to bridge heterogeneity of individual biological...
As detector technologies improve, the increase in resolution, channel count and overall size creates immense bandwidth challenges for the data acquisition system, long data-center compute times and growing data-storage costs. Much of the raw data does not contain useful information and can be significantly reduced with veto and compression systems as well as online analysis.
We design...
Particle flow reconstruction is crucial to analyses performed at general-purpose detectors such as ATLAS and CMS. Recent developments have shown that machine-learned particle-flow reconstruction using graph neural networks offers a prospect for computationally efficient event reconstruction [1-2]. Focusing on the scalability of machine-learning-based models for full event reconstruction, we...
The High-Luminosity LHC (HL-LHC) will provide an order of magnitude increase in integrated luminosity and enhance the discovery reach for new phenomena. The increased pile-up foreseen during the HL-LHC necessitates major upgrades to the ATLAS detector and trigger. The Phase-II trigger will consist of two levels, a hardware-based Level-0 trigger and an Event Filter (EF) with tracking...
The combinatorics of track seeding has long been a computational bottleneck for triggering and offline computing in High Energy Physics (HEP), and remains so for the HL-LHC. Next-generation pixel sensors will be sufficiently fine-grained to determine the angular information of a charged particle passing through. This detector technology immediately improves the...
Computing demands for large scientific experiments, such as the CMS experiment at CERN, will increase dramatically in the next decades. To complement the future performance increases of software running on CPUs, explorations of coprocessor usage in data processing hold great potential and interest. We explore the novel approach of Services for Optimized Network Inference on Coprocessors...
Due to the stochastic nature of hadronic interactions, particle showers from hadrons can vary greatly in their size and shape. Recovering all energy deposits from a hadronic shower within a calorimeter into a single cluster can be challenging and requires an algorithm that accommodates the large variation present in such showers. In this study, we demonstrate the potential of a deep learning...
In 2026 the Phase-II Upgrade will enhance the LHC to become the High-Luminosity LHC, with a luminosity of up to 7 times the nominal LHC luminosity. This leads to an increase in interesting events which might open the door to detecting new physics. However, it also leads to a major increase in proton-proton collisions producing mostly low-energy hadronic particles, called pile-up. Up to 200...
We introduce the fwXmachina framework for evaluating boosted decision trees on FPGA for implementation in real-time systems. The software and electrical engineering designs are introduced, with both physics and firmware performance detailed. The test bench setup is described. We present an example problem in which fwXmachina may be used to improve the identification of vector boson fusion...
We present the preparation, deployment, and testing of an autoencoder trained for unbiased detection of new-physics signatures in the CMS experiment Global Trigger test crate FPGAs during LHC Run 3. The Global Trigger makes the final decision whether to read out or discard the data from each LHC collision; collisions occur at a rate of 40 MHz, and the decision must be made within a 50 ns latency. The neural network makes a...
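The anomaly score such an autoencoder produces is the reconstruction error of each event. The sketch below illustrates the idea with a toy dense autoencoder in numpy; the weights are random stand-ins for a trained model, and in deployment the same arithmetic would be carried out in FPGA logic within the latency budget.

```python
# Illustrative sketch only: a toy dense autoencoder anomaly score, not the
# actual CMS Global Trigger network. Weights are random stand-ins for a
# trained model.
import numpy as np

rng = np.random.default_rng(0)

# Toy "trained" weights: 8 trigger-level inputs -> 3-d latent -> 8 outputs.
W_enc = rng.normal(size=(8, 3))
W_dec = rng.normal(size=(3, 8))

def anomaly_score(x):
    """Mean squared reconstruction error of a single event vector."""
    latent = np.maximum(x @ W_enc, 0.0)   # ReLU encoder
    recon = latent @ W_dec                # linear decoder
    return float(np.mean((x - recon) ** 2))

# Events with larger reconstruction error are flagged as anomalous.
event = rng.normal(size=8)
score = anomaly_score(event)
threshold = 1.0  # tuned offline to meet the trigger-rate budget
accept = score > threshold
```

Because the model is trained only on (mostly Standard Model) collision data, events it reconstructs poorly are, by construction, unlike the bulk of the training sample, which is what makes the selection signature-agnostic.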
We describe an application of the deep decision trees of fwXmachina, described in parts 1 and 2 at this conference, to anomaly detection on FPGAs for implementation in real-time systems. A novel method to train the decision-tree-based autoencoder is presented. We give an example in which fwXmachina may be used to detect a variety of different BSM models via anomaly detection at the...
In the coming years the ATLAS experiment will undertake major upgrades to cope with the increase of luminosity expected from Phase II of the LHC accelerator. In particular, in the barrel of the muon spectrometer a new triplet of RPC detectors will be added, and the trigger logic will be performed on FPGAs. We have implemented a new CNN architecture that is able to identify the muon...
This work describes the investigation of neuromorphic-computing-based spiking neural network (SNN) models used to filter data from sensor electronics in the CMS experiment at the High-Luminosity Large Hadron Collider (HL-LHC). We present our approach for developing a compact neuromorphic model that filters the sensor data based on the particle's transverse momentum...
The processing of large volumes of high precision data generated by sophisticated detectors in high-rate collisions poses a significant challenge for major high-energy nuclear and particle experiments. To address this challenge and revolutionize real-time data processing pipelines, modern deep neural network techniques and AI-centric hardware innovations are being developed.
The sPHENIX...
The Large Hadron Collider will be upgraded to the High Luminosity LHC, delivering many more simultaneous proton-proton collisions, extending the sensitivity to rare processes. The CMS detector will be upgraded with new, highly granular, detectors in order to maintain performance in the busy environment with many overlapping collisions (pileup). For the first time, tracks from charged particles...
Data storage is a major limitation at the Large Hadron Collider and is currently addressed by discarding a large fraction of data. We present an autoencoder-based lossy compression algorithm as a first step towards a solution to mitigate this problem, potentially enabling storage of more events. We deploy an autoencoder model on Field Programmable Gate Array (FPGA) firmware using the hls4ml...
With machine learning gaining more and more popularity as a physics analysis tool, physics computing centers, such as the Fermilab LHC Physics Center (LPC), are seeing huge increases in the use of their resources for such algorithms. These facilities, however, are generally not set up efficiently for machine learning inference, as they rely on slower CPU evaluation, which has a noticeable...
The upcoming high-luminosity upgrade of the LHC will lead to a factor of five increase in instantaneous luminosity during proton-proton collisions. Consequently, the experiments situated around the collider ring, such as the CMS experiment, will record approximately ten times more data. Furthermore, the luminosity increase will result in significantly higher data complexity, thus making more...
The challenging environment of real-time systems at the Large Hadron Collider (LHC) strictly limits the computational complexity of algorithms that can be deployed. For deep learning models, this implies that only smaller models with lower capacity and weaker inductive bias are feasible. To address this issue, we utilize knowledge distillation to leverage both the performance of large models...
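In one common formulation of knowledge distillation (assumed here for illustration; the talk's exact objective may differ), the small student is trained to match the large teacher's temperature-softened output distribution:

```python
# Minimal sketch of a standard distillation objective: cross-entropy between
# temperature-softened teacher and student distributions. Logit values below
# are arbitrary examples.
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Cross-entropy of softened student probabilities under the teacher."""
    p_teacher = softmax(teacher_logits / T)
    p_student = softmax(student_logits / T)
    return float(-np.sum(p_teacher * np.log(p_student + 1e-12)))

teacher = np.array([5.0, 1.0, -2.0])
student = np.array([4.5, 1.5, -1.0])
loss = distillation_loss(student, teacher)
```

The temperature T exposes the teacher's relative confidences across wrong classes ("dark knowledge"), which carries more training signal for the small model than hard labels alone; in practice this term is combined with the usual hard-label cross-entropy.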
The exceptional challenges in data acquisition faced by experiments at the LHC demand extremely robust trigger systems. The ATLAS trigger, after a fast hardware data processing step, uses software-based selections referred to as the High-Level Trigger (HLT). Jets originating from b-quarks (b-jets) are produced in many interesting fundamental interactions, making them a key signature in a broad...
BDTs are simple yet powerful ML algorithms with performance often on par with cutting-edge NN-based models. The structure of BDTs allows for a highly parallelized, low-latency implementation on FPGAs. I will describe the development and implementation of a BDT-based algorithm for tau lepton identification in the ATLAS Level-1 trigger system as part of the Phase-I upgrade, designed to be...
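The structural point can be sketched in a few lines: each tree is a short, fixed-depth chain of threshold comparisons, and all trees are independent, so on an FPGA they can fire simultaneously and be summed. The toy trees below are made up for illustration, not the ATLAS tau-identification model.

```python
# Hypothetical two-tree BDT. Each internal node is
# ("node", feature_index, threshold, left_subtree, right_subtree);
# each leaf is ("leaf", score).
TREE_1 = ("node", 0, 0.5,
          ("leaf", -1.2),
          ("node", 1, 0.3, ("leaf", 0.4), ("leaf", 1.1)))
TREE_2 = ("node", 1, 0.7,
          ("leaf", -0.5),
          ("leaf", 0.9))

def eval_tree(tree, x):
    """Walk one tree: a bounded chain of compares, hence bounded latency."""
    while tree[0] == "node":
        _, feat, thr, left, right = tree
        tree = left if x[feat] <= thr else right
    return tree[1]

def bdt_score(x, trees=(TREE_1, TREE_2)):
    # On an FPGA all trees evaluate in parallel; here we just sum in a loop.
    return sum(eval_tree(t, x) for t in trees)

score = bdt_score([0.8, 0.9])  # TREE_1 -> 1.1, TREE_2 -> 0.9, total 2.0
```

Because no multipliers are needed (only comparators and one adder tree), such a model fits comfortably in the latency and resource envelope of a Level-1 trigger.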
The High Luminosity upgrade to the LHC will deliver unprecedented luminosity to the experiments, culminating in up to 200 overlapping proton-proton collisions. In order to cope with this challenge several elements of the CMS detector are being completely redesigned and rebuilt. The Level-1 Trigger is one such element; it will have a 12.5 microsecond window in which to process protons colliding...
Extracting low-energy signals from LArTPC detectors is useful, for example, for detecting supernova events or calibrating the energy scale with argon-39. However, it is difficult to extract these signals efficiently because of noise. We propose using a 1D CNN to select wire traces that contain a signal. This suppresses the background efficiently while retaining high signal efficiency. This is...
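The core operation can be sketched with a single 1-D convolution acting as a learned matched filter over a wire trace. This is an illustration of the mechanism only, not the detector's actual network: the kernel below is hand-set where a trained model would supply learned weights.

```python
# Illustrative single-filter 1-D CNN stage: convolve the trace, apply a
# ReLU, and compare the peak response to a threshold.
import numpy as np

kernel = np.array([-1.0, 2.0, -1.0])  # stand-in for a learned filter

def has_signal(trace, threshold=3.0):
    """True if the peak filter response on the trace exceeds the threshold."""
    response = np.convolve(trace, kernel, mode="valid")
    activation = np.maximum(response, 0.0)   # ReLU
    return bool(activation.max() > threshold)

noise_trace = np.zeros(16)
signal_trace = noise_trace.copy()
signal_trace[8] = 5.0                        # a narrow pulse
```

A trained network would stack several such filter banks, but the selection principle is the same: wires whose responses stay below threshold are dropped before downstream processing.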
Graph structures are a natural representation of data in many fields of research, including particle and nuclear physics experiments, and graph neural networks (GNNs) are a popular approach to extract information from such data. Simultaneously, there is often a need for very low-latency evaluation of GNNs on FPGAs. The hls4ml framework for translating machine learning models from industry-standard...
Within the framework of the L1 trigger's data filtering mechanism, ultra-fast autoencoders are instrumental in capturing new physics anomalies. Given the immense influx of data at the LHC, these networks must operate in real-time, making rapid decisions to sift through vast volumes of data. Meeting this demand for speed without sacrificing accuracy becomes essential, especially when...
Recent years have witnessed the enormous success of transformer models in various research fields, including natural language processing and computer vision, as well as the natural sciences. In the HEP community, models with transformer backbones have shown their power in jet tagging tasks. However, despite the impressive performance, transformer-based models are often large and...
The European Spallation Source (ESS) is a multi-disciplinary research facility based on neutron scattering, under construction in Lund. The facility includes a superconducting linear proton accelerator, a rotating tungsten target wheel where neutrons are spalled off by the high-energy protons, and a suite of instruments for neutron scattering experiments.
ESS is a user facility designed and...
Magnetic confinement fusion research is at a threshold where the next generation of experiments are designed to deliver burning fusion plasmas with net energy gain for the first time. ML holds great promise in reducing the costs and risks of fusion reactor development, by enabling efficient workflows for scenario optimization, reactor design, and controller design. This talk reviews various...
The exploration of extrasolar planets, which are planets orbiting stars other than our own, holds great potential for unravelling long-standing mysteries surrounding planet formation, habitability, and the emergence of life in our galaxy. By studying the atmospheres of these exoplanets, we gain valuable insights into their climates, chemical compositions, formation processes, and past...
The field of Astrodynamics faces a significant challenge due to the increasing number of space objects orbiting Earth, especially from recent satellite constellation deployments. This surge underscores the need for quicker and more efficient algorithms for orbit propagation and determination to mitigate collision risks in both Earth-bound and interplanetary missions on large scales. Often,...
Gamma-ray bursts (GRBs) have traditionally been categorized based on their durations. However, the emergence of extended emission (EE) GRBs, characterized by durations longer than two seconds and properties similar to short GRBs, challenges conventional classification methods. In this talk, we delve into GRB classification, focusing on a machine-learning technique (t-distributed stochastic...
Deep Learning assisted Anomaly detection is quickly becoming a powerful tool allowing for the rapid identification of new phenomena.
We apply anomaly detection techniques based on deep recurrent autoencoders to the problem of detecting gravitational-wave signals in laser interferometers. This class of algorithm is trained via a semi-supervised strategy, i.e. with a weak...
Deep Learning (DL) applications for gravitational-wave (GW) physics are becoming increasingly common without the infrastructure to be validated at scale or deployed in real time. With ever more sensitive GW observing runs beginning in 2023, the tradeoff between speed and data robustness must be bridged in order to create experimental pipelines which take less time to iterate on and which...
In the Fermilab accelerator complex, the Main Injector (MI) and the Recycler Ring (RR) share a tunnel. The initial design was made for the needs of the Tevatron, where the RR stored fairly low intensities of anti-protons. Currently, however, both the MI and RR often have high intensity beams at the same time. Beam loss monitors (BLMs) are placed at different points in the tunnel to detect...
The Tokamak magnetic confinement fusion device is one leading concept design for future fusion reactors which require extremely careful control of plasma parameters and magnetic fields to prevent fatal instabilities. Magneto-hydrodynamic (MHD) instabilities occur when plasma confinement becomes unstable as a result of distorted non-axisymmetric magnetic field lines. These ``mode''...
Segmentation is the assignment of a semantic class to every pixel in an image, and is a prerequisite for downstream analysis such as phase quantification and morphological characterization. The wide range of length scales, imaging techniques and materials studied in materials science means any segmentation algorithm must generalise to unseen data and support abstract, user-defined semantic...
Materials have marked human evolution throughout history. The next technological advancement will inevitably be based on a groundbreaking material. Future discovery and application of materials in technology necessitates precise methods capable of creating long-range, non-equilibrium structures with atomic accuracy. To achieve this, we need enhanced analysis tools and swift automated...
Accurate and reliable long-term operational forecasting is of paramount importance in numerous domains, including weather prediction, environmental monitoring, early warning of hazards, and decision-making processes. Spatiotemporal forecasting involves generating temporal forecasts for system state variables across spatial regions. Data-driven methods such as Convolutional Long Short-Term...
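The ConvLSTM family of models referred to above replaces the matrix multiplications in an LSTM cell with convolutions, so each gate sees a spatial neighbourhood rather than a flat vector. A minimal single-channel cell, with random kernels standing in for trained weights and bias terms omitted, might be sketched as:

```python
# Toy single-channel ConvLSTM cell (illustrative; real models use multiple
# channels, biases, and learned kernels).
import numpy as np

def conv2d_same(x, k):
    """Naive 'same'-padded 2-D cross-correlation."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x, h, c, K):
    """One cell update: input, forget, output gates and candidate state."""
    i = sigmoid(conv2d_same(x, K["xi"]) + conv2d_same(h, K["hi"]))
    f = sigmoid(conv2d_same(x, K["xf"]) + conv2d_same(h, K["hf"]))
    o = sigmoid(conv2d_same(x, K["xo"]) + conv2d_same(h, K["ho"]))
    g = np.tanh(conv2d_same(x, K["xg"]) + conv2d_same(h, K["hg"]))
    c_new = f * c + i * g          # cell state carries temporal memory
    h_new = o * np.tanh(c_new)     # hidden state is the spatial forecast
    return h_new, c_new

rng = np.random.default_rng(1)
K = {name: 0.1 * rng.normal(size=(3, 3))
     for name in ["xi", "hi", "xf", "hf", "xo", "ho", "xg", "hg"]}
h = c = np.zeros((8, 8))
for t in range(4):                 # roll the cell over a short frame sequence
    frame = rng.normal(size=(8, 8))
    h, c = convlstm_step(frame, h, c, K)
```

Stacking such cells and training them on historical frames yields the data-driven spatiotemporal forecasters the abstract describes; the hidden state `h` at the final step is the per-pixel forecast feature map.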
Surgical data technologies have not only successfully integrated inputs from various data sources (e.g., medical devices, trackers, robots and cameras) but have also applied a range of machine learning and deep learning methods (e.g., classification, segmentation or synthesis) to data-driven interventional healthcare. However, the diversity of data, acquisitions and pre-processing...
The use of neural networks for approximating fermionic wave functions has become popular over the past few years as their ability to provide impressively accurate descriptions of molecules, nuclei, and solids has become clear.
Most electronic structure methods rely on uncontrolled approximations, such as the choice of exchange-correlation functional in density functional theory or the form...
High-dimensionality is known to be the bottleneck for both nonparametric regression and Delaunay triangulation. To efficiently exploit the geometric information for nonparametric regression without conducting the Delaunay triangulation for the entire feature space, we develop the crystallization search for the neighbour Delaunay simplices of the target point similar to crystal growth. We...
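The basic geometric test underlying any such neighbour-simplex search is deciding whether the target point lies inside a given Delaunay simplex, which barycentric coordinates answer directly. This is an illustrative building block only, not the full crystallization-search algorithm:

```python
# Barycentric membership test for a d-simplex in d dimensions.
import numpy as np

def barycentric(simplex, p):
    """Solve for weights lam with lam @ vertices = p and sum(lam) = 1."""
    d = len(p)
    A = np.vstack([np.asarray(simplex, dtype=float).T, np.ones(d + 1)])
    b = np.append(p, 1.0)
    return np.linalg.solve(A, b)

def contains(simplex, p, eps=1e-12):
    """p is inside the simplex iff all barycentric weights are >= 0."""
    return bool(np.all(barycentric(simplex, p) >= -eps))

triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
inside = contains(triangle, np.array([0.2, 0.2]))
outside = contains(triangle, np.array([1.0, 1.0]))
```

Once the containing simplex is found, the same weights give the linear interpolant of the response values at its vertices, which is what makes locating that one simplex sufficient for local nonparametric regression.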
Beyond the well-known highlights in computer vision and natural language, AI is steadily expanding into new application domains. This Pervasive AI trend requires supporting diverse and fast-moving application requirements, ranging from specialized I/O to fault tolerance and limited resources, all the while retaining high performance and low latency. Adaptive compute architectures such as AMD...
How fast should your machine learning be? Ideally, as fast as you can stream data to it.
In this presentation I will discuss the role of computing infrastructure in machine learning, and argue that to face the growing volume of data and support latency constraints, the best place for inference is within the network. I will introduce in-network machine learning, the offloading of machine...
Large Language Models (LLMs) will completely transform the way we interact with computers, but in order to be successful they need to be fast and highly responsive. This represents a significant challenge due to the extremely high computational requirements of running LLMs. In this talk, we look at the technology behind LLMs, its challenges, and why Groq's AI accelerator chip holds a...
Neural networks achieve state-of-the-art performance in image classification, medical analysis, particle physics and many more application areas. With the ever-increasing need for faster computation and lower power consumption, driven by real-time systems and the Internet of Things (IoT), field-programmable gate arrays (FPGAs) have emerged as suitable accelerators for deep learning applications....
Today’s deep learning models consume considerable computation and memory resources, leading to significant energy consumption. To address the computation and memory challenges, quantization is often used to store and compute data with as few bits as possible. However, exploiting efficient quantization for computing a given ML model is challenging, because it affects both the computation accuracy...
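The accuracy/bit-width trade-off can be made concrete with a generic symmetric uniform quantizer (a standard scheme, not necessarily the one developed in the talk): values are scaled to a signed integer grid, rounded, and scaled back, and the residual is the quantization error.

```python
# Symmetric uniform quantization to b bits, with dequantization, so the
# round-trip error can be measured directly.
import numpy as np

def quantize(x, bits):
    """Map x onto signed integers in [-2^(b-1), 2^(b-1)-1], then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

weights = np.linspace(-1.0, 1.0, 101)
err8 = np.max(np.abs(weights - quantize(weights, 8)))  # ~half an 8-bit LSB
err3 = np.max(np.abs(weights - quantize(weights, 3)))  # much larger
```

Each bit removed roughly doubles the worst-case rounding error while halving storage and multiplier width, which is exactly the tension between accuracy and hardware cost that the abstract points to.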
For many deep learning applications, model size and inference speed at deployment time become a major challenge. To tackle these issues, a promising strategy is quantization.
A straightforward uniform quantization to very low precision often results in considerable accuracy loss. A solution to this predicament is the use of mixed-precision quantization, founded on the idea that certain...
There has been a growing trend of multi-modal AI models capable of gathering data from multiple sensor modalities (cameras, lidars, radars, etc.) and processing it to give more comprehensive outputs and predictions. Neural network models, such as transformers and convolutional neural networks (CNNs), are able to process data from multiple modalities and have enhanced various...
Field-programmable gate arrays (FPGAs) are widely used to implement deep learning inference. Standard deep neural network inference involves the computation of interleaved linear maps and nonlinear activation functions. Prior work for ultra-low latency implementations has hardcoded the combination of linear maps and nonlinear activations inside FPGA lookup tables (LUTs). Our work is motivated...
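The hardcoding idea described above works because, for a neuron with a handful of low-precision inputs, the whole input space is small enough to enumerate: the composed linear map and activation collapse into a single truth table, exactly what an FPGA LUT stores. A toy version (weights are arbitrary stand-ins, inputs taken as binary for simplicity):

```python
# Collapse a 4-input binary neuron into one lookup table.
from itertools import product

weights = [0.7, -1.2, 0.4, 0.9]   # stand-in trained weights
bias = -0.3

def neuron(x):
    """Reference computation: linear map followed by a step activation."""
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0

# Built once, offline: one entry per possible 4-bit input pattern.
LUT = {bits: neuron(bits) for bits in product((0, 1), repeat=4)}

def neuron_lut(x):
    return LUT[tuple(x)]           # inference is now a single lookup
```

No arithmetic survives at inference time, which is why this style of implementation reaches the very lowest latencies, at the cost of table size growing exponentially with input precision and fan-in.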
Machine learning has been applied to many areas of clinical medicine, from assisting radiologists with scan interpretation to clinical early warning scoring systems. However, the possibilities of ML-assisted real-time data interpretation and the hardware needed to realise it are yet to be fully explored. In this talk, possible applications of fast ML hardware to real-time medical imaging will...
Converged compute infrastructure refers to a trend where HPC clusters are set up for both AI and traditional HPC workloads, allowing these workloads to run on the same infrastructure, potentially reducing underutilization. Here, we explore opportunities for converged compute with GroqChip, an AI accelerator optimized for running large-scale inference workloads with high throughput and...
Machine Learning has gone through major revolutionary phases over the past decade and neural networks have become state-of-the-art approaches in many applications, from computer vision to natural language processing. However, these advances come at ever-growing computational costs; in contrast, CMOS scaling is hitting fundamental limitations such as power consumption and quantum mechanical...
Convolutional Neural Networks (CNNs) have been applied to a wide range of applications in high energy physics including jet tagging and calorimetry. Due to their computational intensity, a large amount of work has been done to accelerate CNNs in hardware, with FPGA devices serving as a high-performance and energy-efficient platform of choice. As opposed to a dense computation where every...
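The sparsity opportunity the abstract alludes to is generic: after a ReLU, many activations are exactly zero, and a multiply need only be issued for the nonzero ones. A minimal sketch of that bookkeeping (not the talk's specific architecture):

```python
# Multiply-accumulate only over nonzero activations; result is identical
# to the dense product, but fewer columns are touched.
import numpy as np

def sparse_matvec(weights, activations):
    nz = np.flatnonzero(activations)
    return weights[:, nz] @ activations[nz], len(nz)

rng = np.random.default_rng(2)
W = rng.normal(size=(4, 64))
a = np.maximum(rng.normal(size=64), 0.0)   # ReLU output: roughly half zeros

y_sparse, n_mults = sparse_matvec(W, a)
y_dense = W @ a                            # same answer, all 64 columns
```

On an FPGA this translates into skipping or gating multiplier cycles for zero operands, trading fixed dense throughput for data-dependent savings in energy and latency.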
The contribution addresses the topic of time-series recognition, specifically comparing the conventional approach of manual feature extraction with contemporary classification methods that leverage features acquired through the training process. Employing automated feature extraction software, we attained a high-dimensional representation of a time-series, obviating the necessity of...
Universal approximation theorems are the foundations of classical neural networks, providing theoretical guarantees that the latter are able to approximate maps of interest. Recent results have shown that this can also be achieved in a quantum setting, whereby classical functions can be approximated by parameterised quantum circuits. We provide here precise error bounds for specific...
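For context, the classical statement being generalised reads, in one common (Cybenko-style) form for a sigmoidal activation $\sigma$ and continuous $f$ on a compact set $K$: for every tolerance there is a finite one-hidden-layer network within that tolerance in the sup norm,

```latex
\forall \varepsilon > 0,\ \exists N,\ \{a_i, w_i, b_i\}_{i=1}^{N}:\quad
\sup_{x \in K}\ \Bigl| f(x) - \sum_{i=1}^{N} a_i\,\sigma\bigl(w_i^{\top} x + b_i\bigr) \Bigr| < \varepsilon .
```

The quantum results replace the finite sum of activations by the expectation value of a parameterised quantum circuit, and the talk's contribution concerns how the error $\varepsilon$ scales with the circuit's resources.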
Deep learning techniques have demonstrated remarkable performance in super resolution (SR) tasks for enhancing image resolution and granularity. These architectures extract image features with a convolutional block and add the extracted features to the upsampled input image transported through a skip connection, after which the result is converted from depth space to a higher-resolution spatial representation. However, SR can...
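The depth-to-space step mentioned above (often called pixel shuffle) is a pure rearrangement: $r^2$ feature channels are interleaved into an $r\times$-larger image. A minimal numpy version, using the standard channel ordering convention:

```python
# Depth-to-space / pixel shuffle: (C*r*r, H, W) -> (C, H*r, W*r).
import numpy as np

def depth_to_space(x, r):
    """Group every r*r channels into an r x r spatial block per pixel."""
    c_r2, H, W = x.shape
    C = c_r2 // (r * r)
    x = x.reshape(C, r, r, H, W)
    x = x.transpose(0, 3, 1, 4, 2)        # -> (C, H, r, W, r)
    return x.reshape(C, H * r, W * r)

features = np.arange(16, dtype=float).reshape(4, 2, 2)  # 4 channels, 2x2
upscaled = depth_to_space(features, 2)                  # 1 channel, 4x4
```

Because it is a fixed permutation with no arithmetic, this layer is cheap in hardware, which is one reason SR networks place the learned convolutions before it and upsample only at the end.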
Zoom link: https://cern.zoom.us/j/63951739685?pwd=VTdITmdvOTc3V1hyK0xPa2t6cjhUdz09
This two-part tutorial presents an update on Intel HLS flow and the Intel FPGA AI Suite. In the first part, we will have a 30-minute update on how the latest oneAPI tool flow for IP authoring works. In the second part we will present Intel FPGA AI Suite and groundbreaking AI Tensor Blocks newly integrated into Intel's latest FPGA device families for deep learning inference. These...
More and more researchers working in fields such as drug discovery, weather forecasting, climate modelling and high-energy particle physics are looking towards AI-based approaches to enhance their applications, both in terms of accuracy and time-to-result. Furthermore, new approaches such as PINNs are revolutionising how neural networks can learn to emulate physical systems governed by...