FPGAs provide unique advantages in the realm of machine learning acceleration. Unlike CPUs and GPUs, FPGAs allow for custom parallelism, data-type precision, and dataflow tailored specifically to the workload. Their reconfigurability enables the design of optimised hardware circuits that reduce latency and power consumption while improving throughput. Some common examples of FPGA-accelerated...
Neural networks (NNs) have gained significant interest in recent years due to their prevalence in AI applications. Lookup table (LUT) based NN architectures have emerged as a promising solution for ultra-low latency inference on reconfigurable hardware such as field programmable gate arrays (FPGAs). These techniques promise significant enhancements in both resource efficiency and inference...
This tutorial explores the growing demand for domain-specific hardware accelerators driven by the rapid evolution of AI and data analytics. Traditional hardware design cycles are too slow to keep up with the pace of algorithmic innovation. To address this, new agile hardware design methodologies are emerging, leveraging compiler technologies and High-Level Synthesis (HLS) to automate and...
Neural networks with a latency requirement on the order of microseconds are widely used at the CERN Large Hadron Collider, particularly in the low-level trigger system. To satisfy this latency requirement, these neural networks are often deployed on FPGAs.
This tutorial aims to provide a practical, hands-on guide to a software-hardware co-design workflow using the HGQ2 and da4ml libraries....
While machine learning has made tremendous progress in recent years, there is still a large gap between artificial and natural intelligence.
Closing this gap requires combining fundamental research in neuroscience with mathematics, physics, and engineering to understand the principles of neural computation and cognition.
Mixed-signal subthreshold analog and asynchronous digital electronic...
The real-time processing of data created by the Large Hadron Collider's (LHC) experiments, amounting to over 10% of worldwide internet traffic, is one of the greatest computing challenges ever attempted. I will discuss the concrete applications of real-time processing in the LHC's main experiments, and the technological innovations in this area over the past decades. I will also reflect on the...
This talk provides an overview of several libraries in the open-source JAX ecosystem (such as Equinox, Diffrax, Optimistix, ...) In short, we have been building an "autodifferentiable GPU-capable scipy". These libraries offer the foundational core of tools that have made it possible for us to train neural networks (e.g. score-based diffusions for image generation), solve PDEs, and smoothly...
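To make the "autodifferentiable GPU-capable scipy" idea concrete, here is a minimal sketch that solves a small ODE and differentiates the solution with respect to its parameters. It is illustrative only: the call signatures follow the publicly documented Diffrax API as I understand it, while the toy vector field and parameter values are invented for this example.

import jax
import jax.numpy as jnp
import diffrax

def vector_field(t, y, args):
    # Toy damped oscillator: dy/dt = [v, -k*x - c*v]
    k, c = args
    return jnp.array([y[1], -k * y[0] - c * y[1]])

def final_position(params):
    term = diffrax.ODETerm(vector_field)
    sol = diffrax.diffeqsolve(term, diffrax.Tsit5(), t0=0.0, t1=10.0, dt0=0.01,
                              y0=jnp.array([1.0, 0.0]), args=params)
    return sol.ys[0, 0]          # position at t1

# Gradients of the ODE solution w.r.t. (k, c), obtained by autodiff through the solver.
grads = jax.grad(final_position)((2.0, 0.1))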
Most commercial wearables still capture only basic metrics such as step counts or heart rate, and remain closed systems without access to raw data. In this talk, I will present our holistic approach to full-body biosignal intelligence, where ultra-low-power embedded platforms and machine learning algorithms are co-designed to capture and process signals from the brain, eyes, muscles, and...
Custom FPGA dataflow accelerators for DNN inference can enable unprecedented performance and efficiency for many applications. Dataflow accelerator compilers, such as the FINN framework, have improved in recent years and allow practitioners to explore this technology without requiring in-depth FPGA knowledge.
However, the overall design process remains quite tedious, time-consuming, and...
As the demand for efficient machine learning on resource-limited devices grows, model compression techniques like pruning and quantization have become increasingly vital. Despite their importance, these methods are typically developed in isolation, and while some libraries attempt to offer unified interfaces for compression, they often lack support for deployment tools such as hls4ml. To...
As neural networks (NNs) are increasingly used to provide edge intelligence, there is a growing need to make the edge devices that run them robust to faults. Edge devices must mitigate the resulting hardware failures while maintaining strict constraints on power, energy, latency, throughput, memory size, and computational resources. Edge NNs require fundamental changes in model...
On-chip learning has the potential to unlock low-latency, low-power, and continuously adaptive AI directly on edge devices. However, research in this area remains limited by the lack of accessible hardware toolchains that support backpropagation. To address this gap, we propose ENABOL, a hardware-efficient extension of the HLS4ML toolchain that enables customizable backpropagation support...
Neural networks with a latency requirement on the order of microseconds, like the ones used at the CERN Large Hadron Collider, are typically deployed on FPGAs pipelined with II=1. A bottleneck for the deployment of such neural networks is area utilization, which is directly related to the required constant matrix-vector multiplication (CMVM) operations. In this work, we propose an efficient...
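For context on why CMVM dominates the area: with compile-time-constant fixed-point weights, every multiplication unrolls into shifts and adds, and subexpressions can be shared across outputs. The example below is a generic illustration of this standard shift-and-add expansion with subexpression reuse, not necessarily the specific decomposition proposed in the work:

$$
\begin{aligned}
y_0 &= 5x_0 + 3x_1 = \bigl((x_0 \ll 2) + x_0\bigr) + \bigl((x_1 \ll 1) + x_1\bigr),\\
y_1 &= 5x_0 + 6x_1 = \underbrace{\bigl((x_0 \ll 2) + x_0\bigr)}_{\text{shared with } y_0} + \underbrace{\bigl((x_1 \ll 1) + x_1\bigr)}_{\text{shared with } y_0} \ll 1,
\end{aligned}
$$

so the whole product reduces to a handful of adders, and the adder count, rather than DSP multipliers, drives the FPGA resource cost.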
The ATLAS Level-0 Global Trigger is a mission-critical system opting to take advantage of the full calorimeter granularity during Run-4 and beyond. Level-0 Global will execute a cascade of trigger algorithms that combine both calorimeter and muon information. Within the Next Generation Trigger project at CERN there is a dedicated work package (WP2.1) exploring large-scale deployment of...
In the era of continuous data generation, real-time processing of data streams has become crucial for timely, adaptive, and context-aware decision-making. However, maintaining effective learning models in such dynamic environments requires carefully balancing prediction performance, transparency and energy consumption.
In the talk, we will present two new state-of-the-art methods for...
The widespread deployment of embedded ML systems has created a need for resilient, fault-tolerant hardware and software capable of operating in inherently noisy conditions. While the standardization of low-precision (≤ 8-bit) datatypes has allowed for reduced training and inference costs and increased interoperability across commercial accelerators, clear guidelines for robust implementation...
The rising computational demands of increasing data rates and complex machine learning (ML) algorithms in large-scale scientific experiments have driven the adoption of the Services for Optimized Network Inference on Coprocessors (SONIC) framework. SONIC accelerates ML inference by offloading tasks to local or remote coprocessors, optimizing resource utilization. Its portability across diverse...
Most current machine learning (ML) applications are purely data-driven solutions with little consideration for the underlying problem dynamics, limiting them to in-distribution applications. To tackle this limitation, a stream of literature is emerging to address out-of-distribution (OOD) performance: Algorithmic alignment, which focuses on embedding algorithmic structures into ML architectures...
Matrix-vector (GEMV) operations are a common building block in many deep learning models, particularly for large dense layers found in convolutional neural networks (CNNs) and multi-layer perceptrons (MLPs). Despite their importance, GEMV kernels have historically underperformed compared to matrix-matrix (GEMM) operations due to their lower arithmetic intensity and limited data reuse, making...
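The arithmetic-intensity gap can be quantified with a simple operation/traffic count (a back-of-the-envelope estimate in elements rather than bytes, ignoring caches; the symbols m, n, k denote the matrix dimensions):

$$
I_{\mathrm{GEMV}} = \frac{2mn}{mn + m + n} \approx 2,
\qquad
I_{\mathrm{GEMM}} = \frac{2mnk}{mk + kn + mn} \;\xrightarrow{\;m=n=k=N\;}\; \frac{2N}{3},
$$

so a GEMV performs only about two operations per element touched and is memory-bound, whereas GEMM's intensity grows with problem size and can saturate the available compute.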
From radio telescopes to particle accelerators and electron microscopes, scientific instruments produce tremendous amounts of data at equally high rates; previous architectures that have relied on offline storage and large data transfers are unable to keep up. The future of scientific discovery is interactive, streaming, and AI driven, placing the autonomous and intelligent instrument at the...
AXOL1TL is an anomaly detection (AD) trigger algorithm integrated into the Global Trigger (GT) of the CMS Level-1 Trigger (L1T) system since 2024. The GT reduces the event rate from proton–proton collisions at the LHC, lowering it from 40 MHz to 100 kHz within a fixed latency of 50 ns. The AD algorithm, implemented in the FPGA firmware of the GT board, uses an autoencoder to assign an anomaly...
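As a generic point of reference (not necessarily the exact AXOL1TL definition), autoencoder-based triggers typically derive the anomaly score from how poorly an event is reconstructed,

$$
s(x) = \lVert x - D(E(x)) \rVert_2^2,
$$

with encoder $E$ and decoder $D$; firmware implementations sometimes replace this with a latent-space quantity so that the decoder need not be implemented on the FPGA.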
The absence of BSM physics discoveries at the LHC suggests new physics could lie outside current trigger schemes. By applying unsupervised ML–based anomaly detection, we gain a model-agnostic way of spotting anomalous signatures that deviate from the current trigger’s expectations. Here we introduce a Run-3 trigger chain that embeds fast anomaly detection algorithms in both hardware and...
At the Phase-2 Upgrade of the CMS Level-1 Trigger (L1T), particles will be reconstructed by linking charged particle tracks with clusters in the calorimeters and muon tracks from the muon station. The 200 pileup interactions will be mitigated using primary vertex reconstruction for charged particles and a weighting for neutral particles based on the distribution of energy in a small area. Jets...
Belle II is a luminosity frontier experiment located at the SuperKEKB asymmetric $e^+ e^-$ collider, operating at the $\Upsilon(4S)$ resonance. The $\tau$ physics program at Belle II involves both probes of new physics and precision measurements of standard model parameters with large statistics. SuperKEKB is projected to reach a luminosity of $6\times 10^{35}~\text{cm}^{-2}\text{s}^{-1}$ in...
The High Luminosity upgrade of the Large Hadron Collider (HL-LHC) presents a demanding environment for real-time data processing, with substantially increased event rates requiring faster and more efficient trigger systems. This study explores the deployment of graph neural networks (GNNs) on field-programmable gate arrays (FPGAs) for fast and accurate inference within future muon trigger...
The ATLAS trigger system will undergo a comprehensive upgrade in advance of the HL-LHC programme. In order to deal with the increased data bandwidth, trigger algorithms will be required to satisfy stricter latency requirements. We propose a method to speed up the current calorimeter-only preselection step and to aid trigger decisions for hadronic signals containing jets.
We demonstrate the use...
Optimized FPGA implementations of tiny neural networks are crucial for low-latency and hardware-efficient inference for a variety of applications. Neural networks based on lookup tables (LUTs) are a standard technique for such problems due to their hardware efficiency and strong expressivity. However, such networks are often difficult to scale up as their resource usage scales exponentially...
Modern foundation models (FMs) have pushed the frontiers of language, vision, and multimodal tasks by training ever-larger neural networks (NNs) on unprecedented volumes of data. The use of FMs has yet to be established in collider physics, which lacks both a comparably sized, general-purpose dataset on which to pre-train universal event representations and a clear, demonstrable need....
The analysis of point cloud data, for example signals from charged particles recorded by detectors in high energy physics (HEP) experiments, can be significantly enhanced and accelerated by the application of machine learning models. In recent years, transformer architectures have come into focus as offering excellent model performance. However, for traditional transformers, the need to compute...
The Interaction Network (IN) algorithm has shown great promise for particle tracking applications at the Large Hadron Collider (LHC), where identifying complex particle trajectories from raw detector data is a computationally intensive task. IN leverages graph-based representations of detector hits to learn relationships between particle interactions, making it well-suited for this domain....
AI is accelerating into the generative era, and it is poised to disrupt multiple businesses and applications. With the increasing focus on edge and extreme-edge, near-sensor applications, inference is becoming the key workload and computational challenge. Computing systems need to scale out and scale up to meet the challenge. In this talk I will discuss how to scale up chip(lets) for efficient...
Beyond the well-known highlights in computer vision and natural language, AI is steadily expanding into new application domains. This Pervasive AI trend requires supporting diverse and fast-moving application requirements, ranging from specialized I/O to fault tolerance and limited resources, all the while retaining high performance and low latency. Adaptive compute architectures such as AMD...
The trigger systems of ATLAS and CMS currently reject vast numbers of potentially valuable collision events due to their conservative, static designs, a limitation that directly hampers discovery potential. We propose an alternative to these rigid, hand-tuned menus with an autonomous controller capable of dynamically optimizing trigger performance in real time.
In this work, we demonstrate...
Machine Learning (ML) techniques are increasingly applied for the optimization of complex computing systems, but their integration into core low-level system mechanisms remains limited. A key barrier is the lack of accessible, high-performance interfaces at the boundary between software and hardware, as well as hardware-offloaded ML inference at full system speed. In this presentation, we...
Tuning hyperparameters of ML models, especially large ML models, can be time consuming and computationally expensive. As a potential solution, several recent papers have explored hyperparameter transfer. Under certain conditions, the optimal hyperparameters of a small model are also optimal for larger models. One can therefore tune only the small model and transfer the hyperparameters to the...
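A minimal sketch of the transfer step, assuming a muP-style rule in which the learning rate found on a narrow proxy model is rescaled by the width ratio; the 1/width scaling shown here is one commonly used choice, and the exact conditions under which transfer holds are the subject of the talk:

# Hypothetical illustration of hyperparameter transfer across model width.
# Assumes a muP-like scaling: hidden-layer learning rate ~ base_width / width.

def transfer_lr(tuned_lr, base_width, target_width):
    """Rescale a learning rate tuned on a narrow proxy model to a wider model."""
    return tuned_lr * base_width / target_width

lr_small = 3e-3                                   # found by sweeping the cheap width-128 proxy
lr_large = transfer_lr(lr_small, base_width=128, target_width=4096)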
Graph Neural Networks (GNNs), particularly Interaction Networks (INs), have shown exceptional performance for jet tagging at the CERN High-Luminosity Large Hadron Collider. However, their computational complexity and irregular memory access patterns pose significant challenges for deployment on FPGAs in hardware trigger systems, where strict latency and resource constraints apply.
In this...
The Smartpixels project is a coordinated effort to co-design pixel ASICs, design tools, ML algorithms, and sensors for on-detector data reduction, motivated by the technical challenges of current and future colliders. The drive to greater precision requires smaller pixel pitch, which together with higher event rates arising from pileup and/or beam-induced background generates petabytes of data...
We conduct a systematic study of quantum-inspired Tensor Network (TN) models—Matrix Product States (MPS) and Tree Tensor Networks (TTN)—for real-time jet tagging in high-energy physics, with a focus on low-latency deployment on FPGAs. Motivated by the strict computational demands of the HL-LHC Level-1 Trigger system, we explore TN architectures as compact and interpretable alternatives to deep...
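For orientation, the compactness claim rests on the standard MPS factorization, shown here in its textbook form rather than as this work's specific architecture: a rank-$N$ weight tensor is written as a chain of small cores,

$$
W_{s_1 s_2 \cdots s_N} = \sum_{\alpha_1,\ldots,\alpha_{N-1}} A^{s_1}_{\alpha_1}\, A^{s_2}_{\alpha_1\alpha_2} \cdots A^{s_N}_{\alpha_{N-1}},
$$

so the parameter count scales as $\mathcal{O}(N d \chi^2)$ with physical dimension $d$ and bond dimension $\chi$ instead of $d^N$, making $\chi$ the knob that trades accuracy against FPGA resources.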
Hadronic calorimeters are a key part of high energy physics experiments. Traditionally, they rely on high granularity to improve performance, but this leads to various challenges in terms of cost, energy consumption, and output data volume. Moreover, current detectors do not have the capability of exploiting temporal information of the shower development, as the time frame for pattern...
Inference of standard convolutional neural networks (CNNs) on FPGAs often incurs high latency and long initiation intervals due to the nested loops required to slide filters across the full input, especially when the input dimensions are large. However, in some datasets, meaningful signals may occupy only a small fraction of the input, sometimes just a few percent of the total pixels or...
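To make the latency source explicit, a reference sliding-window convolution (a plain Python sketch, not an HLS kernel, with invented array shapes) loops over every output position regardless of how little of the input actually contains signal:

import numpy as np

def conv2d_naive(x, w):
    """Reference 2-D convolution (valid padding, stride 1) with nested loops.
    The trip count scales with the full input size even if only a few percent
    of the pixels carry meaningful signal."""
    H, W = x.shape
    K = w.shape[0]
    out = np.zeros((H - K + 1, W - K + 1))
    for i in range(H - K + 1):        # slide the filter over rows
        for j in range(W - K + 1):    # slide the filter over columns
            out[i, j] = np.sum(x[i:i + K, j:j + K] * w)
    return out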
Reflection High-Energy Electron Diffraction (RHEED) is a common diffraction-based surface characterization technique for analyzing the properties of crystalline materials that are grown using a thin-film deposition technique like pulsed-laser deposition (PLD) or molecular-beam epitaxy (MBE). In this work, we design an FPGA-accelerated machine learning (ML) algorithm to perform real-time...
Transformers are state-of-the-art model architectures widely used across application areas of machine learning. However, the performance of such architectures is less well explored in ultra-low-latency domains where deployment on FPGAs or ASICs is required. Such domains include the trigger and data acquisition systems of the LHC experiments.
We present a transformer-based algorithm...
The LHCb Upgrade II will operate at a data rate of 200 Tb/s, requiring efficient real-time data reduction. A major challenge of this pipeline is the transfer of full timing information from the frontend Electromagnetic Calorimeter (ECAL) to the backend for processing, which is critical for resolving pile-up, background suppression, and enhancing energy resolution. Due to the data rate, full...
We present an MLOps-based approach for managing the end-to-end lifecycle of machine learning (ML) algorithms deployed on FPGAs in real-time trigger systems, as used in experiments such as CMS and ATLAS. The primary objective of this pipeline is to enable agile and robust responses to evolving detector and beam conditions by automating the collection of new training data, retraining and...
QONNX (Quantized ONNX) serves as a shared input representation and frontend for several efficient inference projects, including FINN, chisel4ml and NN2FPGA. This birds-of-a-feather session would serve as a gathering point for the community to discuss recent developments and future plans for QONNX.
Decision Forests such as Random Forests and Gradient Boosted Trees are an effective and widely used class of models for machine learning, particularly for tabular data and forecasting. This talk covers the practical use and ongoing research on Decision Forests at Google. We provide a brief overview of decision forest modeling with a focus on novel split conditions. We will analyze their impact...
Graph Neural Networks (GNNs) are a powerful paradigm for neural network models that operate on relational data or data with structural information. This talk explores the practical use of and ongoing research on GNNs at Google for industrial applications. We provide a brief overview of GNN modeling, including GCNs, Graph Transformers, and geometry-aware models. Then we discuss a variety of...
With increasing beam background levels at Belle II, which have already been observed due to the world-record instantaneous luminosities achieved by SuperKEKB and which are expected to rise further, an upgrade of the current Level 1 (L1) trigger algorithms is necessary to handle the evolving conditions. In this work, we present an upgraded L1 electromagnetic calorimeter trigger, based on Graph...
The PVFinder algorithm employs a hybrid deep neural network (DNN) approach to reconstruct primary vertices (PVs) in proton-proton collisions at the LHC, addressing the complexities of high pile-up environments in LHCb and ATLAS experiments. By integrating fully connected layers with a UNet architecture, PVFinder’s end-to-end tracks-to-hist DNN processes charged track parameters to predict PV...
For minutes of the discussion, see https://indico.cern.ch/event/1586270/
Quartz Crystal Microbalance (QCM) sensors are renowned for their high sensitivity to mass changes, making them ideal for detecting environmental parameters such as relative humidity (RH) and ultraviolet (UV) radiation. In this work, we present an AI-driven, dual-sided coated QCM sensor integrated with advanced machine learning (ML) and implemented on a real-time hardware platform. This sensor...
Authors:
Gustavo Alonso, Maximilian Jakob Heer, Benjamin Ramhorst
As Moore’s Law and Dennard Scaling reach their limits, computing is shifting toward heterogeneous hardware for large-scale data processing. Cloud vendors are deploying accelerators, like GPUs, DPUs, and FPGAs, to meet growing computational demands of ML and big data.
While FPGAs offer great flexibility and performance, practically integrating them in larger systems remains challenging due...
Transformers excel at modeling correlations in LHC collisions but incur high costs from quadratic attention. We analyze the Particle Transformer using attention maps and pair correlations on the (η,ϕ) plane, revealing that Particle Transformer attention maps learn traditional jet substructure observables. To improve efficiency we benchmark linear attention variants on JetClass and find that...
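Schematically (standard formulas, not specific to the Particle Transformer), the cost difference comes from where the matrix products are grouped:

$$
\mathrm{Attn}(Q,K,V) = \mathrm{softmax}\!\Bigl(\tfrac{QK^{\top}}{\sqrt{d}}\Bigr)V \in \mathcal{O}(N^2 d),
\qquad
\phi(Q)\,\bigl(\phi(K)^{\top}V\bigr) \in \mathcal{O}(N d^2),
$$

where $N$ is the number of particles, $d$ the embedding dimension, and $\phi$ a kernel feature map; linear-attention variants exploit the right-hand grouping at the cost of approximating the softmax.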
The Alpha Magnetic Spectrometer (AMS-02) is a precision high-energy cosmic-ray experiment consisting of a Transition Radiation Detector (TRD), Silicon Tracker, Magnet, Time of Flight (ToF), Ring Imaging Cherenkov Detector (RICH), Anti-Coincidence Counter (ACC), and Electromagnetic Calorimeter (ECAL). It has been operating on the ISS since 2011 and has collected more than 240 billion cosmic-ray events. Among...
Small ($R<4\,\mathrm{R}_{\oplus}$), long-period ($30\,\mathrm{days}<P$) exoplanets with low equilibrium temperatures are an extremely interesting population, promising insights into planet formation, atmospheric chemistry and evolution, as well as habitability. However, for these planets, the current observing strategy of NASA's Transiting Exoplanet Survey Satellite (TESS) can only capture...
With the increasing size of machine learning (ML) models and vast datasets, foundation models have transformed how we apply ML to solve real-world problems. Multimodal language models like ChatGPT and Llama have expanded their capabilities to specialized tasks from a common pre-training. Similarly, in high-energy physics (HEP), common tasks in the analysis face recurring challenges that demand...
Since version 1.0, hls4ml has provided a oneAPI backend for Altera FPGAs, as an evolution of the backend that targeted Intel HLS. Some design choices will be presented here, including the use of pipes and task sequences to develop a dataflow-style architecture. The oneAPI framework, unlike the Intel HLS framework, also naturally supports an accelerator-style deployment. Using always-run...
We present the development of a machine learning (ML) based regulation system for third-order resonant beam extraction in the Mu2e experiment at Fermilab. Classical and ML-based controllers have been optimized using semi-analytic simulations and evaluated in terms of regulation performance and training efficiency. We compare several controller architectures and discuss the integration of...
Introduction
Accurate climate prediction hinges on the ability to resolve multi-scale turbulent dynamics in the atmosphere and oceans [1]. An important mechanism of energy exchange between the ocean and the atmosphere is mesoscale turbulence, which contains motions of length scale $\mathcal{O}$(100 km). Two-layer quasi-geostrophic (QG) simulations [2] are a popular technique for...
The rising popularity of large language models (LLMs) has led to a growing demand for efficient model deployment. In this context, the combination of post-training quantization (PTQ) and low-precision floating-point formats such as FP4, FP6 and FP8 has emerged as an important technique, allowing for rapid and accurate quantization with the ability to capture outlier values in LLMs without...
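As a schematic of what such post-training quantization involves, the sketch below simulates rounding weights onto a low-precision floating-point grid with a limited mantissa and clamped exponent range. It is a simplified per-tensor scheme with invented parameter values; production FP4/FP6/FP8 PTQ pipelines additionally use per-channel or per-group scales and format-specific saturation and subnormal handling.

import numpy as np

def fake_quant_float(x, man_bits=2, min_exp=-6, max_exp=8):
    """Round-to-nearest simulation of a tiny float format: keep `man_bits`
    mantissa bits and clamp the exponent range. Illustrative only."""
    sign = np.sign(x)
    mag = np.where(x == 0, 1.0, np.abs(x))        # avoid log2(0)
    exp = np.clip(np.floor(np.log2(mag)), min_exp, max_exp)
    step = 2.0 ** (exp - man_bits)                # spacing of representable values
    return sign * np.round(np.abs(x) / step) * step

w = np.random.randn(512, 512).astype(np.float32)
w_q = fake_quant_float(w, man_bits=2)             # FP6-like mantissa width (assumption)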
Modern development flows that use tooling for automated building, testing, and deployment of software are becoming the norm for large scale software and hardware projects. These flows offer quite a few advantages that make them desirable, but for projects that use FPGAs, complications can arise when integrating them with traditional FPGA...
High-resolution electron microscopy generates large volumes of pixel detector data due to beam rates reaching $10^7$ to $10^{10}$ electrons per second directed at the sample. Of this data, only the electron entry point into the silicon detector prior to scattering is typically of interest for downstream analysis. Precise knowledge of these entry points is particularly important in electron...
In preparation for the High Luminosity LHC (HL-LHC) run, the CMS experiment is developing a major upgrade of its Level-1 (L1) Trigger system, which will integrate high-granularity calorimeter data and real-time tracking using FPGA-based processors connected via a high-bandwidth optical network. A central challenge is the identification of electrons in a high pileup environment within strict...
The LHCb experiment at CERN operates a fully software-based first-level trigger that processes 30 million collision events per second, with a data throughput of 4 TB/s. Real-time tracking—reconstructing particle trajectories from raw detector hits—is essential for selecting the most interesting events, but must be performed under tight latency and throughput constraints.
A key bottleneck in...
The CICADA (Calorimeter Image Convolutional Anomaly Detection Algorithm) project aims to detect anomalous physics signatures without bias from theoretical models in proton-proton collisions at the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider. CICADA identifies anomalies in low-level calorimeter trigger data using a convolutional autoencoder, whose behavior is...
The Large Hadron Collider (LHC) will soon undergo a high-luminosity (HL) upgrade to improve future searches for new particles and to measure particle properties with increased precision. The upgrade is expected to provide a dataset ten times larger than the one currently available by the end of its data-taking period. The increased beam intensity will also increase the number of simultaneous...
The inverse design of photonic surfaces produced by high-throughput femtosecond laser processing is limited by a strongly non-linear, many-to-one mapping from laser parameters (power, speed, hatch spacing) to the resulting optical spectrum. Tandem Neural Networks (TNNs) mitigate this ill-posedness by pairing a forward surrogate with a separately trained inverse network, but they still rely on...
Charged-particle track reconstruction is the foundation of collider experiments. Yet it is also the most computationally expensive part of particle reconstruction. Innovations in track reconstruction using graph neural networks (GNNs) have demonstrated a promising capability to address the computing challenges posed by the High-Luminosity LHC (HL-LHC) with machine learning....
In this paper, we propose a method to perform empirical analysis of the loss landscape of machine learning (ML) models. The method is applied to two ML models for scientific sensing, which necessitate quantization for deployment and are subject to noise and perturbations due to experimental conditions.
Our method allows assessing the robustness of ML models to such effects as a function of...
Benchmarks are a cornerstone of modern machine learning practice, providing standardized evaluations that enable reproducibility, comparison, and scientific progress. Yet, as AI systems, particularly deep learning models, become increasingly dynamic, traditional static benchmarking approaches are losing their relevance. Models rapidly evolve in architecture, scale, and capability;...
We present NomAD (Nanosecond Anomaly Detection), a real-time anomaly detection algorithm designed for the ATLAS Level-1 Topological (L1Topo) trigger using unsupervised machine learning. The algorithm combines a Variational Autoencoder (VAE) with Boosted Decision Tree (BDT) regression to compress and distill deep learning inference into a firmware-compatible format for FPGAs. Trained on 2024...
The escalating demand for data processing in particle physics research has spurred the exploration of novel technologies to enhance the efficiency and speed of calculations. This study presents the development of an FPGA implementation of MADGRAPH, a widely used tool for particle collision simulations, using High-Level Synthesis. The work is a proof of concept limited to a single,...
As the era of the High-Luminosity Large Hadron Collider (HL-LHC) approaches, the GPU-accelerated High-Level Trigger (HLT) of the CMS experiment faces a stringent requirement to reduce the Level-1 readout stream from 100 kHz to 5 kHz, a twenty-fold decrease essential to adhere to archival bandwidth constraints [1], [2]. Meeting this demand necessitates highly efficient real-time...
Simulating relativistic orbital dynamics around Schwarzschild black holes is essential for understanding general relativity and astrophysical phenomena like precession. Traditional numerical solvers face difficulty while dealing with noisy or sparse data, necessitating data-driven approaches. We develop a Scientific Machine Learning (SciML) framework to model orbital trajectories and...
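For context, the dynamics being learned can be summarized by the standard radial energy equation for timelike Schwarzschild geodesics (textbook form, geometric units $G = c = 1$; included only to make the target dynamics explicit, not as this work's model):

$$
\left(\frac{dr}{d\tau}\right)^{2} = E^{2} - \left(1 - \frac{2M}{r}\right)\left(1 + \frac{L^{2}}{r^{2}}\right),
$$

where $E$ and $L$ are the conserved energy and angular momentum per unit mass; the $-2ML^{2}/r^{3}$ term in the effective potential is what produces the relativistic precession absent in Newtonian orbits.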
We investigate the application of state space models (SSMs) to a diverse set of scientific time series tasks. In particular, we benchmark the performance of SSMs against a set of baseline neural networks across three domains: magnet quench prediction, gravitational wave signal classification (LIGO), and neural phase estimation. Our analysis evaluates both computational efficiency—quantified by...
Pak choi (Brassica rapa subsp. chinensis) is a leafy green vegetable widely cultivated in vertical urban farming systems due to its rapid growth and high yield under compact, hydroponic setups. However, even in these controlled environments, crops remain susceptible to various diseases. Among the most common threats are fungal infections such as Alternaria leaf spot and powdery mildew, and...
As machine learning (ML) is increasingly implemented in hardware to address real-time challenges in scientific applications, the development of advanced toolchains has significantly reduced the time required to iterate on various designs. These advancements have solved major obstacles, but also exposed new challenges. For example, processes that were not previously considered bottlenecks,...
Graph Neural Networks (GNNs) have become promising candidates for particle reconstruction and identification in high-energy physics, but their computational complexity makes them challenging to deploy in real-time data processing pipelines. In the next-generation LHCb calorimeter, detector hits—characterized by energy, position, and timing—can be naturally encoded as node features, with...