Stephan Hageboeck (CERN) | 25/05/2026, 13:45 | Track 9 - Analysis software and workflows | Oral Presentation
ROOT's RDataFrame is a declarative analysis interface for defining modern analysis workflows in C++ or Python, which are executed efficiently either locally using TBB or in a distributed manner using Dask or Spark. Its seamless integration with TTree and RNTuple makes it an ideal tool for performant and space-efficient data analysis in HEP. This contribution will highlight recent and upcoming...
Go to contribution page
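The declarative, lazily evaluated model described in this abstract can be illustrated with a small toy in plain Python: filters and defined columns are recorded first, and the event loop runs only once, when a result is requested. This sketch only mimics the programming model; the class and method names are invented and are not ROOT's actual RDataFrame API.

```python
# Toy sketch of a declarative, lazily evaluated analysis chain,
# illustrating the programming model of interfaces like RDataFrame.
# NOT ROOT's API; all names here are invented for illustration.

class ToyFrame:
    def __init__(self, rows):
        self.rows = rows          # list of dicts, one per "event"
        self.ops = []             # recorded (lazy) transformations

    def Filter(self, pred):
        new = ToyFrame(self.rows)
        new.ops = self.ops + [("filter", pred)]
        return new

    def Define(self, name, func):
        new = ToyFrame(self.rows)
        new.ops = self.ops + [("define", name, func)]
        return new

    def Count(self):
        # Single traversal: all recorded operations run per event.
        n = 0
        for row in self.rows:
            row = dict(row)
            keep = True
            for op in self.ops:
                if op[0] == "filter":
                    if not op[1](row):
                        keep = False
                        break
                else:
                    row[op[1]] = op[2](row)
            if keep:
                n += 1
        return n

events = [{"pt": p} for p in (10, 25, 40, 55)]
df = ToyFrame(events).Define("pt2", lambda r: r["pt"] ** 2)
n = df.Filter(lambda r: r["pt2"] > 600).Count()
print(n)  # -> 3 (events with pt above ~24.5)
```

Nothing is computed until `Count()` is called, which is what allows a real implementation to fuse all operations into one multi-threaded or distributed event loop.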
Iason Krommydas (Rice University (US)) | 25/05/2026, 14:03 | Track 9 - Analysis software and workflows | Oral Presentation
The Coffea (Columnar Object Framework for Effective Analysis) framework continues to evolve as a cornerstone tool for high-energy physics data analysis, providing physicists with efficient, scalable solutions for processing complex event data. This talk presents the current status of Coffea, highlighting...
Go to contribution page
Cedric Verstege (KIT - Karlsruhe Institute of Technology (DE)) | 25/05/2026, 14:21 | Track 9 - Analysis software and workflows | Oral Presentation
Efficient and reproducible analysis workflows are vital for large-scale Monte Carlo (MC) event studies in high-energy physics (HEP). We present MC-Run, a lightweight and scalable open-source tool designed to orchestrate complete MC production and analysis chains, from event generation to Rivet analyses and subsequent post-processing such as combination procedures and plotting. The framework is...
Go to contribution page
Paolo Mastrandrea (Universita & INFN Pisa (IT)) | 25/05/2026, 14:39 | Track 9 - Analysis software and workflows | Oral Presentation
The software toolbox used for big-data analysis has been changing rapidly in recent years. Adopting software design approaches that exploit new hardware architectures and increase code expressiveness plays a pivotal role in boosting both the development and the performance of sustainable data analysis.
The scientific collaborations in the field of High Energy Physics (e.g. the LHC...
Go to contribution page
Torri Jeske | 25/05/2026, 14:57 | Track 9 - Analysis software and workflows | Oral Presentation
Machine learning (ML) has proven to be incredibly useful in science and engineering; however, a significant overhead remains for the deployment and maintenance of ML models in real-time operation. This is due to the many different custom interfaces each complex facility may have, the conversions required between non-standard data formats, and the ML infrastructure required for continuous adaptation...
Go to contribution page
Lukas Breitwieser (CERN) | 25/05/2026, 16:15 | Track 9 - Analysis software and workflows | Oral Presentation
The hardware landscape in today's data centers is rapidly evolving, with access to GPUs becoming the standard rather than the exception. Currently, physics data analysis using RDataFrame is still limited to execution on multi-core CPUs and distributed systems.
To reduce the time to results and enhance energy efficiency, we are investigating the feasibility of accelerating physics analysis...
Go to contribution page
Ianna Osborne (Princeton University) | 25/05/2026, 16:33 | Track 9 - Analysis software and workflows | Oral Presentation
The computational demands of the High-Luminosity LHC (HL-LHC) necessitate a transition toward heterogeneous computing environments. While the Scikit-HEP ecosystem has historically leveraged NVIDIA GPUs through CUDA, the increasing deployment of AMD-based supercomputers requires a vendor-neutral approach to performance portability.
This contribution details the design and implementation of...
Go to contribution page
Ianna Osborne (Princeton University) | 25/05/2026, 16:51 | Track 9 - Analysis software and workflows | Oral Presentation
The upcoming high-luminosity era at the LHC (HL-LHC) aims to produce exabyte-scale datasets that will significantly increase opportunities for new physics discoveries at the energy frontier. At the same time, future analyses will be increasingly computationally demanding. Larger datasets, increased analysis complexity, and the widespread adoption of machine learning techniques in HEP will...
Go to contribution page
Cameron Harris | 25/05/2026, 17:09 | Track 9 - Analysis software and workflows | Oral Presentation
The FCCee b2Luigi Automated Reconstruction And Event processing (FLARE) package is an open-source, Python-based data workflow orchestration tool powered by b2luigi. FLARE automates the workflow of Monte Carlo (MC) generators inside the Key4HEP stack, such as Whizard, MadGraph5_aMC@NLO, Pythia8, and Delphes. FLARE also automates the Future Circular Collider (FCC) Physics Analysis software...
Go to contribution page
Alexander Heidelbach | 25/05/2026, 17:27 | Track 9 - Analysis software and workflows | Oral Presentation
Workflow Management Systems (WMSs) provide essential infrastructure for organizing arbitrary sequences of tasks in a transparent, maintainable, and reproducible manner. The widely used Python-based WMS luigi enables the construction of complex workflows, offering built-in task dependency resolution, basic workflow visualization, and convenient command-line integration.
b2luigi is an extension...
Go to contribution page
Maximilian Horzela (Georg August Universitaet Goettingen (DE)) | 25/05/2026, 17:45 | Track 9 - Analysis software and workflows | Oral Presentation
Modern high-energy physics (HEP) analyses rely on complex, multi-stage workflows combining heterogeneous software and distributed data. While individual analysis tools are well developed, their orchestration is typically ad hoc, leading to duplicated effort, inconsistent configurations, and limited reproducibility. Existing workflow systems based on static dependency graphs struggle to capture...
Go to contribution page
Nicolas Poffley (CERN) | 26/05/2026, 13:45 | Track 9 - Analysis software and workflows | Oral Presentation
Commissioned in 2022, the organised analysis system Hyperloop has been the primary platform for analysis within ALICE. The system was developed to meet the demands of the upgraded ALICE detector for Run 3, where the data-taking rate capability was increased by two orders of magnitude. To support analysis on such large datasets, the ALICE distributed computing infrastructure was revised and...
Go to contribution page
Dr Rahul Tiwary (Toshiko Yuasa Laboratory (TYL), KEK) | 26/05/2026, 14:03 | Track 9 - Analysis software and workflows | Oral Presentation
The Full Event Interpretation (FEI) algorithm is a central component of the Belle II analysis framework, designed for the efficient and flexible reconstruction of exclusive B-meson decays. It performs a hierarchical reconstruction of hadronic and semileptonic final states, using multivariate classification techniques to tag one of the two B mesons produced in electron–positron collisions. The...
Go to contribution page
Alaettin Serhan Mete (Argonne National Laboratory (US)) | 26/05/2026, 14:21 | Track 9 - Analysis software and workflows | Oral Presentation
ATLAS has developed a ROOT RNTuple prototype within its Athena software, enabling read/write support for event data and in-file metadata. Using this implementation, ATLAS converted the publicly available Open Data, comprising multiple tens of terabytes of 2015–2016 proton–proton collision data and associated Monte Carlo samples, from ROOT TTree to RNTuple in the official DAOD PHYSLITE format. The...
Go to contribution page
Khawla Jaffel (National Institute of Chemical Physics and Biophysics (EE)) | 26/05/2026, 14:39 | Track 9 - Analysis software and workflows | Oral Presentation
One of the main challenges currently facing high energy particle physicists analyzing data from the Large Hadron Collider (LHC) at CERN is the unprecedented volume of both real data and simulated data that must be processed. This challenge is expected to intensify as the LHC enters its high luminosity phase, during which it is projected to deliver up to ten times more data than before. At the...
Go to contribution page
Juraj Smiesko (CERN) | 26/05/2026, 14:57 | Track 9 - Analysis software and workflows | Oral Presentation
The Future Circular Collider (FCC) project requires an analysis infrastructure capable of handling large simulated datasets while providing the flexibility needed for rapid detector optimization. We present FCCAnalyses, the flagship analysis framework for the FCC collaboration. Integrated within the Key4hep software stack, FCCAnalyses leverages ROOT's RDataFrame to provide a declarative,...
Go to contribution page
Silia Taider (CERN) | 26/05/2026, 16:15 | Track 9 - Analysis software and workflows | Oral Presentation
Machine learning (ML) techniques are increasingly adopted in the High Energy Physics (HEP) field, from large-scale production workflows to end-user data analysis. As such, we see datasets growing in size and complexity, making data loading a significant performance bottleneck, particularly when training workloads access large, distributed datasets with sparse ML reading patterns.
In HEP,...
Go to contribution page
Borja Sevilla Sanjuan (La Salle, Ramon Llull University (ES)) | 26/05/2026, 16:33 | Track 9 - Analysis software and workflows | Oral Presentation
Flavour tagging (FT) is essential in heavy-flavour physics for determining the production flavour of neutral B mesons in time-dependent CP-violation and mixing parameter measurements, where it significantly impacts the sensitivity. For Run 3 of the LHC, the LHCb experiment has redesigned its FT strategy, exploiting recent advances in algorithm methodology and machine learning, including modern...
Go to contribution page
Ting-Hsiang Hsu (National Taiwan University (TW)) | 26/05/2026, 16:51 | Track 9 - Analysis software and workflows | Oral Presentation
Precision studies of $\tau^+\tau^-$ production in $e^+e^-$ collisions at LEP provide a clean environment for investigating spin correlations and quantum information observables. In the DELPHI experiment, the process $e^+e^- \to Z \to \tau^+\tau^-$ is well measured, but reconstruction of the $\tau^+\tau^-$ rest frame is challenged by the presence of multiple neutrinos in the final state. This...
Go to contribution page
Jingde Chen (Institute of High Energy Physics) | 26/05/2026, 17:09 | Track 9 - Analysis software and workflows | Oral Presentation
While Foundation Models have revolutionized natural language processing and computer vision, their potential in high-energy physics remains underutilized. In this work, we introduce Bes3T, a Transformer-based Foundation Model tailored for BESIII data analysis, and present a publicly released benchmark Monte Carlo dataset comprising 100 distinct $\mathrm{J}/\psi$ decay channels. Bes3T employs a...
Go to contribution page
Siyang Wu (Shandong University) | 26/05/2026, 17:27 | Track 9 - Analysis software and workflows | Oral Presentation
Quantum Machine Learning (QML) is an advanced data-analysis technique that can detect structures in data and build models for prediction, classification, or simulation with less human intervention. However, for the data analysis of high-energy physics (HEP) experiments, the practical viability of QML remains a topic of debate, requiring more examples of real data analysis with...
Go to contribution page
Cilicia Uzziel Perez (La Salle, Ramon Llull University (ES)) | 26/05/2026, 17:45 | Track 9 - Analysis software and workflows | Oral Presentation
We present a prototype Retrieval-Augmented Generation (RAG) and agentic LLM tool designed to accelerate and support high-energy physics analyses. As a case study, we applied the system to the published 2016 Λb → Λγ Run 2 analysis. Reproducing legacy workflows is often slow and error-prone due to fragmented code, dispersed documentation, personnel turnover, and software evolution over multiple...
Go to contribution page
Mohamed Aly (Princeton University (US)), Oksana Shadura (University of Nebraska Lincoln (US)) | 27/05/2026, 13:45 | Track 9 - Analysis software and workflows | Oral Presentation
The upcoming High-Luminosity Large Hadron Collider (HL-LHC) at CERN will deliver an unprecedented volume of data for High Energy Physics (HEP). This wealth of information offers significant opportunities for scientific discovery, but its scale challenges traditional analysis workflows. In this talk, we present CMS analysis pipelines being developed to meet HL-LHC demands. These pipelines build...
Go to contribution page
Alexander Held (University of Wisconsin Madison (US)), Artur Cordeiro Oudot Choi (University of Washington (US)) | 27/05/2026, 14:03 | Track 9 - Analysis software and workflows | Oral Presentation
The last few years have seen a wide range of developments towards scalable solutions for end-user physics analysis to meet the upcoming HL-LHC computing challenges. The IRIS-HEP software institute has created projects in a "Challenge" format to checkpoint the progress. The "Analysis Grand Challenge" probes analysis workflows and interfaces with a limited dataset size, while the "200 Gbps...
Go to contribution page
Artur Cordeiro Oudot Choi (University of Washington (US)) | 27/05/2026, 14:21 | Track 9 - Analysis software and workflows | Oral Presentation
As the HL-LHC prepares to produce increasingly large volumes of data, the need for efficient data extraction and access services is growing. To address this challenge, the ServiceX toolset was developed to connect user-level analysis workflows to remotely stored datasets. ServiceX functions as a query-based sample delivery system, where client requests trigger Kubernetes-distributed workloads...
Go to contribution page
Dr Yu Hu (IHEP, CAS) | 27/05/2026, 14:39 | Track 9 - Analysis software and workflows | Oral Presentation
The High Energy Photon Source (HEPS) is a fourth-generation, high-energy synchrotron radiation facility scheduled to enter its early operational and commissioning phases by the end of 2025. With its significantly enhanced photon brightness and detector performance, HEPS is expected to generate over 200 petabytes (PB) of experimental data annually across 14 beamlines in Phase I, with data...
Go to contribution page
Dr Maxim Gonchar (Joint Institute for Nuclear Research) | 27/05/2026, 14:57 | Track 9 - Analysis software and workflows | Oral Presentation
The Daya Bay Reactor Neutrino experiment has released its full dataset of neutrino interactions with the final-state neutron captured on gadolinium, collected during 9 years of operation. The dataset was complemented by a model of the experiment in Python and a few analysis examples, reproducing the final measurement of neutrino oscillation parameters...
Go to contribution page
Jonas Hahnfeld (CERN & Goethe University Frankfurt) | 27/05/2026, 16:15 | Track 9 - Analysis software and workflows | Oral Presentation
Many HEP analyses rely on histograms for the statistical interpretation of experimental data, using them not only for visualization but also as data structures that can be computed with. ROOT's histogram package was developed in the 1990s and has been widely used during the past 30 years. Despite its success, the design is starting to show limitations for modern analyses, and the classes lack some...
Go to contribution page
Manfred Peter Fackeldey (Princeton University (US)) | 27/05/2026, 16:33 | Track 9 - Analysis software and workflows | Oral Presentation
The community's adoption of Hist and boost-histogram, both part of the Scikit-HEP software stack, leads to increasingly frequent work with dense, high-dimensional histograms. These histograms become a memory bottleneck in modern large-scale high-energy physics (HEP) analyses because the Cartesian product of all axes makes them exceedingly large.
To solve this problem, we propose...
Go to contribution page
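The memory problem described here can be seen with a small back-of-the-envelope sketch in plain Python (an illustration of the general dense-vs-sparse trade-off, not the talk's actual implementation): a dense layout allocates the full Cartesian product of bins, while a sparse mapping stores only the bins that are actually filled.

```python
# Dense vs. sparse histogram storage for high-dimensional binning.
# Illustration only; axis sizes and fill counts are invented.
from collections import Counter
import random

random.seed(0)
axes = (50, 50, 20, 10)            # four axes -> 500,000 dense bins
n_dense = 1
for n in axes:
    n_dense *= n

# Fill with far fewer events than bins: most bins stay empty.
sparse = Counter()
for _ in range(10_000):
    idx = tuple(random.randrange(n) for n in axes)
    sparse[idx] += 1

print(n_dense)      # 500000 bins allocated in a dense layout
print(len(sparse))  # at most 10000 occupied bins in a sparse layout
```

With more axes the dense bin count grows multiplicatively while the number of occupied bins is bounded by the number of fills, which is why sparsity pays off for high-dimensional histograms.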
Felix Philipp Zinn (Rheinisch Westfaelische Tech. Hoch. (DE)) | 27/05/2026, 16:51 | Track 9 - Analysis software and workflows | Oral Presentation
In high energy physics (HEP), the measurement of physical quantities often involves intricate data analysis workflows that include the application of kinematic cuts, event categorization, machine learning techniques, and data binning, followed by the setup of a statistical model. Each step in this process requires careful selection of parameters to optimize the outcome for statistical...
Go to contribution page
Andrzej Novak (Massachusetts Inst. of Technology (US)) | 27/05/2026, 17:09 | Track 9 - Analysis software and workflows | Oral Presentation
Weakly-supervised methods in the CWoLa (Classification Without Labels) family enable anomaly searches without truth labels by training classifiers on proxy objectives in data. However, these approaches require high-purity control regions, which place assumptions on the signal and are difficult to obtain in practice. In addition, many include a number of disjoint steps, making it difficult to...
Go to contribution page
Lino Oscar Gerlach (Princeton University (US)), Mohamed Aly (Princeton University (US)) | 27/05/2026, 17:27 | Track 9 - Analysis software and workflows | Oral Presentation
We present GRAEP (Gradient-based End-to-End Physics Analysis), a JAX-based framework for building modular, end-to-end differentiable analysis pipelines in high-energy physics. The framework integrates tooling from the Scikit-HEP ecosystem and enables gradient-based optimisation across HEP analysis workflows. We demonstrate an end-to-end differentiable analysis applied to CMS Open Data,...
Go to contribution page
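The core idea behind gradient-based pipeline optimisation can be sketched in a few lines of plain Python: replace a hard selection cut with a smooth, differentiable surrogate, then move the cut along the gradient of a significance-like objective. A numerical gradient stands in here for JAX autodiff, and the discriminant values, slope, and objective are all invented for illustration; this is not GRAEP's actual code.

```python
# Sketch of gradient-based optimisation of an analysis selection:
# a sigmoid "soft cut" makes the event yields differentiable in the
# cut value, so the cut can be tuned by gradient ascent on a smooth
# significance proxy. All numbers are toy values.
import math

signal = [2.0, 2.5, 3.0, 3.5]       # toy discriminant values (signal)
background = [0.5, 1.0, 1.5, 2.0]   # toy discriminant values (background)

def soft_yield(samples, cut, slope=10.0):
    # Differentiable "soft cut": sigmoid instead of a hard step.
    return sum(1.0 / (1.0 + math.exp(-slope * (x - cut))) for x in samples)

def objective(cut):
    s = soft_yield(signal, cut)
    b = soft_yield(background, cut)
    return s / math.sqrt(s + b + 1e-9)   # smooth s/sqrt(s+b) proxy

cut, lr, eps = 1.0, 0.05, 1e-4
for _ in range(200):                      # gradient ascent on the cut
    grad = (objective(cut + eps) - objective(cut - eps)) / (2 * eps)
    cut += lr * grad
print(round(cut, 2))  # the cut settles between background and signal
```

A real differentiable pipeline applies the same trick end to end, with automatic differentiation propagating gradients through histogramming and the statistical model rather than through a single cut.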
Tom Runting (Imperial College (GB)) | 27/05/2026, 17:45 | Track 9 - Analysis software and workflows | Oral Presentation
The Combine tool [1] is a statistical analysis software package developed by the CMS Collaboration for performing measurements and searches in high-energy physics. Originally created for Higgs boson searches and their statistical combination, it has evolved into a comprehensive framework used in the majority of CMS analyses. Built on ROOT and RooFit [2], Combine provides a command-line...
Go to contribution page
Eddie Mcgrady (University of Notre Dame (US)) | 28/05/2026, 13:45 | Track 9 - Analysis software and workflows | Oral Presentation
Neural Simulation-Based Inference (NSBI) is an analysis technique which leverages the output of trained deep neural networks (DNNs) to construct a surrogate likelihood ratio which can then be used for a binned or unbinned likelihood scan. These techniques have shown some success when applied to analyses involving effective field theory (EFT) approaches, where it can be difficult to achieve...
Go to contribution page
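The construction of a surrogate likelihood ratio from a classifier output, the standard trick underlying NSBI, can be checked in a toy where the true densities are known: an optimal classifier for balanced classes outputs s(x) = p1(x) / (p0(x) + p1(x)), so s(x) / (1 - s(x)) recovers the per-event likelihood ratio p1(x)/p0(x). Here the analytically optimal classifier for two unit Gaussians stands in for a trained network; everything is illustrative.

```python
# Toy check of the likelihood-ratio trick behind NSBI: recover the
# per-event density ratio from a (here analytically optimal)
# classifier output via r(x) = s(x) / (1 - s(x)).
import math

def gauss(x, mu):
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

def p0(x): return gauss(x, 0.0)   # "null" hypothesis density
def p1(x): return gauss(x, 1.0)   # "alternative" hypothesis density

def classifier(x):
    # Optimal classifier for balanced classes (a trained DNN
    # approximates this function in practice).
    return p1(x) / (p0(x) + p1(x))

def ratio_from_classifier(x):
    s = classifier(x)
    return s / (1.0 - s)

# The recovered ratio matches the true density ratio at any x.
for x in (-1.0, 0.0, 0.5, 2.0):
    assert abs(p1(x) / p0(x) - ratio_from_classifier(x)) < 1e-9
print("likelihood ratio recovered")
```

In a real NSBI analysis the classifier is a DNN trained on simulation, and the resulting per-event ratios feed a binned or unbinned likelihood scan as the abstract describes.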
Jay Ajitbhai Sandesara (University of Wisconsin Madison (US)) | 28/05/2026, 14:03 | Track 9 - Analysis software and workflows | Oral Presentation
Neural Simulation-Based Inference (NSBI) is a family of emerging techniques that allow statistical inference using high-dimensional data, even when the exact likelihoods are analytically intractable. The techniques rely on leveraging deep learning to directly build likelihood-based or posterior-based inference models using high-dimensional information. By not relying on hand-crafted,...
Go to contribution page
Ethan Lee | 28/05/2026, 14:21 | Track 9 - Analysis software and workflows | Oral Presentation
Recent anomalies in flavour observables have motivated renewed interest in precision measurements of semileptonic $B$-meson decays as a probe of possible physics beyond the Standard Model. Extracting such effects often requires fitting complex, high-dimensional datasets in which traditional likelihood-based methods become computationally challenging or intractable. Simulation-based inference...
Go to contribution page
Jonas Rembser (CERN) | 28/05/2026, 14:39 | Track 9 - Analysis software and workflows | Oral Presentation
Neural Simulation-Based Inference (NSBI) enables efficient use of complex generative models in statistical analyses, outperforming template histogram methods in particular for high-dimensional problems. When augmented with gradient information, NSBI can both maximise sensitivity to new physics and reduce the required amount of simulation.
The integration of NSBI into established...
Go to contribution page
Kylian Schmidt (KIT - Karlsruhe Institute of Technology (DE)) | 28/05/2026, 14:57 | Track 9 - Analysis software and workflows | Oral Presentation
Neural Simulation Based Inference (NSBI) has emerged as a powerful statistical inference methodology for large datasets with high-dimensional representations. NSBI methods rely on neural networks to estimate the underlying, multi-dimensional likelihood distributions of the data at a per-event level. This approach significantly improves the inference performance over classical binned approaches...
Go to contribution page
Stephan Hageboeck (CERN) | 28/05/2026, 16:15 | Track 9 - Analysis software and workflows | Oral Presentation
Two years before the start of the High-Luminosity LHC, the ROOT project will evolve to its 7th release cycle. This contribution will explain ROOT's release schedule, and discuss new features being developed for ROOT 7 such as RFile or a high-performance histogram package to support concurrent filling. ROOT 7 is also planned to introduce a change in ROOT's object ownership model, allowing for...
Go to contribution page
Aaron Jomy (CERN) | 28/05/2026, 16:33 | Track 9 - Analysis software and workflows | Oral Presentation
The ROOT Python interfaces are a cornerstone of HENP analysis workflows, enabling rapid development while retaining access to high-performance C++ code. In this contribution, we present a major upcoming update to the backend powering the dynamic C++ bindings generation, based on the new CppInterOp library.
For ROOT users, this migration translates directly into a better experience: faster...
Go to contribution page
Yue Sun (The Institute of High Energy Physics of the Chinese Academy of Science) | 28/05/2026, 16:51 | Track 9 - Analysis software and workflows | Oral Presentation
In High Energy Physics (HEP), high-quality and efficient code is essential for data processing and analysis. However, Large Language Models (LLMs), while proficient in general programming, exhibit significant inaccuracies when generating specialized HEP code, reflected in a high failure rate. At the same time, a more complex offline software system will be necessary to adapt to...
Go to contribution page
Gordon Watts (University of Washington (US)) | 28/05/2026, 17:09 | Track 9 - Analysis software and workflows | Oral Presentation
Large Language Models (LLMs) can serve as connective elements within ATLAS analysis workflows, linking data-discovery utilities, columnar data-delivery systems, and analysis-level plotting frameworks. Building on earlier exploratory studies of LLM-generated plotting code, we now focus on an implementable architecture suitable for real use. The system is decomposed into reusable Model Context...
Go to contribution page
Jonas Wurzinger (Technische Universitat Munchen (DE)) | 28/05/2026, 17:27 | Track 9 - Analysis software and workflows | Oral Presentation
Despite decades of searching for the true nature of dark matter, no compelling evidence of its particle nature has been found. Without this evidence, the targets of searches for new physics must be carefully re-evaluated in terms of their theoretical completeness and experimental relevance. Exploring high-dimensional parameter spaces, such as the 19-dimensional phenomenological Minimal...
Go to contribution page
Ragansu Chakkappai (IJCLab-Orsay) | 28/05/2026, 17:45 | Track 9 - Analysis software and workflows | Oral Presentation
In collider-based particle physics experiments, independent events are commonly represented as tabular datasets of high-level variables, an approach widely used in multivariate and machine learning analyses. Inspired by the success of foundation models in language and vision, recent developments have introduced tabular foundation models such as TabNet (Google), TabTransformer (Amazon), TABERT...
Go to contribution page