28th Conference on Computing in High Energy and Nuclear Physics (CHEP 2026)

Name: 28th Conference on Computing in High Energy and Nuclear Physics (CHEP 2026)
Start: 2026-05-25T08:00:00+07:00
End: 2026-05-29T14:00:00+07:00
Location: Chulalongkorn University

25–29 May 2026

Asia/Bangkok timezone

NEEDLE: A Columnar Workflow Orchestrator for Large-Scale Neural Simulation-Based Inference in HEP

28 May 2026, 14:57

18m

MHMK 208

Oral Presentation, Track 9 - Analysis software and workflows

Speaker: Kylian Schmidt (KIT - Karlsruhe Institute of Technology (DE))

Neural Simulation-Based Inference (NSBI) has emerged as a powerful statistical inference methodology for large datasets with high-dimensional representations. NSBI methods rely on neural networks to estimate the underlying multi-dimensional likelihood distributions of the data at the per-event level. This approach significantly improves inference performance over classical binned approaches by circumventing the need for summary statistics. In practice, however, NSBI tools remain computationally expensive, both because of the per-event statistical inference step and because many large neural networks must be trained to prevent biases. Implementations of NSBI in High Energy Physics (HEP) analyses therefore require reliable orchestration on heterogeneous resources, high-throughput data ingestion, and processing of variable-length event data.
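As an illustration of the per-event estimation described above, the following is a minimal sketch of the classifier-based likelihood ratio trick, a technique commonly used in NSBI. It is not taken from NEEDLE: the one-dimensional Gaussian toy data, the logistic model standing in for a neural network, and the names `train_classifier` and `log_ratio` are all assumptions made for this example. For balanced samples, a classifier output s(x) gives the ratio estimate r(x) = s(x) / (1 - s(x)).

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_classifier(x0, x1, lr=0.5, steps=300):
    """Fit s(x) = sigmoid(w * x + b) to separate x1 (label 1) from x0 (label 0).

    A logistic model stands in for the neural network (assumption for this toy).
    """
    data = [(x, 0.0) for x in x0] + [(x, 1.0) for x in x1]
    n = len(data)
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = sigmoid(w * x + b) - y  # gradient of the cross-entropy loss
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def log_ratio(x, w, b):
    """Per-event log likelihood ratio: log[s / (1 - s)] = w * x + b."""
    return w * x + b


# Toy hypotheses (assumption): H0 ~ N(0, 1) and H1 ~ N(1, 1),
# for which the exact log ratio is log p1(x) / p0(x) = x - 0.5.
rng = random.Random(0)
x0 = [rng.gauss(0.0, 1.0) for _ in range(2000)]
x1 = [rng.gauss(1.0, 1.0) for _ in range(2000)]
w, b = train_classifier(x0, x1)
```

In this toy setup the fitted `w` and `b` should land close to the exact values 1 and -0.5, so `log_ratio(x, w, b)` approximates the true per-event log ratio without any binning or summary statistic.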

The NEEDLE project aims to meet these demands by providing a flexible framework for distributed training on computing infrastructure alongside a toolbox of powerful NSBI methods. The framework reduces the operational cost associated with NSBI by implementing core features such as orchestration, data ingestion, and experiment tracking. First, orchestration is performed with a directed acyclic graph (DAG) workflow manager that covers the machine learning life cycle: training and evaluation, versioning, and experimentation. Second, models and datasets are tracked with PyTorch Lightning, allowing for flexible and reproducible experiments. Finally, common HEP storage formats such as ROOT and Parquet are read dynamically using Dask-based libraries for optimal memory management.
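The DAG-based orchestration mentioned above can be sketched with the Python standard library alone. This is a hedged illustration, not NEEDLE's workflow manager: the stage names `ingest`, `train`, and `evaluate`, the helper `run_dag`, and the toy event values are hypothetical, and the stage bodies are stand-ins for real ingestion and training.

```python
from graphlib import TopologicalSorter


def ingest(done):
    # Stand-in for reading variable-length event records from ROOT or Parquet.
    return [[0.1, 0.4], [0.7], [0.2, 0.9, 0.5]]


def train(done):
    # Stand-in for network training: "fit" the mean over all values.
    values = [v for event in done["ingest"] for v in event]
    return sum(values) / len(values)


def evaluate(done):
    # Score each event by its mean absolute distance to the fitted value.
    model = done["train"]
    return [
        sum(abs(v - model) for v in event) / len(event)
        for event in done["ingest"]
    ]


# Hypothetical stages and their upstream dependencies.
TASKS = {"ingest": ingest, "train": train, "evaluate": evaluate}
DEPENDENCIES = {
    "ingest": set(),
    "train": {"ingest"},
    "evaluate": {"train", "ingest"},
}


def run_dag(tasks, dependencies):
    """Run each task once, after all of its dependencies have finished."""
    order = list(TopologicalSorter(dependencies).static_order())
    done = {}
    for name in order:
        done[name] = tasks[name](done)
    return order, done


order, done = run_dag(TASKS, DEPENDENCIES)
```

Declaring dependencies rather than a fixed call sequence is what lets a workflow manager rerun only the stages affected by a change and schedule independent stages on different resources.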

In this contribution, we present the design principles, software architecture, and performance characteristics of the NEEDLE framework. The framework will be demonstrated with Neural Likelihood Ratio approaches on HEP open datasets.

Author: Kylian Schmidt (KIT - Karlsruhe Institute of Technology (DE))

Co-authors: Felix Kahlhoefer (Karlsruhe Institute of Technology); Judith Katzy (DESY, HAMBURG); Levi Evans (Deutsches Elektronen-Synchrotron (DE)); Nicolo Trevisani (KIT - Karlsruhe Institute of Technology (DE)); Nino Kovacic (University of Zagreb (HR)); Stephen Jiggins (Deutsches Elektronen-Synchrotron (DE)); Ulrich Husemann (Karlsruhe Institute of Technology (DE))

Presentation materials: NEEDLE_presentation_chep
