11–15 Mar 2024
Charles B. Wang Center, Stony Brook University
US/Eastern timezone

Seamless transition from TTree to RNTuple analysis with RDataFrame

11 Mar 2024, 17:30
20m
Theatre (Charles B. Wang Center, Stony Brook University)

100 Circle Rd, Stony Brook, NY 11794
Oral
Track 1: Computing Technology for Physics Research

Speaker

Marta Czurylo (CERN)

Description

As the High-Luminosity LHC era approaches, work on the next-generation ROOT I/O subsystem, embodied by RNTuple, is advancing fast, with demonstrated implementations of the LHC experiments' data models and clear performance improvements over TTree. Part of the RNTuple development is to guarantee that the RDataFrame analysis flow stays unchanged despite the change in the underlying data format.
In this talk, we present the integration of RNTuple with RDataFrame. The engine can process RNTuple datasets on a local machine, either sequentially on one core or with implicit multithreading on multiple cores. Furthermore, RNTuple processing is introduced in the distributed RDataFrame layer and benchmarked on SWAN, a web-based platform, which is used to transparently offload analysis tasks to the CERN HTCondor pools. The new workflow is demonstrated by running existing RDataFrame analyses on one or multiple nodes with no change in the API. One notable example is the t-tbar Analysis Grand Challenge benchmark, which also serves as a blueprint to showcase differences in the performance of (distributed) execution with the two data formats.
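To illustrate what "no change in the API" could look like in practice, here is a minimal Python sketch. It is not taken from the talk. The dataset, file and column names (`Events`, `ttbar.root`, `nJet`, `Jet_pt`) and the helper names `build_analysis` and `open_dataset` are hypothetical. The sketch assumes a ROOT release that provides `ROOT.RDF.Experimental.FromRNTuple`; other releases may expose the RNTuple entry point differently, for instance directly through the `RDataFrame` constructor.

```python
# Sketch only: dataset, file and column names are hypothetical.

def build_analysis(df):
    """Book a selection and a histogram on an RDataFrame-like object.

    Nothing here depends on whether the data come from a TTree or an RNTuple.
    """
    selected = df.Filter("nJet >= 4", "at least four jets")
    with_ht = selected.Define("HT", "Sum(Jet_pt)")
    return with_ht.Histo1D(
        ("ht", "Scalar sum of jet pT;HT [GeV];Events", 50, 0.0, 1000.0), "HT"
    )


def open_dataset(name, path, from_rntuple=False, n_threads=None):
    """Return an RDataFrame over a TTree or an RNTuple.

    ROOT is imported lazily so that the analysis logic above can be read
    and exercised without a ROOT installation.
    """
    import ROOT

    if n_threads is not None:
        # Implicit multithreading; 0 lets ROOT pick the number of threads.
        ROOT.EnableImplicitMT(n_threads)
    if from_rntuple:
        # Assumption: the experimental factory function is available
        # in the installed ROOT release.
        return ROOT.RDF.Experimental.FromRNTuple(name, path)
    return ROOT.RDataFrame(name, path)
```

A possible call sequence would be `h = build_analysis(open_dataset("Events", "ttbar.root", from_rntuple=True, n_threads=0))`, followed by `h.GetValue()` to trigger the event loop. Only `open_dataset` knows about the data format; the analysis function is the same for both.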

Significance

LHC experiments are already testing and validating the next-generation ROOT I/O. ROOT will progressively phase out support for writing new datasets with TTree, so RNTuple will have a clear impact on future HEP computing workflows at many levels, from infrastructures to final analyses. This presentation demonstrates how the ROOT efforts aim to make the transition as effortless as possible for HEP users, while aligning with the experiments' expected computing challenges.

References

CHEP 2023 https://indico.jlab.org/event/459/contributions/11582/
ACAT 2022 https://indico.cern.ch/event/1106990/contributions/4998129/

Primary author

Co-authors

Andrii Falko
Danilo Piparo (CERN)
Enric Tejedor Saavedra (CERN)
Enrico Guiraud (Princeton University, CERN)
Jakob Blomer (CERN)
Philippe Canal (Fermi National Accelerator Lab. (US))
Dr Vincenzo Eduardo Padulano (CERN)
