17–21 Nov 2025
Europe/Madrid timezone

Enhancing CMS data analysis with Distributed RDF on a high-rate platform

19 Nov 2025, 10:16
1m

Speaker

Tommaso Diotalevi (Universita e INFN, Bologna (IT))

Description

A flexible and dynamic analysis environment, capable of efficiently accessing and processing distributed data and resources, is essential for High Energy Physics (HEP) in both current and future LHC operations. This contribution presents the development and evolution of a scalable analysis platform that combines open-source standards with the computing resources provided by the Italian National Center for “HPC, Big Data and Quantum Computing” (ICSC).
The platform's performance and scalability are assessed through a study of the CMS Drift Tubes (DT) muon detector performance in phase-space regions driven by analysis needs, using the declarative, quasi-interactive ROOT RDataFrame (RDF) framework and its distributed execution through Dask. Scaling and speed-up metrics are reported and discussed, highlighting the benefits of the new RDF-based approach over the legacy serial workflow.
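The abstract does not define the scaling and speed-up metrics it reports. As a minimal sketch, assuming the conventional definitions (speed-up as serial wall time divided by distributed wall time, parallel efficiency as speed-up divided by the number of workers), they could be computed as follows. The function name `scaling_metrics` and all timing values are hypothetical placeholders chosen for illustration; they are not measurements from this contribution.

```python
def scaling_metrics(serial_seconds, distributed_seconds):
    """Compute speed-up and parallel efficiency per worker count.

    serial_seconds: wall time of the serial reference workflow.
    distributed_seconds: dict mapping number of workers -> wall time.
    Returns a dict mapping number of workers -> (speedup, efficiency).
    """
    if serial_seconds <= 0:
        raise ValueError("serial_seconds must be positive")
    metrics = {}
    for n_workers, seconds in sorted(distributed_seconds.items()):
        if n_workers <= 0 or seconds <= 0:
            raise ValueError("worker counts and timings must be positive")
        speedup = serial_seconds / seconds   # S(n) = T_serial / T_n
        efficiency = speedup / n_workers     # E(n) = S(n) / n
        metrics[n_workers] = (speedup, efficiency)
    return metrics


# Placeholder timings in seconds (illustrative only, not measured values).
example = scaling_metrics(1200.0, {2: 600.0, 4: 400.0, 8: 300.0})
for n_workers, (speedup, efficiency) in example.items():
    print(f"{n_workers} workers: speed-up {speedup:.2f}, efficiency {efficiency:.2f}")
```

An efficiency close to 1 indicates near-ideal scaling; values that fall as workers are added typically point to fixed overheads such as scheduling or data access.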

Author

Tommaso Diotalevi (Universita e INFN, Bologna (IT))
