9-13 July 2018
Sofia, Bulgaria
Europe/Sofia timezone

RDataFrame: Easy Parallel ROOT Analysis at 100 Threads

10 Jul 2018, 11:45
Hall 9 (National Palace of Culture)


Presentation, Track 6 – Machine learning and physics analysis


Enrico Guiraud (CERN, University of Oldenburg (DE))


The Physics programmes of LHC Run III and HL-LHC challenge the HEP community. The volume of data to be handled is unprecedented at every step of the data processing chain: analysis is no exception.
Physicists need first-class analysis tools that are easy to use, exploit bleeding-edge hardware technologies, and allow parallelism to be expressed seamlessly.
This contribution discusses RDataFrame, the declarative analysis engine of ROOT, and details how it profitably exploits commodity hardware as well as high-end servers and manycore accelerators, thanks to its synergy with the existing parallelised ROOT components.
Real-life analyses of LHC experiments’ data expressed in terms of RDataFrame are presented, highlighting the programming model that lets them be written in a concise and powerful way. Recent developments that make RDataFrame a lightweight data processing framework, such as callbacks and I/O capabilities, are described.
The flexibility of RDataFrame and its ability to read data formats other than ROOT’s are characterised. As an example, we discuss how RDataFrame can directly read and analyse LHCb's raw data format, MDF.
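
The declarative model mentioned above can be illustrated with a small toy sketch. It is not RDataFrame's API and does not use ROOT. The class `ToyFrame`, its methods, and the event fields `pt` and `eta` are hypothetical names chosen for illustration, loosely modelled on the idea of booking filters, derived columns and results first, then filling every booked result in a single event loop whose chunks are processed by several workers.

```python
from concurrent.futures import ThreadPoolExecutor


def _apply(steps, row):
    """Run the booked define/filter steps on one row; None if filtered out."""
    row = dict(row)  # work on a copy so the input events stay untouched
    for kind, name, fn in steps:
        if kind == "define":
            row[name] = fn(row)
        elif not fn(row):
            return None
    return row


def _process_chunk(chunk, booked):
    """Partial results for one chunk of rows, one accumulator per booking."""
    partial = [0] * len(booked)
    for row in chunk:
        for i, (steps, extract) in enumerate(booked):
            out = _apply(steps, row)
            if out is not None:
                partial[i] += extract(out)
    return partial


def _run_event_loop(state):
    """Single pass over the data that fills all booked results at once."""
    rows, n = state["rows"], state["workers"]
    size = max(1, -(-len(rows) // n))  # ceiling division
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=n) as pool:
        partials = list(pool.map(lambda c: _process_chunk(c, state["booked"]), chunks))
    # Merge the per-chunk partial results.
    state["values"] = [sum(p[i] for p in partials) for i in range(len(state["booked"]))]
    state["loops"] += 1


class Result:
    """Lazy handle: the event loop runs only when a value is first requested."""

    def __init__(self, state, index):
        self._state, self._index = state, index

    def get_value(self):
        if self._state["values"] is None:
            _run_event_loop(self._state)
        return self._state["values"][self._index]


class ToyFrame:
    """Toy stand-in for a declarative dataframe (hypothetical, not ROOT)."""

    def __init__(self, rows, workers=4, _steps=(), _state=None):
        self._steps = _steps
        self._state = _state or {"rows": list(rows), "workers": workers,
                                 "booked": [], "values": None, "loops": 0}

    def _derive(self, step):
        return ToyFrame(None, _steps=self._steps + (step,), _state=self._state)

    def define(self, name, fn):
        """Book a derived column; nothing is computed yet."""
        return self._derive(("define", name, fn))

    def filter(self, pred):
        """Book a selection; nothing is computed yet."""
        return self._derive(("filter", None, pred))

    def _book(self, extract):
        self._state["booked"].append((self._steps, extract))
        self._state["values"] = None  # a new booking invalidates cached values
        return Result(self._state, len(self._state["booked"]) - 1)

    def count(self):
        return self._book(lambda row: 1)

    def sum(self, column):
        return self._book(lambda row: row[column])

    def loops_run(self):
        return self._state["loops"]


# Made-up events, for illustration only.
events = [{"pt": float(i), "eta": (i % 7 - 3) * 0.5} for i in range(100)]
df = ToyFrame(events, workers=4)
central = df.filter(lambda r: abs(r["eta"]) < 1.0).define("pt2", lambda r: r["pt"] ** 2)

# Book three results; no event loop has run at this point.
n_central = central.count()
sum_pt2 = central.sum("pt2")
n_all = df.count()

# The first access triggers one loop that fills all three results.
print(n_central.get_value(), sum_pt2.get_value(), n_all.get_value())
print("event loops run:", df.loops_run())
```

In this sketch the threads only demonstrate the chunk-and-merge pattern; CPython threads will not speed up pure-Python work. The point is that the analysis is declared once and the choice of how to run the loop is left to the engine.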

Primary authors

Danilo Piparo (CERN)
Philippe Canal (Fermi National Accelerator Lab. (US))
Enrico Guiraud (CERN, University of Oldenburg (DE))
Xavier Valls Pla (University Jaume I (ES))
Gerardo Ganis (CERN)
Guilherme Amadio (CERN)
Axel Naumann (CERN)
Enric Tejedor Saavedra (CERN)
