79th ROOT Parallelism, Performance and Programming Model Meeting

Europe/Zurich
Online

Enric Tejedor Saavedra (CERN), Stephan Hageboeck (CERN)
    • 16:00 - 17:00
      Optimized workflow for analyses with multiple RDataFrames (1h)
      Speaker: Stefan Wunsch (KIT - Karlsruhe Institute of Technology (DE))

      Conclusion from the discussion

      1. We want RDF to be thread-safe when multiple instances run in parallel. This alone makes the performance gains shown in the presentation available to expert users, who can use, for example, TThreadExecutor to build a similar setup.

      2. In the long run, we need categorization/tree-based transformations in RDF. The ultimate goal is that the user has a single RDF containing all the computations to be done, and we take care of distributing the work as efficiently as possible. This covers RDF with local parallelization (multithreading) as well as PyRDF with Spark as scheduler.
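
The two conclusions above can be contrasted with a small, library-free sketch. It is not ROOT code: the event list, the column names (`pt`, `njets`), the category names, and the helper functions are all hypothetical, and Python's `ThreadPoolExecutor` only stands in for the role TThreadExecutor would play. Approach 1 runs one independent event loop per category concurrently; approach 2 runs a single event loop that fills every category. A flat set of categories is used here for brevity; a tree would nest the selections.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy "events" with hypothetical columns (not taken from the presentation).
events = [{"pt": float(i % 50), "njets": i % 4} for i in range(1000)]

# Hypothetical categories: name -> selection function.
categories = {
    "low_pt": lambda e: e["pt"] < 20,
    "high_pt": lambda e: e["pt"] >= 20,
    "multijet": lambda e: e["njets"] >= 2,
}


def run_one(selection):
    """One independent event loop, standing in for one dataframe instance."""
    return sum(1 for e in events if selection(e))


def run_separately():
    """Approach 1: one event loop per category, submitted to a thread pool."""
    with ThreadPoolExecutor() as pool:
        counts = list(pool.map(run_one, categories.values()))
    return dict(zip(categories, counts))


def run_single_pass():
    """Approach 2: a single event loop that fills all categories at once."""
    counts = {name: 0 for name in categories}
    for e in events:
        for name, selection in categories.items():
            if selection(e):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    print(run_separately())
    print(run_single_pass())
```

Both functions return the same counts; they differ in how often the data is traversed and in who decides how the work is scheduled. In standard CPython the thread pool shows only the structure, since pure-Python loops do not speed up across threads.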

      Enrico and Stefan will propose a programming model for these new features and report back.