32nd ROOT Parallelism, Performance and Programming Model

Europe/Zurich
4/S-030 (CERN)

Danilo Piparo (CERN)

Present: Lorenzo, Xavi, Enrico, Enric, Stefan, Guilherme, Philippe, Danilo, Pere, Axel

 

Using TDF to load data for ML

- Proposal: myNumpyArray = tdf.AsMatrix(["a", "b", "c"])

This would make it possible to feed any ML tool. The discussion focused on the interface: TDF already allows this, but the result is not ideal from the programming-model point of view.

The benchmark shows that this route is worth pursuing, since performance gains of whole factors are available.
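To make the shape of the proposal concrete, here is a minimal sketch of what the result of such a call could look like. It does not use ROOT: `as_matrix` and `toy_columns` are hypothetical names introduced only for illustration, and the column data is assumed to be already in memory. The assumed layout is one row per event and one matrix column per requested dataset column; the actual AsMatrix semantics were still under discussion.

```python
import numpy as np


def as_matrix(columns, names):
    """Hypothetical stand-in for the proposed tdf.AsMatrix(names).

    `columns` maps a column name to a sequence of per-event values.
    Returns a 2D array with one row per event and one column per name.
    """
    missing = [n for n in names if n not in columns]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    # Stack the requested columns side by side, in the order given.
    return np.column_stack(
        [np.asarray(columns[n], dtype=np.float64) for n in names]
    )


# Toy dataset standing in for columns that TDF would read from a tree.
toy_columns = {
    "a": [1.0, 2.0, 3.0],
    "b": [10.0, 20.0, 30.0],
    "c": [0.1, 0.2, 0.3],
}

matrix = as_matrix(toy_columns, ["a", "b", "c"])
print(matrix.shape)  # (3, 3)
```

A 2D float array of this kind is the input format that most Python ML tools accept directly.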

- Pere: suggests adding an automatic conversion from vector<T> to numpy in PyROOT

- Axel: can we try to read directly into the numpy array from the ROOT C++ side?

- Philippe: can numpy join non-contiguous slabs of memory? Yes: https://docs.scipy.org/doc/numpy/reference/generated/numpy.ascontiguousarray.html

- Axel: can we be constructive and prevent users from filling up memory, by providing progressively refreshed numpy arrays? Not at the moment, but this is extremely interesting
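The last two points can be illustrated with plain numpy. This is a sketch under assumptions, not ROOT code: the slabs are toy arrays, and `iter_batches` is a hypothetical helper. Note that `np.ascontiguousarray` returns a packed copy of a strided array, while joining separately allocated buffers goes through `np.concatenate`, which also copies.

```python
import numpy as np

# Two separately allocated slabs, e.g. two chunks read independently.
slab_1 = np.arange(0, 6, dtype=np.float64).reshape(3, 2)
slab_2 = np.arange(6, 10, dtype=np.float64).reshape(2, 2)

# np.concatenate copies both slabs into one new contiguous buffer.
joined = np.concatenate([slab_1, slab_2])

# A column slice of a row-major array is a strided, non-contiguous view.
column_view = joined[:, 0]
# np.ascontiguousarray copies it into a packed buffer.
packed = np.ascontiguousarray(column_view)


def iter_batches(values, batch_size):
    """Hypothetical helper: yield fixed-size chunks one at a time,
    so that only the current batch has to be materialised as an array."""
    for start in range(0, len(values), batch_size):
        yield np.asarray(values[start:start + batch_size], dtype=np.float64)


# Collected into a list here only to inspect the result; a consumer that
# wants bounded memory would iterate over the generator instead.
batches = list(iter_batches(range(5), 2))
```

A generator of this kind is one possible reading of "progressively refreshed" arrays: the consumer sees one batch at a time instead of the full dataset.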

 

Thoughts on the evolution of declarative analysis in ROOT

Requirements:

- Make TDF work better in Python.

- Custom actions

Axel: could the names of the columns be passed to Book rather than to the helper? To be discussed; it is not a critical issue for the implementation.

Axel: slide 10: the lambda is a declarative lambda rather than a preprocessingLambda

Pere: this is an internal optimisation! Shouldn't we be able to absorb the define node into the histo node once we notice that only one consumer will use that quantity? True. We never looked into graph optimisation.

Philippe: Can TDF apply the same computation to different names? Yes

Axel: could there be something like a "static if" for the case where a branch does not exist? This would ease running the same analysis on MC and data. We acknowledge the point, but we cannot do it.
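Pere's point about absorbing a define node into its only consumer can be sketched on a toy graph. This is not how TDF is implemented: `DefineNode`, `HistoNode`, `fuse_single_consumer_defines` and `run` are hypothetical names, and the sketch assumes that defines depend only on input columns, never on other defines.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DefineNode:
    name: str                      # name of the new column
    expr: Callable[[dict], float]  # computes the column from one event


@dataclass
class HistoNode:
    column: str                                      # column to fill from
    getter: Optional[Callable[[dict], float]] = None  # set when a define is fused in


def fuse_single_consumer_defines(defines, histos):
    """Absorb each define used by exactly one histo into that histo.

    Returns the defines that still have to run as separate nodes.
    Assumption: defines never read columns created by other defines.
    """
    use_count = {}
    for h in histos:
        use_count[h.column] = use_count.get(h.column, 0) + 1
    remaining = []
    for d in defines:
        if use_count.get(d.name, 0) == 1:
            consumer = next(h for h in histos if h.column == d.name)
            consumer.getter = d.expr  # the histo now computes the value itself
        else:
            remaining.append(d)
    return remaining


def run(events, defines, histos):
    """Toy event loop: returns the list of values filled into each histo."""
    filled = [[] for _ in histos]
    for event in events:
        row = dict(event)
        for d in defines:
            row[d.name] = d.expr(row)
        for values, h in zip(filled, histos):
            values.append(h.getter(row) if h.getter else row[h.column])
    return filled


events = [{"x": 1.0, "y": 2.0}, {"x": 3.0, "y": 4.0}]
defines = [DefineNode("sum", lambda e: e["x"] + e["y"])]
histos = [HistoNode("sum")]

before = run(events, defines, histos)
remaining = fuse_single_consumer_defines(defines, histos)
after = run(events, remaining, histos)
```

After fusion the graph has one node fewer and the filled values are unchanged, which is the property any such graph optimisation would have to preserve.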

Open questions:

- Do we need a custom event loop?

- Do we need custom transformations?

 

 

 

    • 16:00 - 16:30
      Thoughts about evolution of declarative analysis (30m)
      Speaker: Enrico Guiraud (CERN, University of Oldenburg (DE))
    • 16:30 - 17:00
      Different approaches to data ingestion in TMVA: programming model and performance (30m)
      Speaker: Stefan Wunsch (KIT - Karlsruhe Institute of Technology (DE))