32nd ROOT Parallelism, Performance and Programming Model

Name: 32nd ROOT Parallelism, Performance and Programming Model
Start: 2018-02-22T16:00:00+01:00
End: 2018-02-22T17:45:00+01:00
Location: CERN

Thursday 22 Feb 2018, 16:00 → 17:45 Europe/Zurich

4/S-030 (CERN)

Danilo Piparo (CERN)

Present: Lorenzo, Xavi, Enrico, Enric, Stefan, Guilherme, Philippe, Danilo, Pere, Axel

Using TDF to load data for ML

- Proposal: myNumpyArray = tdf.AsMatrix(["a", "b", "c"])

This would make it possible to feed any ML tool. We discussed the interface: TDF already allows this, but the result is not ideal from the point of view of the programming model.

The benchmark demonstrates that this route is worth pursuing, since performance gains of whole factors are available.
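As an illustration of what the proposed call would hand to an ML tool, here is a minimal sketch that stacks named columns into a single numpy matrix with one row per event. It does not use ROOT: the `as_matrix` helper and the plain dictionary standing in for the data frame are hypothetical, and the values are made up.

```python
import numpy as np


def as_matrix(columns, names):
    """Stack the named 1D columns side by side: one row per event, one column per name."""
    # np.column_stack copies the inputs into a new 2D array of shape (n_events, len(names)).
    return np.column_stack(
        [np.asarray(columns[name], dtype=np.float64) for name in names]
    )


# Toy dataset standing in for the columns of a data frame (made-up values).
columns = {
    "a": [1.0, 2.0, 3.0],
    "b": [10.0, 20.0, 30.0],
    "c": [100.0, 200.0, 300.0],
}

matrix = as_matrix(columns, ["a", "b", "c"])
```

The resulting array has shape (number of events, number of requested columns), which is the layout most ML libraries expect for a feature matrix.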

- Pere: suggests adding an automatic conversion from vector<T> to numpy in PyROOT

- Axel: can we try to read directly into the numpy array from the ROOT C++ side?

- Philippe: can numpy join non-contiguous slabs of memory? Yes: https://docs.scipy.org/doc/numpy/reference/generated/numpy.ascontiguousarray.html

- Axel: can we be constructive and prevent users from filling up the memory, by having progressively refreshed numpy arrays? Not at the moment, but this is extremely interesting
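Regarding the question about non-contiguous memory, the following sketch shows the behaviour of `numpy.ascontiguousarray` on a strided view, and how separate slabs can be joined with `numpy.concatenate`. Both calls copy the data. The array contents are made up, and reading each slab from a ROOT file is only assumed here, not shown.

```python
import numpy as np

# A 2D array stored row by row; selecting one column gives a strided (non-contiguous) view.
events = np.arange(12, dtype=np.float64).reshape(4, 3)
column_view = events[:, 1]
view_is_contiguous = column_view.flags["C_CONTIGUOUS"]

# ascontiguousarray copies the strided view into one contiguous block of memory.
column_copy = np.ascontiguousarray(column_view)
copy_is_contiguous = column_copy.flags["C_CONTIGUOUS"]

# Separate slabs (for example one per chunk read) can be joined, again with a copy.
slab_1 = np.array([0.0, 1.0, 2.0])
slab_2 = np.array([3.0, 4.0])
joined = np.concatenate([slab_1, slab_2])
```

Since both operations copy, they do not by themselves address the memory concern raised in the last point.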

Thoughts on the evolution of declarative analysis in ROOT

Requirements:

- Make TDF work better in Python.

- Custom actions

Axel: could the names of the columns be given in Book rather than in the helper? To be discussed; it is not a critical issue for the implementation.

Axel: on slide 10, the lambda is a declarative lambda rather than a preprocessingLambda.

Pere: this is an internal optimisation! Shouldn't we be able to absorb the define node into the histo node once we notice that only one consumer will use that quantity? True. We never looked into graph optimisation.
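The optimisation Pere describes can be sketched on a toy computation graph. This is not TDF code: the `Histo` class, the `absorb_single_consumer_defines` pass and the `run` loop are hypothetical names, and the toy ignores defines that depend on other defines.

```python
from collections import Counter


class Histo:
    """Toy action node: collects the values of one column."""

    def __init__(self, column):
        self.column = column
        self.inlined = None  # set by the optimiser when a define is absorbed
        self.values = []

    def fill(self, row):
        if self.inlined is not None:
            value = self.inlined(row)  # compute the quantity inside the action
        else:
            value = row[self.column]   # read the column produced by a define node
        self.values.append(value)


def absorb_single_consumer_defines(defines, actions):
    """Move each define with exactly one consumer into that consumer.

    Returns the defines that still have to run as separate nodes.
    """
    consumers = Counter(action.column for action in actions)
    remaining = {}
    for name, func in defines.items():
        if consumers[name] == 1:
            target = next(a for a in actions if a.column == name)
            target.inlined = func
        else:
            remaining[name] = func
    return remaining


def run(events, defines, actions):
    """Toy event loop: evaluate the remaining defines, then fill every action."""
    for event in events:
        row = dict(event)
        for name, func in defines.items():
            row[name] = func(row)
        for action in actions:
            action.fill(row)


defines = {
    "x2": lambda row: row["x"] ** 2,        # used by one histogram only
    "xy": lambda row: row["x"] * row["y"],  # used by two histograms
}
actions = [Histo("x2"), Histo("xy"), Histo("xy")]

remaining = absorb_single_consumer_defines(defines, actions)
run([{"x": 1.0, "y": 2.0}, {"x": 3.0, "y": 4.0}], remaining, actions)
results = [action.values for action in actions]
```

Only `xy` survives as a separate node, because two actions read it; `x2` is computed directly inside its single consumer.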

Philippe: Can TDF apply the same computation to different names? Yes
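The idea behind this answer can be illustrated without ROOT: the computation is written once and bound to different input columns by name. The `define` helper below is a hypothetical stand-in operating on a list of dictionaries, not the TDF interface, and the column names are made up.

```python
def define(rows, new_name, func, input_columns):
    """Return new rows with `new_name` computed by `func` from the named input columns."""
    return [
        {**row, new_name: func(*(row[col] for col in input_columns))}
        for row in rows
    ]


def square(value):
    """The computation, written once and independent of any column name."""
    return value * value


rows = [{"pt1": 1.0, "pt2": 2.0}, {"pt1": 3.0, "pt2": 4.0}]

# The same callable is bound to two different input columns.
rows = define(rows, "pt1_sq", square, ["pt1"])
rows = define(rows, "pt2_sq", square, ["pt2"])
```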

Axel: could there be a "static if" for the case in which a branch does not exist? This would ease running the same analysis on MC and data. We acknowledge the point, but we cannot do it.

Open questions:

- Do we need a custom event loop?

- Do we need custom transformations?

- 16:00 → 16:30
  Thoughts about evolution of declarative analysis (30m)
  Speaker: Enrico Guiraud (CERN, University of Oldenburg (DE))
  Slides: TDF evolution.pdf

- 16:30 → 17:00
  Different approaches to data ingestion in TMVA: programming model and performance (30m)
  Speaker: Stefan Wunsch (KIT - Karlsruhe Institute of Technology (DE))
  Slides: slides.pdf
