58th ROOT Parallelism, Performance and Programming Model Meeting

Name: 58th ROOT Parallelism, Performance and Programming Model Meeting
Start: 2019-05-09T16:00:00+02:00
End: 2019-05-09T17:30:00+02:00
Location: CERN

Thursday 9 May 2019, 16:00 → 17:30 Europe/Zurich

4/S-030 (CERN)

Danilo Piparo (CERN), Enric Tejedor Saavedra (CERN), Stephan Hageboeck (CERN)

Participants

Vincenzo, Stephan, Stefan, Enrico, Guilherme, Massimiliano, Enric (on site)

Philippe (remote)

News
- Guilherme: debugging a crash reported by Andrea Rizzi, who is using the nightlies and hits it with RDataFrame and RVecs. Issue identified: it is related to the small buffer optimisation and stack memory corruption. Now in the process of finding out which commit causes the issue. If it can't be fixed easily, we will need to remove the small buffer optimisation for 6.18 and continue debugging after the release.
- Vincenzo: the first PyRDF release is out and many RDataFrame tutorials run with PyRDF. It is going to be put in the LCG releases.
- Stephan: new implementation of Kahan summation, which will be leveraged by RooFit.
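
For readers unfamiliar with Kahan summation, here is a minimal sketch of the textbook algorithm in plain Python. It is not the ROOT implementation mentioned above; the function names and the test data are made up for illustration.

```python
def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Compensated summation: carry the rounding error of each addition."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for x in values:
        y = x - compensation            # re-inject the error from the previous step
        t = total + y                   # low-order bits of y may be rounded away here
        compensation = (t - total) - y  # recover what was rounded away
        total = t
    return total


# Illustrative data: one large term followed by many terms that are each
# smaller than half a unit in the last place of 1.0.
values = [1.0] + [1e-16] * 100_000

print(naive_sum(values))  # the small terms are lost one by one
print(kahan_sum(values))  # the small terms are accumulated via the compensation
```

The compensated version costs a few extra floating-point operations per element, which is usually cheap compared to the accuracy gained on long sums of values with very different magnitudes.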

RTensor - Programming and ownership model

- RTensor is needed for TMVA - will go to TMVA experimental
- Shape transformation: names follow NumPy conventions
- Slicing, slide 2: returned object has inner strides for indexing
* Last line might be confusing - reader needs to understand that these are ranges
* What about default values for the dimensions, so that not all need to be specified?
* What about passing a range object?
- Slide 3: shared_ptr can help share the ownership of the views (new RTensors) on the same data, à la Python
* Problem: I/O does not support shared_ptrs
* Proposal: use shared_ptr and define a custom deleter?
* Proposal: RTensor and RTensorView, reshape returns RTensorView
* Check what xtensor is doing
- Stefan will have a look and we will come back to this in a future PPP meeting
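
To make the view and ownership discussion concrete, here is a small sketch in plain Python of a strided view over a shared flat buffer. It is not the RTensor interface: the class `TensorView`, its methods and the `(begin, end)` range convention are hypothetical and only illustrate the idea that reshape and slicing return views which share the data and differ in shape, strides and offset. In Python the shared buffer stays alive through reference counting, which is the behaviour a shared_ptr would emulate in C++.

```python
from math import prod


def row_major_strides(shape):
    """Strides (in elements) of a contiguous row-major layout."""
    strides = []
    step = 1
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


class TensorView:
    """Hypothetical strided view on a flat buffer (illustration only)."""

    def __init__(self, data, shape, strides=None, offset=0):
        self.data = data  # shared with other views, never copied
        self.shape = tuple(shape)
        self.strides = (tuple(strides) if strides is not None
                        else row_major_strides(self.shape))
        self.offset = offset

    def _flat_index(self, idx):
        return self.offset + sum(i * s for i, s in zip(idx, self.strides))

    def __getitem__(self, idx):
        return self.data[self._flat_index(idx)]

    def __setitem__(self, idx, value):
        self.data[self._flat_index(idx)] = value

    def reshape(self, shape):
        """Return a view with a new shape on the same data (contiguous only)."""
        if self.strides != row_major_strides(self.shape):
            raise ValueError("reshape needs a contiguous view")
        if prod(shape) != prod(self.shape):
            raise ValueError("total number of elements must not change")
        return TensorView(self.data, shape, offset=self.offset)

    def slice(self, ranges):
        """Return a view on a sub-block; one half-open (begin, end) per dimension."""
        offset = self.offset + sum(b * s for (b, _), s in zip(ranges, self.strides))
        shape = tuple(e - b for b, e in ranges)
        # The slice keeps the strides of the parent ("inner strides").
        return TensorView(self.data, shape, self.strides, offset)


t = TensorView(list(range(6)), (2, 3))  # [[0, 1, 2], [3, 4, 5]]
r = t.reshape((3, 2))                   # same buffer, different shape
s = t.slice(((0, 2), (1, 3)))           # columns 1..2 of both rows
r[0, 1] = 42                            # visible through t and s as well
print(t[0, 1], s[0, 0])
```

Passing explicit `(begin, end)` pairs is one possible answer to the remark that the reader needs to understand the arguments as ranges. Default ranges or a dedicated range object, as raised above, would be alternative designs.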

HSF Examples with RDF

- HSF meetings: decided that it would be good to have benchmarks for analysis software. Stefan has produced benchmarks with RDataFrame and OpenData.
- NanoAOD: just scalars, good use case for RDF
* Question: Are there people doing analysis with RDF on MiniAOD too? Not that Stefan knows of. It is more difficult because MiniAOD has an object model (CMSSW needed).
* Example 1: explain parameters passed to the constructor of the histogram in the Histo1D action.
* Why subfolders? Should we put description and plot in the same folder as the code?
* Comment on difference between first and second task (scalar vs array per row, we do extra loop under the hood).
* There will be a comparison for the same tasks and different frameworks (e.g. jagged arrays)
* There could be examples of how to create a histogram model separately, via a constructor or via setters for its properties.
* Example 4 is either wrong or the documentation is outdated - check
* Example 5: now we have a helper for invariant masses, so the example could be simplified. Also, either add a Define to have a selection on the charge or rename the helper function (compute_dimuon_masses) to something else. The loop in compute_dimuon_masses does not look very intuitive (behaviour of ROOT::VecOps::Combinations) - either explain Combinations or write a double for loop.

* We can continue the review in the form of a GitHub review.
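
As an illustration of the Example 5 comment about combinations versus a double for loop, the sketch below computes dimuon masses both ways in plain Python. It does not use ROOT: `itertools.combinations` over indices stands in for an index-combination helper, and all function names and input values are hypothetical. The opposite-charge selection is written explicitly, which is the part the review suggested either adding or reflecting in the function name.

```python
from itertools import combinations
from math import cos, sin, sinh, sqrt


def four_vector(pt, eta, phi, mass):
    """(E, px, py, pz) from transverse momentum, pseudorapidity, azimuth, mass."""
    px = pt * cos(phi)
    py = pt * sin(phi)
    pz = pt * sinh(eta)
    energy = sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def invariant_mass(p1, p2):
    """Invariant mass of the sum of two four-vectors."""
    e, px, py, pz = (a + b for a, b in zip(p1, p2))
    m2 = e * e - px * px - py * py - pz * pz
    return sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors


def dimuon_masses_combinations(pt, eta, phi, mass, charge):
    """All unique opposite-charge pairs, using index combinations."""
    vectors = [four_vector(*muon) for muon in zip(pt, eta, phi, mass)]
    return [invariant_mass(vectors[i], vectors[j])
            for i, j in combinations(range(len(vectors)), 2)
            if charge[i] != charge[j]]


def dimuon_masses_double_loop(pt, eta, phi, mass, charge):
    """Same selection written as an explicit double for loop."""
    vectors = [four_vector(*muon) for muon in zip(pt, eta, phi, mass)]
    masses = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):  # j > i: each pair only once
            if charge[i] != charge[j]:
                masses.append(invariant_mass(vectors[i], vectors[j]))
    return masses


# Made-up event with three muons (pt in GeV, charge in units of e).
pt = [45.0, 45.0, 10.0]
eta = [0.0, 0.0, 0.0]
phi = [0.0, 3.141592653589793, 1.0]
mass = [0.105658, 0.105658, 0.105658]
charge = [1, -1, 1]
print(dimuon_masses_combinations(pt, eta, phi, mass, charge))
```

Both functions visit the index pairs in the same order, so they return identical lists. The double loop is longer but makes the "each pair once" logic visible without knowing how the combinations helper behaves.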

- 1
  
  RTensor - Programming- and ownership model
  
  Speaker: Stefan Wunsch (KIT - Karlsruhe Institute of Technology (DE))
  
  RTensor.pdf
- 2
  
  HSF DAWG Examples with RDF
  
  Speaker: Stefan Wunsch (KIT - Karlsruhe Institute of Technology (DE))
  
  Analysis benchmarks for RDataFrame.pdf
