ROOT Parallelisation, Performance and Programming Model Meeting

Europe/Zurich
42-R-406 (CERN)

Danilo Piparo (CERN)
Description
CERN number: 71400 Extension: 109284483#

Present: Philippe, Jim, Guilherme, Pere, Xavi, Martin, Enric, Enrico

 

Decisions:

* We'll have ranges and they will be expressed with 3 numbers or a string

* As a first approximation, negative indices will not be usable

* As a first approximation, ranges will not be active in MT mode
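
The decisions above can be illustrated with a minimal sketch. It assumes the three numbers are (begin, end, stride) and that the string form is "begin:end[:stride]"; the minutes fix neither, and the class name EntryRange is hypothetical rather than a ROOT interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntryRange:
    """Hypothetical range over entries: begin (inclusive), end (exclusive), stride."""
    begin: int
    end: int
    stride: int = 1

    def __post_init__(self):
        # Decision from the meeting: negative indices are not usable for now.
        if self.begin < 0 or self.end < 0:
            raise ValueError("negative indices are not supported")
        if self.stride < 1:
            raise ValueError("stride must be a positive integer")
        if self.end < self.begin:
            raise ValueError("end must not precede begin")

    @classmethod
    def from_string(cls, spec):
        # Assumed textual form "begin:end[:stride]"; the real syntax is undecided.
        parts = [int(p) for p in spec.split(":")]
        if len(parts) not in (2, 3):
            raise ValueError("expected 'begin:end' or 'begin:end:stride'")
        return cls(*parts)

    def entries(self):
        # Entry indices selected by this range.
        return range(self.begin, self.end, self.stride)
```

Rejecting negative values at construction time keeps the restriction in one place, so it could be lifted later without touching the callers.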

 

New Actions

* Lorenzo, Danilo: merge Guilherme's PRs once the tag is done
* Xavi: report on the numerical stability of the reduce procedure
* Enric, Xavi: investigate the reimplementation of TTreeProcessExecutorMT in terms of TThreadExecutor
* Xavi, Danilo: look into the implementation of the TThre...MT
* Danilo, Lorenzo: integrate Xavi's PR2 (#367) on vectorisation
* Xavi: try FMA with -Ofast before ruling it out
* Xavi, Danilo, Gerri: decide how the user passes the parameter that sets the size of the partitions in MapReduce

Open Actions
* Enric, Danilo: assess the difficulty of enabling TTreeProcessorMT also in non-IMT builds, absorbing the duality of the IMT/non-IMT code paths at the level of the TTreeProcessor implementation rather than in TDF.
* Danilo: integrate new ctors taking a TEntryList in the TDF
   - Make sure to throw an informative message if more than one tree is present in the file(s)
   --> This is pending until the creation of TEntryLists is addressed with an Action.
* Gerri, Enric, Xavi, Danilo: identify a chunking (packetising) procedure for MT, MP, collections and trees (perhaps respecting clusters' granularity a la TTreeProcessor)
* Xavi: Slide 7, 2N: debug the poor scaling obtained. Ideas:
   - Count the page faults with perf (Pere's idea).
   - Artificially increase the work to see whether the relative overhead decreases.
   - Try only one worker.

Closed Actions
* Danilo: investigate the TProfiles in TDF --> Now implemented
* Danilo: integrate new ctors in the TDF
   - Make sure to throw an informative message if more than one tree is present in the file(s)
* Danilo: PR #364, integrate
   --> Obsolete: this will be handled differently, more as an internal optimisation of the MapReduce method
* Danilo: investigate usage of histos with buffers
   --> The present buffering system, active only when dealing with histograms created without axis limits, is analogous to that of the histograms themselves and, in some respects, superior, since it takes into account fills coming from collections.
* Xavi, Danilo: iron out the issue streaming the TSeq between processes
   --> Obsolete: related to item above
* Guilherme: report on the progress of the VecCore integration and its coherency with Vc and VecGeom
  - VecGeom is now integrated in ROOT; a PR will come as soon as the Vc one is integrated
* Xavi: implement Kahan summation to debug the different results in the likelihood benchmark based on multiprocessing
* Xavi: deliver PR1 for vectorisation in math to be integrated in ROOT as well as the associated tests
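
For reference, the Kahan summation mentioned above can be sketched as follows. This is a plain Python illustration, not the code used in the likelihood benchmark; the chunked variant only mimics a reduction over per-worker partial sums, and both function names are hypothetical.

```python
import math


def kahan_sum(values):
    """Compensated (Kahan) summation of a sequence of floats."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for value in values:
        corrected = value - compensation
        new_total = total + corrected
        # (new_total - total) is what was actually added; subtracting
        # `corrected` leaves the rounding error of this step.
        compensation = (new_total - total) - corrected
        total = new_total
    return total


def chunked_kahan_sum(values, n_chunks):
    """Sum per-chunk partial results, as a map-reduce over workers would."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    partials = [kahan_sum(values[i:i + size]) for i in range(0, len(values), size)]
    return kahan_sum(partials)
```

Comparing both results against math.fsum, which returns a correctly rounded sum, gives a quick check of how much the outcome depends on the partitioning.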

Enrico:
* Working on the jitting of TTreeReader values in the context of TDataFrame, with the valuable help of Philippe
* Jim: perhaps we can offer a different function, rather than the stride, to get a portion of the dataset, e.g. "I want 20% of the overall dataset". Still to be thought through: what to do if this hangs from a filter.
* Pere: can we use the TEntryList as a filter? We could investigate the difference between the full dataset and the one defined by the TEntryList
* Guilherme: why the string? We agree that we want it, even if not with high priority
* Pere and all: Head: why not two names, AsString and Print? We will factor out the code from TTreeScan. We also need to take the branch list as input
* Pere and all: Tail: we can have it; we just need to be very clear in the documentation. We trade performance for functionality, and it is very handy.
* All: Range is a sort of filter with a flag that signals to the frame, which manages the loop, that it is satisfied.
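
The last point can be pictured with a small sketch. It assumes a single-threaded loop, a range that counts only the entries reaching it (i.e. those passing any upstream filter), and hypothetical names (RangeFilter, run_loop); it does not reflect the actual TDataFrame internals.

```python
class RangeFilter:
    """Filter accepting entries begin..end-1 (by arrival order) with a stride."""

    def __init__(self, begin, end, stride=1):
        self.begin = begin
        self.end = end
        self.stride = stride
        self.seen = 0  # number of entries that reached this node
        self.satisfied = end <= begin  # an empty range is satisfied from the start

    def accept(self, entry):
        index = self.seen
        self.seen += 1
        if self.seen >= self.end:
            # No later entry can be accepted: raise the flag for the loop.
            self.satisfied = True
        if index >= self.end:
            return False
        return index >= self.begin and (index - self.begin) % self.stride == 0


def run_loop(entries, upstream_filter, range_filter):
    """Event loop that stops reading as soon as the range is satisfied."""
    selected = []
    n_read = 0
    for entry in entries:
        if range_filter.satisfied:
            break  # the frame stops the loop early
        n_read += 1
        if not upstream_filter(entry):
            continue
        if range_filter.accept(entry):
            selected.append(entry)
    return selected, n_read
```

With an upstream filter keeping even entries and a range (1, 6, 2), the loop over 100 entries selects entries 2, 6 and 10 and stops after reading 11 of them.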

 

16:00 - 17:00 Round table (1h)
* Progress on actions from last week
* TDataFrame ranges (slides)