194th ROOT Parallelism, Performance and Programming Model Meeting

Europe/Zurich
32/S-C22 (CERN)

Marta Czurylo (CERN), Vincenzo Eduardo Padulano (CERN)
Zoom Meeting ID: 61666320058
Host: Marta Czurylo
  • [Stephan Hageboeck] Slides 16-17, could you just read the entire object instead of the data members?

    • Yes, that would work for both TTree and RNTuple. I'm showing these examples explicitly to illustrate cases where an analysis works for TTree but not for RNTuple.
  • [Giacomo Parolini] When you say std::array do you also mean C-style arrays?

    • Not considered so far, but it would be hard to enforce Rule 2 in that case...
  • [Giacomo Parolini] Is there a technical challenge in implementing RVec<std::vector>?

    • Not really, other than the proliferation of corner cases to support.
  • [Jonas Hahnfeld] Do you want these rules to be exhaustive? RNTuple already supports all three rules and more; do you want to limit it?

    • I like the idea of codifying the RDataFrame behaviour so that it is clear, documented and easy to explain whenever a new issue or question arises. And yes, part of this means potentially limiting RNTuple capabilities when processed via RDataFrame.
  • [Jonas Hahnfeld] I would propose that we leave everything that is currently inconsistent as it is, apart from fixing certain RNTupleDS bugs. We should make sure that TTree-->RNTuple is compatible. Even though the inconsistencies in the combinations of sequence container types are not ideal to look at, I don't think we also need to ensure RNTuple-->TTree compatibility. Thus, I would propose not to codify these rules, and just to find the happy path that fixes the FCC schema issue for TTree in such a way that the same RDataFrame code also works with an RNTuple data source.

  • [Giacomo Parolini] I agree, and I think we should also document the limitations of the interaction between RDataFrame and the TTree data source.

Conclusion:

  • Do not codify the rules
  • Fix the original FCC schema issue for TTree, ensuring that the same RDataFrame code can work for an RNTuple data source with the same dataset schema.
  • Document limitations of the interaction between the TTree data source and RDataFrame.
    • 16:00 - 17:00
      Rules for I/O of sequence types in RDataFrame (1h)
      Speaker: Dr Vincenzo Eduardo Padulano (CERN)