ROOT Team Meeting

Name: ROOT Team Meeting
Start: 2024-03-18T16:00:00+01:00
End: 2024-03-18T17:30:00+01:00
Location: CERN

Monday 18 Mar 2024, 16:00 → 17:30 Europe/Zurich

32/1-A24 (CERN)


Danilo Piparo (CERN), Jakob Blomer (CERN)


News

No news

Meeting News

PPP

There will be a PPP this week.
[Monica]: Ruban - University of Amsterdam - Using GPU as a decompression accelerator next week.

Topics

ACAT2024:

Small venue (120 people).
Poster sessions squeezed into coffee breaks.
Performance metrics not represented in the same way.

Few talks related to RNTuple:

RNTuple in Athena: Comment: how to get involved in (the soon-to-start) RNTuple with ATLAS?
RNTupleInspector: Mention on the analyzer/linter got interest from the audience.
RNTuple analysis with RDF - Couple of questions on the performance.
HEP-CCE - It is important to have a full picture of what we're doing. Will bring tangible benefits.

Talks on ML:

Interest in SOFIE.
Memory usage of the inference running with C++ code with TensorFlow code - we don't have the numbers, we should investigate more.
It could be interesting to reach out to the community and find out whether they are using this, to identify the missing parts.

Chats

Liz:

Reassured that we are collaborating on RNTuple support for CMS EDMs.

Jim:

Discussion on compressions.
Birds of a feather session: HEP help-desk - similar to the aarchi model - an LLM trained on all ROOT documentation and all the related information, providing a centralized place. The tool doesn't provide the answer but points to where the answer can be found.

Gordon:

Question about debuggability - make tools that don't need debugging.
RNTuple - asked him specifically - not going to work on it unless the experiment forces him to do so.

Natalie:

Discussion on HEPScore benchmarks.

Tommaso:

1) Varied snapshot:
- Process output files separately. They have different cardinalities and can be worked on separately.

[Philippe]: This requires some duplication of data.
[Florine]: If you have T3 index you can only write the varied parts and can work around it.
[Philippe]: We still have duplicated all the sparse data, but right.
[Vincenzo]: They want to reconstruct data in a generic way, with some performance issues. It's worth a try and worth benchmarking.
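The idea of writing only the varied parts and joining them to the nominal data through an index can be illustrated with a toy model. This is a pure-Python sketch, not ROOT or RDataFrame code; the names `nominal`, `varied_pt_up` and `read_varied` are hypothetical and chosen only for illustration.

```python
# Toy model: nominal events keyed by an event index, and a separate
# "varied" output that stores only the values that differ from nominal.
nominal = {
    0: {"pt": 25.0, "eta": 0.1},
    1: {"pt": 40.0, "eta": -1.2},
    2: {"pt": 31.5, "eta": 2.0},
}

# Hypothetical systematic variation: only events 0 and 2 have a varied pt,
# so the varied output has a different cardinality than the nominal one.
varied_pt_up = {
    0: {"pt": 26.0},
    2: {"pt": 32.4},
}


def read_varied(index, nominal, varied):
    """Return the event at `index` with varied values overlaid on nominal."""
    event = dict(nominal[index])         # copy, so nominal stays untouched
    event.update(varied.get(index, {}))  # overlay only what was written
    return event


events_up = [read_varied(i, nominal, varied_pt_up) for i in sorted(nominal)]
```

Only the varied values are stored twice here; everything else is looked up in the nominal output through the shared index.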

2) Objectification of NanoAOD inputs.

CMS can already do it with a tool called Bamboo - official request.
Danilo's solution: C++ classes to represent in memory (user-defined) or a thin layer that does graph generation.

[Jakob]: Situation is much better than expected; there is a way to do it, but for CMS users, it's OK.
[Vincenzo]: Bamboo is a whole framework on top of CMS. It's always a bargain.
[JonasH]: Don't use objects if you need performance.
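As a rough illustration of what "objectification" means, the sketch below builds per-muon objects from flat, NanoAOD-style columns. It is plain Python and not the Bamboo or ROOT implementation; the `Muon` class and the `muons` helper are hypothetical, and only the `nMuon` / `Muon_pt` / `Muon_eta` column naming follows the NanoAOD convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Muon:
    """Hypothetical user-defined object representing one muon."""
    pt: float
    eta: float


def muons(event):
    """Build Muon objects from flat, NanoAOD-style per-event columns."""
    return [Muon(pt, eta) for pt, eta in zip(event["Muon_pt"], event["Muon_eta"])]


# One event with two muons, stored as parallel arrays.
event = {"nMuon": 2, "Muon_pt": [31.2, 12.7], "Muon_eta": [0.4, -1.9]}

# With objects, a selection reads naturally as an operation on muons.
leading = max(muons(event), key=lambda m: m.pt)
```

Creating one object per particle costs time and memory compared with operating on the columns directly, which is the trade-off behind the performance remark above.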

Impact of cppyy Upgrade:

There are not many.
Not duplicating code anymore.

Summary: cppyy has several parts: a Python library, a C++ extension, and a wrapper around cling, which contains ROOT meta. We are synchronizing with upstream, except for the part taken from ROOT meta and cling.

Implicit conversion of std::string to Python string.

The only motivation for not doing this is Unicode conversion and safety.
If the users are explicitly type-checking, it might fail/crash.
[JonasH]: Check if it's Unicode; if not, return bytes for non-Unicode stuff. For convenience, if it's the Unicode I expect, I don't want it to crash.
[Aaron]: With different encodings there were some corner cases.
[JonasR]: A check at the Python level would work, but with some performance overhead.
[JonasH]: There must be a function to convert a character array to a Python string.
[JonasR]: In this case you also have to check if it is convertible first.
[JonasH]: You only have to correctly catch the error - there's a way to do it with CPython as well. Nice thing is that the code down there [in the slide] also works.
[JonasR]: Advocate for keeping bytes object or erroring out.
[Jakob]: Suggestion - Keep bytes, gives the largest usability surface.
[JonasR]: Keeping a standard string would be the same as upstream.
[Vincenzo]: I agree, if it can be upstream, there is no question we shouldn't do it. If not, we just have an extra patch. Instead of crashing, emit a warning or an error, and we could remove the patch.
[JonasR]: It crashes anyway, so nobody uses it anyway. It can be an error directly.
[JonasR]: It would be nice if it is consistent and not returning Unicode or bytes sometimes.
[JonasR]: It's either improving the error or returning the bytes objects directly. Prefer having a clear error (if it cannot convert to Unicode).
[Jakob]: Handling of the valid case is the problem.
[Philippe?]: In the long term, there is no way of getting a non-Unicode in Python.
[JonasH]: Not important to users because there is no way either.
[Vassil]: Is it only going to break Unicode case?
[Vincenzo]: Non-Unicode already crashes.
[Vassil]: So we're discussing a hypothetical case?
There is no difference in the working.
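The two behaviours debated above (raise a clear error, or fall back to a bytes object when the content is not valid Unicode) can be sketched in plain Python. This is only an illustration of the trade-off, not the cppyy converter; the function `to_python_string` and its `strict` parameter are hypothetical.

```python
def to_python_string(raw: bytes, strict: bool = True):
    """Convert raw std::string-like bytes to a Python object.

    strict=True  -> return str, or raise a clear error if not valid UTF-8.
    strict=False -> return str when valid UTF-8, otherwise the original bytes.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        if strict:
            raise ValueError(f"cannot convert to a Python str: {err}") from err
        return raw


valid = "Z\u00fcrich".encode("utf-8")  # valid UTF-8 content
invalid = b"\xff\xfe\x00data"          # 0xff can never start a UTF-8 sequence
```

With `strict=False` the return type depends on the content, which is the inconsistency argued against above; with `strict=True` the caller always gets a `str` or an explicit error.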

"Strict" memory policy

There's a flag in cppyy to change this memory policy.
There are some bugs in current cppyy regarding these memory heuristics because they are not tested. In the future, we can think about synchronizing this policy.

[Jakob]: These heuristics look a bit strange from the C++ perspective.
[JonasR]: There are some void pointer cases in the early versions of PyROOT. It would not be difficult to go to strict memory policy.
[Vassil]: One can implement an LLVM pass if we see delete. Some annotation will be useful on the interfaces.
[JonasR]: We can annotate at the Python level but it would be difficult; it would be nice to do it at the C++ level.
[Vincenzo]: Solution - set memory policy to strict by default. It only applies to older parts of ROOT. It seems to me that a clang annotation is overkill.
[Jakob]: What about third parties that depend on this case?
[Vincenzo]: Anything/framework based on older PyROOT before a certain point is going to break.
[Vassil]: It is very important to add annotations - nullability and ownership. If this is going to be null, then don't call.
[JonasH]: If the APIs want to make sure there are no null pointers, it should be a reference.
[Vincenzo]: Clarification: There is already an existing infrastructure, so we don't need any extra work to make this happen?

No implicit conversion for char to null

char[] converted to Unicode string.
To convert the buffer back to a Unicode string, use the function.
This is pretty reasonable.

[JonasR]: If it's not null-terminated, cppyy already knows it. You can work around that.
[JonasH]: Do we do this right now?

[JonasR]: In ROOT we have some unit tests with char buffers that contain country codes - had to add asterisks to make it work.
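The difference between a null-terminated and a non-null-terminated char buffer can be shown with the standard `ctypes` module. This is a sketch of explicit buffer-to-string conversion in general, not of the cppyy API or the ROOT unit tests mentioned above; `buffer_to_str` is a hypothetical helper.

```python
import ctypes

# A 2-byte buffer holding a country code, with no room for a terminator.
code = ctypes.create_string_buffer(b"CH", 2)

# A larger buffer is zero-padded, so a NUL byte marks the end of the text.
padded = ctypes.create_string_buffer(b"CH", 8)


def buffer_to_str(buf):
    """Decode a fixed-size char buffer, stopping at the first NUL if present."""
    raw = buf.raw  # all bytes of the buffer, terminator or not
    return raw.split(b"\x00", 1)[0].decode("utf-8")
```

Because the conversion is an explicit call, the caller decides when a buffer is treated as text and when it stays raw bytes.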

Performance:
- Just ran the tests and showed the numbers. Compared the runtime with and without the upgrade; it's basically the same.
[JonasR]: It could be that the implicit conversion isn't done anymore, which might explain some improvement.
[JonasR]: We still have 2 months before the release, so we have time to fix issues, if there are any.

Summary:
- memory - stick with the current heuristics for now, change with the next release.
- remove implicit conversion.

[JonasH]: Are all the corner cases solved?
[JonasR]: There are even fewer tests failing than before.
[Vincenzo]: For further developments - we only have a patched cling wrapper, and anything else in the cppyy stack is the same?
[JonasR]: No, there are some changes and reverts. Implicit std namespace is an example, TString needs a custom converter, ROOT type aliases such as Long64_t, etc.
[Vincenzo]: We can do something similar to what we did for LLVM. Create a monorepo and replicate what we did.
[JonasR]: All the patches are in one directory. Everything is traced.
[Vassil]: Real problem is going to be in the backend, for IO we want one thing. It is going to converge.
../
{Discussions without converging on a solution for future upgrades}
../
[Jakob]: On the question of handling future upgrades, we move it to the next slot or another meeting.

[SKIPPED ROUNDTABLE]

Task for everyone: Check the program of work and fill it in, because there is a quarterly review coming up.

[Meeting Ended]


- 16:00 → 16:01
  
  Find notetaker 1m
- 16:01 → 16:05
  
  News 4m
- 16:05 → 16:10
  
  Shift handover 5m
- 16:10 → 16:20
  Meeting Summaries and Plans 10m
  - I/O
  - TMVA
  - RooFit
  - PPP
  - Planning / Godparents /...
  - LIM
- 17:00 → 17:25
  
  Round Table 25m
- 17:25 → 17:30
  
  A.O.B. 5m