EP R&D Software Working Group Meeting
Europe/Zurich
32/1-A24 (CERN)
Zoom Meeting ID
62893115044
Host
Graeme A Stewart
Alternative host
Andre Sailer
Software R&D Working Meeting Minutes
2024-11-20
Room: Graeme, Lucas, Alberto, Juan, Aurora, Juan, Peter, Vincenzo, Piyush, Mateusz, André, Anna, Pere
Remote: Swathi, Joshua, Danilo, Leonhard, Andi, Witek, Joanna, (Andrea Bocci, briefly)
News
For the EP R&D annual report, for any projects that are only starting now, be very brief and report on plans.
Heterogeneous frameworks
- Q: What is GaudiHive?
- GaudiHive was Gaudi's answer to multi-threaded processing. It is now part of Gaudi; one only has to change the event loop and scheduler to the Hive types. It was implemented with oneTBB, which cannot offload to a GPU, so a solution that can also offload would be beneficial.
- Q: Does the task graph contain all the algorithms of three events?
- There is one graph for all events; conditional choices are decided at runtime.
- Q: How are the events scheduled?
- This is very easy with Taskflow, which provides a parallel pipeline.
- Q: Does this scale well? Do we have to know a priori how many events are going to be processed?
- We do not have any data yet: there is only a mockup at the moment, so we cannot run with the full runtime. In reality, one could run over all the data by just specifying the number of slots.
- Q: On the extraction of information from experiment workflows: do you already have a DAG in mind and look at how the workflows map onto it, or are the workflows boiled down into abstract terms?
- We are extracting real-life information from the examples so that they can be exercised by different scheduling approaches. The DAGs are stored in a repository and used for both Taskflow and julia-fwk.
- Q: Regarding Taskflow: control flow is rarely used in offline processing, and conditionals are mostly in the trigger. How much work would it be to use Taskflow inside Gaudi, replacing oneTBB?
- Control flow is used for algorithmic dependencies, e.g. to ensure that the writer runs last for Key4hep.
- The plan is to write a Taskflow scheduler for Gaudi, but the major issue is control flow, because Taskflow uses different primitives.
- Q: Why choose Taskflow?
- We hope to provide benchmarks soon. Taskflow is well known in C++ circles, has many GitHub stars and, according to the developers' own benchmarks, performs very well.
- Q: Is offloading CUDA only, or are other backends supported?
- At the moment just CUDA. SYCL is unofficially supported, but it is not advertised or exposed in the documentation.
- Q: What happens if you end up on a node without CUDA offloading capability?
- Not checked; probably an exception, but this needs to be investigated. It seems that one has to know how many GPUs are available at the start of the program.
- Q: Can the events be distributed at the beginning?
- Some nodes can be faster than others, so we might not use all resources fully until processing is finished.
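
The point above about control flow being used for algorithmic dependencies (e.g. making the writer run last) can be illustrated with a minimal sketch. This is not Taskflow or Gaudi code: it uses only Python's standard `graphlib`, and the algorithm names (`Reader`, `Digitizer`, `Tracker`, `Monitor`, `Writer`) and the helper functions are hypothetical. It assumes that a workflow is stored as a DAG of data dependencies and that a control-flow constraint can be expressed as extra edges on top of it.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each algorithm maps to the algorithms whose output it consumes.
data_deps = {
    "Digitizer": {"Reader"},
    "Tracker": {"Digitizer"},
    "Monitor": {"Tracker"},  # produces nothing that the writer consumes
    "Writer": {"Tracker"},
}


def add_run_last(deps, last):
    """Return a copy of deps where `last` additionally depends on every other node."""
    nodes = set(deps) | {d for ds in deps.values() for d in ds}
    out = {n: set(deps.get(n, ())) for n in nodes}
    out[last] |= nodes - {last}  # control-flow edges, not data edges
    return out


def schedule(deps):
    """Return waves of algorithms; members of one wave could run concurrently."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves_data_only = schedule(data_deps)
waves_with_control = schedule(add_run_last(data_deps, "Writer"))

print(waves_data_only)
print(waves_with_control)
```

With data dependencies alone, `Writer` becomes ready at the same time as `Monitor`; with the added control-flow edges it is scheduled in a final wave of its own. A real scheduler would submit ready algorithms to a thread pool rather than collect waves, but the dependency bookkeeping is the same idea.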