Madgraph5 GPU development

513/R-070 - Openlab Space (CERN)

Stefan Roiser

# Tue 27.03.2023 Madgraph dev meeting

Present: SR, AV, ZW, JT, SH; OM, WH, NN, CV

## Round table

ZW: finishing program for event reweighting on GPU.
Working on the I/O part.
Reads one LHE file and writes one new LHE file.
Typically 10k events (up to 1M for very large files).
The entire file is stored in memory after being read.
There can be several reweightings on each event.
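
A minimal sketch of the I/O flow described above (read one LHE file fully into memory, attach several weights to each event, write one new LHE file). This is not ZW's program: the function names are hypothetical, the weight computation is a placeholder callable rather than a GPU matrix-element evaluation, and the `<rwgt>`/`<wgt id='...'>` layout is assumed to follow the usual LHE v3 convention.

```python
import re
from typing import Callable, Dict

# One <event> ... </event> block, allowing attributes on the opening tag.
EVENT_RE = re.compile(r"<event\b[^>]*>.*?</event>", re.DOTALL)

# Hypothetical type: maps a reweighting name to a function of the event text.
Reweighters = Dict[str, Callable[[str], float]]


def original_weight(event_block: str) -> float:
    """Return XWGTUP, the third field on the first line after the <event> tag."""
    lines = event_block.strip().splitlines()
    return float(lines[1].split()[2])


def add_weights(event_block: str, reweighters: Reweighters) -> str:
    """Insert an <rwgt> block with one <wgt> per reweighting before </event>."""
    wgts = "\n".join(
        f"<wgt id='{name}'> {func(event_block):+.7e} </wgt>"
        for name, func in reweighters.items()
    )
    head, closing_tag, tail = event_block.rpartition("</event>")
    if head and not head.endswith("\n"):
        head += "\n"
    return f"{head}<rwgt>\n{wgts}\n</rwgt>\n{closing_tag}{tail}"


def reweight_text(lhe_text: str, reweighters: Reweighters) -> str:
    """Reweight every event in an in-memory LHE document; header and init are untouched."""
    return EVENT_RE.sub(lambda m: add_weights(m.group(0), reweighters), lhe_text)


def reweight_file(in_path: str, out_path: str, reweighters: Reweighters) -> None:
    """Read the whole input file into memory, then write one new LHE file."""
    with open(in_path) as f:
        text = f.read()
    with open(out_path, "w") as f:
        f.write(reweight_text(text, reweighters))
```

For example, `reweight_file("in.lhe", "out.lhe", {"half": lambda ev: 0.5 * original_weight(ev)})` would add one extra weight per event. Holding the whole file in memory keeps the code simple, which seems reasonable for the 10k to 1M event sizes quoted above.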

JT: prepared containers for our software.
Just missing the AMD ROCm software now.
This is one single container with the dev environment for all possible backends.
Presently based on Rocky Linux 8 with Nvidia.
Using compilers from cvmfs (network mounted), such as gcc112.
SR: we may have issues on HPC centers, but let's see when it comes.

AV: a few points
- was on holiday one week
- gave an internal CERN IT presentation about the project
 (one specific point, we got a recommendation to mention CERN in the copyright)
- gave an openlab presentation
- circulated the first draft of the ACAT proceedings, got feedback from SR and TC
 (discussion: ok to add Aurora plot and also gcc/clang/icx plot)

SH: will contact OM to discuss again about unweighting on the GPU

SR1: discussed with IBM at the openlab workshop, we might get access to a Power10.
Also discussed with Nvidia at the workshop, we might include Madgraph in their benchmarking suite.
SR and AV also got in contact with Barcelona HPC, got access to some preliminary system before RISC-V.
OM on that line: could apply for some time in LUMI through Belgium.
SH: would be nice to run some automatic performance tests there periodically
AV: good idea, a bit like we do in the benchmarking project.

SR2: having a look at generating an example for the CMS people. Also had a meeting with them.
Technically, could do two things, either via upstream madgraph or via madgraph4gpu,
discussed with AV and agreed to try the latter way first.
The problem in the former is that AV's patches over Fortran makefile are not there yet.
OM: ok with that but eventually need everything upstream. SR/AV: yes that's the idea!

OM1: worked on an issue reported by NN, that is essentially what AV was concerned about for years,
namely processes with internally two different matrix elements.
See the png screenshot now attached to indico, left is old situation, right is new situation.
Previously q was a quark or antiquark, both handled in the same directory,
now instead (ONLY if there is me_exporter! that's the only difference) they are two separate ones
NN: thanks, but ran into another issue, will ask OM for help again

OM2: will not attend CHEP in the end (and they did not propose to give a remote talk)

NN: fixed some issues in sycl standalone, we hardcode channelid to 0 and this now causes some issues.
Could you hardcode it to 1 also in cudacpp? Otherwise something gets set to -1 and causes a hang in cuda.
AV: will have a look, please send me a

CV: ntr, will read the ACAT draft and send comments
Minor point, could you send reminders a couple of days in advance please?

## AOB

Next meeting: Tue 11 April? Just after Easter Monday.
OM: will probably be unavailable 11 (and probably also 18), go ahead without me

AV: any plans for the Gargnano meeting?
OM: will probably give a presentation covering many projects, with only 2 minutes for our stuff.
Might ask for a talk specifically on GPUs, but very early to say.
This is not at all a regular yearly meeting, we had one around 2019 and one in 2012...
We do have a regular monthly meeting but kind of administrative and release stuff
(technically open to everyone anyway!)
