Madgraph5 GPU development
# Madgraph dev meeting Tue 09.07.2024
https://indico.cern.ch/event/1355156/
Present: SR, OM, AV (notes), ZW, DM
## H100 tests (SR ~20')
SR shows some slides
SR got an H100 through NextGen, shows specs, big machine
The H100 has double the memory (80 GB) of an A100 (40 GB)
And actually one machine has 8 H100s attached
OM: 64-thread warps?
SR: not sure
Slide 7, scan when changing the number of events in input_app.txt
AV: are you also changing the GPU grid size?
SR: no the GPU grid size is fixed here
SR: question for OM, does it make sense to increase the numbers in input_app.txt?
AV: how does this change with channelids?
e.g. if you have 1000 diagrams, is this now 1000 G jobs with 1 channel each?
OM: typically it will be 100 G jobs with 10 channels each
AV: is I/O affected?
Is it better to have many jobs (many files, each with few events),
or few jobs (few files, each with many events)?
OM: probably does not make much difference
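As a side note on the scan above: with a fixed GPU grid, requesting more events in input_app.txt mainly changes how many kernel launches are run, not the per-launch occupancy. A minimal illustrative sketch in C++ (not the actual madevent/cudacpp code; the grid sizes and event counts are made-up values):

```cpp
// Minimal illustrative sketch (not the actual madevent/cudacpp code):
// with a fixed GPU grid, requesting more events only changes how many
// kernel launches are needed, not the per-launch occupancy.
#include <cstdio>

int main()
{
  // Hypothetical values: the grid size is assumed fixed here,
  // independently of the number of events requested in input_app.txt
  const int gpublocks = 64;   // assumed fixed
  const int gputhreads = 256; // assumed fixed
  const int eventsPerLaunch = gpublocks * gputhreads;
  for( int nevents : { 8192, 81920, 819200 } ) // a scan like the one on slide 7
  {
    // number of kernel launches needed to produce nevents (rounded up)
    const int niterations = ( nevents + eventsPerLaunch - 1 ) / eventsPerLaunch;
    printf( "requested %7d events -> %3d launches of %d events each\n",
            nevents, niterations, eventsPerLaunch );
  }
  return 0;
}
```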
## Flamegraphs and lhapdf (DM ~20')
DM shows some slides
(AV: please also attach a pdf, in case the web site hosting changes)
DM compares flamegraphs of Madgraph with and without lhapdf.
In both cases, pdfs are used, but without lhapdf the internal pdf implementation of Madgraph is used.
The conclusion is that the internal pdf implementation in Madgraph is slower than lhapdf.
AV: then probably for future studies we should ignore the internal pdf and only use lhapdf
DM: yes this makes sense, this was the first time we did this study
SR: I guess the experiments use external lhapdf, so maybe we have overestimated the time spent in pdfs
(if that estimate was based on the internal pdf in Madgraph)
AV: can we also get improvements by improving HOW pdfs are used?
Sherpa got a factor 40; we cannot get that, but maybe some small improvements are possible
OM: they were doing something wrong, and using a badly implemented feature that we do not use
SR: proposes that DM continue the work (started with SH) to put the pdf on the GPU
DM: also proposes to profile Madgraph with adaptiveperf by Maks
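As background on the pdf-on-GPU idea: one possible approach (a sketch only, not necessarily what was started with SH; the PDF set, flavour and grid ranges below are assumptions for illustration) is to pre-tabulate x*f(x,Q) on the host with LHAPDF into a flat grid, which could then be copied to the device and interpolated per event in a kernel:

```cpp
// One possible approach (a sketch, not the actual implementation):
// pre-tabulate x*f(x,Q) on the host with LHAPDF, producing a flat grid
// that could later be copied to the GPU and interpolated in a kernel.
#include "LHAPDF/LHAPDF.h"
#include <cmath>
#include <cstdio>
#include <vector>

int main()
{
  // Hypothetical choices: PDF set, flavour (gluon) and grid ranges are illustrative
  LHAPDF::PDF* pdf = LHAPDF::mkPDF( "NNPDF31_nnlo_as_0118", 0 );
  const int nx = 64, nq = 32;
  std::vector<double> table( nx * nq );
  for( int ix = 0; ix < nx; ix++ )
  {
    const double x = std::pow( 10., -4. + 4. * ix / ( nx - 1 ) ); // 1e-4 .. 1
    for( int iq = 0; iq < nq; iq++ )
    {
      const double q = 10. + ( 1000. - 10. ) * iq / ( nq - 1 ); // 10 .. 1000 GeV
      table[ix * nq + iq] = pdf->xfxQ( 21, x, q ); // gluon x*f(x,Q)
    }
  }
  printf( "tabulated %zu gluon pdf values\n", table.size() );
  delete pdf;
  return 0;
}
```

The device-side interpolation is not shown; the point is only that a flat table is something that can be copied to the GPU and looked up for all events in parallel.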
## Zenny
Refactoring code for reweighting
## Olivier
Discusses the status. Three things are in progress:
- couplings
- warps
- gpucpp360
AV: let's not forget the segfaults and icolamp etc
OM: yes, but focus on the future
AV: this was two weeks of hard work, let's not take it for granted...
AV: let's discuss the branches
- AV: we agree that we aim to have everything in master? OM: yes
- OM: the warp stuff is in master_june24. AV: it should be gpucpp_june24, which it is not right now
AV: and gpucpp_warp? OM/SR: we can forget it
- AV: the third one? gpucpp_360? OM: yes, with no master
OM: but I need the june24/warp stuff before merging 360
[so the idea is: gpucpp is the main branch, june24 should go into it, and 360 depends on june24]
DM: so the idea is cudacpp as a plugin?
OM: yes
## Andrea
AV shows some slides
Clarification on branches: there should be a gpucpp_june24
Clarification on iconfig: are several channels already being tested, even in gpucpp?
To be discussed offline between AV and OM
## AOB
SR is looking at coupling ordering. He has some WIP where stop stop was crashing.
This is mainly in the python code of cudacpp and the mg5amcnlo codegen
AV: I never saw a stop stop crash, can you get a reproducer?
SR: there are three Gs, and it is crashing in one of them
AV: thanks, so it is a different iconfig
Next meeting: 23 July?
SR/AV/DM should be ok
ZW will be absent
OM: 24-25 would be easier than 23
SR: then 30 July... ok, that looks better