
Madgraph5 GPU development
Europe/Zurich
Room: 513/1-024 (CERN)
Videoconference: Zoom, Meeting ID 63816708295 (host: Stefan Roiser)
Madgraph on GPU dev meeting Tue 17 Sep 2024
https://indico.cern.ch/event/1355161/
Present: OM, SR, ZW, AV (notes), AT, TC

## AT

AT: CV cannot join today, apologies

AT: very busy with university, had no time to make progress yet, will come back to you
OM: we can schedule a meeting also with SR and AV if you want
AT: that would be good

## ZW

(1) Fixed many things in reweighting, merged it recently
It is in my forks of madgraph4gpu

AV: can you make a presentation at some point?
ZW: yes absolutely
ZW does a short demo interactively
ZW generally tests this with l+l- to l+l- because there are multiple subprocesses
The cards need a 'change gpucpp True' to enable ZW's reweighting
Then in the script he adds several parameter changes, with a launch after each one of them
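
As a rough illustration (not ZW's actual card), a reweighting setup along these lines might look as follows, assuming the standard mg5amcnlo reweight_card syntax; the 'change gpucpp True' line is taken from the demo, while the parameter name and values are purely illustrative:

```
# reweight_card.dat -- illustrative sketch only, not ZW's actual card
change gpucpp True       # enable the cudacpp reweighting path (as shown in the demo)
launch                   # first reweighting point
set aEWM1 130.0          # hypothetical parameter change
launch                   # second reweighting point
set aEWM1 125.0
```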

OM: quite impressive! and so much faster than the original version!

AV: very nice! can you write down some doc?
ZW: yes will do, and in two weeks will show a couple of slides

SR: code generation and everything else is ready?
ZW: yes everything is ready

(2) Also discussing with OM and Marco Zaro about NLO

AV: is the vectorizingNLO branch in mg5amcnlo your work? this appeared recently
ZW: yes this is what we are doing with MZ, "simulating SIMD"

SR: still struggling with bugs here?
ZW: no, with MZ we fixed the bugs we had

## SR

SR: Stumbled across a build failure #1004 with vector type references
SR/AV: will look at it

SR: what I really wanted to do here was to look at going to many particles in the final state
This uses C++11 features for pre-instantiating templates into separate object files
AV: the end result may be similar to what I did with helinl=l,
but it is very nice to have a different approach using C++11 template features
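
As a rough sketch of the C++11 mechanism presumably being referred to: explicit instantiation declarations ("extern template") allow a template specialization to be compiled once into a separate object file instead of being re-instantiated in every translation unit. All names below (HelAmps, matrixElement, NPAR) are illustrative assumptions, not actual madgraph4gpu code:

```cpp
#include <cstdio> // single-file sketch; in practice these pieces live in separate files

// --- HelAmps.h (shared header, illustrative) ---
template <typename FPType, int NPAR>
struct HelAmps
{
  // placeholder for a templated helicity-amplitude / matrix-element kernel
  static FPType matrixElement( const FPType* momenta )
  {
    FPType sum = 0;
    for( int i = 0; i < NPAR * 4; ++i ) sum += momenta[i]; // dummy computation
    return sum;
  }
};
// C++11 explicit instantiation *declaration*: translation units including this
// header link against a pre-built object instead of instantiating the template.
extern template struct HelAmps<double, 6>;

// --- HelAmps_d6.cc (separate translation unit, i.e. a separate object file) ---
// Explicit instantiation *definition*: this is where the code is actually emitted.
template struct HelAmps<double, 6>;

// --- main.cc (user code) ---
int main()
{
  double momenta[6 * 4] = { 0 };
  std::printf( "%f\n", HelAmps<double, 6>::matrixElement( momenta ) );
  return 0;
}
```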

## TC

TC: not much to report
Nathan started a staff position so has many other commitments
TC is discussing with NN how to get what he did into the repo

## AV

AV shows the slides attached.

Discussion about the options for packaging.
OM: option 3 is also interesting.
We could maintain a (new, renamed, cleaned-up) madgraph4gpu that contains mg5amcnlo.
Then mg5amcnlo users would download a ~tarball of cudacpp from the madgraph4gpu repo, just like other plugins/models do now.
AV: very interesting discussion, you are convincing me that option 3 may be the easiest,
as it requires less work with respect to now, and it also does not preclude options 1 and 2.
OM: note, presently models/plugins just download the latest available, here we should be a bit more precise
AV: very nice, this means we can have a specific mg5amcnlo commit as a submodule in madgraph4gpu,
but then also a specific commit of madgraph4gpu to identify a ~tarball to download in mg5amcnlo;
this makes the bidirectional dependency better controlled.
Agreed: AV will look more at option 3 concrete scenarios, OM will look at creating a database of versions.
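
As a rough sketch of what the commit pinning on the madgraph4gpu side of option 3 could look like, assuming git submodules are used; the repository URL and submodule path are illustrative assumptions:

```
# Hypothetical sketch: pin a specific mg5amcnlo commit as a submodule of madgraph4gpu
git submodule add https://github.com/mg5amcnlo/mg5amcnlo.git MG5aMC/mg5amcnlo
cd MG5aMC/mg5amcnlo
git checkout <mg5amcnlo-commit-sha>   # the specific commit that cudacpp depends on
cd -
git add MG5aMC/mg5amcnlo
git commit -m "Pin mg5amcnlo submodule to a specific commit"
```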

Discussion about the DY+4j preliminary results.
AV: this is the first time we see a DY+4j speedup from SIMD, and it is quite nice
AV: This is why it would be useful to have the multi backend gridpacks, and also the profiling infrastructure.
Discussion: this profiling infrastructure is complementary to flamegraphs; both are very useful in different ways
(e.g. flamegraphs for a first detailed look with no pre-categorization, instrumentation for systematic cuda/simd/fortran comparisons).

## OM

OM: worked on the points described by AV.

OM: working with a new student on reducing the matrix1.f files.

OM: also, there is a meeting at CERN in Feb 2025, will forward the email.

OM: checked the mismatch of xsec between sde=1 and sde=2, did not find any real bug.
It is quite clear however that we should use sde=1.
