Madgraph5 GPU development

513/R-070 - Openlab Space (CERN)

Stefan Roiser
# Madgraph dev meeting Tue 23.05.2023


Present: SR, JT, SH, ZW, AV

Remote: OM, WH, CV, TC

Excused: NN


## Pre-meeting


SH to OM: now that we have some gridpacks, how many events are there per channel?

OM: crossx.html is displayed during launch; if you click on the cross section you see the channels and the number of events per channel.

Could also remove splitting of invocations into separate jobs per channel.

SR: change in the runcard?

OM: try refine_event_by_job
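
A run card tweak along these lines might look as follows. This is a hedged sketch only: the parameter name is copied from the minutes, and its exact spelling, default value, and placement should be checked against the MG5aMC run_card documentation before use.

```
# run_card.dat (hypothetical excerpt)
 -1 = refine_event_by_job ! controls splitting of the refine step into
                          ! separate jobs per channel (-1 = default behaviour)
```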


## Round table


TC: was at CHEP and met some of us! a lot of interest in GPUs

NN cannot be here today but he is following up on various SYCL issues

Will try to switch to produce some documentation so that people can use the plugin on their own

Looked at some scaling results, provided one for CHEP showing oversubscription issues


TC to OM: any thoughts on NLO? For instance we could focus only on V+jets multijet.

OM: discussed this in the general madgraph meeting.

We can link to four loop libraries in Madgraph, typically tried in order

(then if all four fail we go to quadruple precision).

Probably the best strategy is to focus on the ninja library,

which is C++ and where we have good contacts with the author (Tiziano Peraro).

The other three are OpenLoops, Collier, CutTools.

SR: any of those groups look into GPU acceleration?

OM: no probably not

TC: the OpenLoops team does not seem interested in GPUs

AV: still maintained? saw last commits in 2017... OM: yes definitely supported

AV: needs FORM? it says so on the site... OM: no, FORM is not needed when using it in madgraph


WH: need to leave soon, what's the status with SUSY?

AV: will discuss a few issues in the slides, not sure if this is one of them

OM: maybe some issues have been fixed in the meantime


OM: also have an issue in the tt_gg process, the cross section is smaller

when using the vectorized code than when using the older code


CV: shows one slide about resources at the CHTC center in Madison, would these be useful?

AV: the A100s definitely look useful! 

OM: less clear about the P100 and RTX2080, probably not enough double precision performance?

SR: probably still useful to test there


CV: one of my colleagues discussed with SH at CHEP about running madgraph

SH: yes, this was a useful chat; gave her some details for running some gridpack pipelines


JT: worked on an abstraction layer to support both HIP and CUDA from cudacpp

Looks quite promising for the moment, need some AMD GPUs

OM: can you use LUMI?

JT: we had access but lost it, we asked again

OM: through madgraph I can certainly have access through Belgium, can ask for a project 

SR: if not a hassle, yes please go ahead

SR: we are also trying through openlab

TC: how does this compare to a portability framework like sycl?

JT: just a header supporting both HIP and CUDA


ZW: CHEP talk went well, people were interested in reweighting

Showed some results that I had produced the week before 

Now cleaning up the application so that it can be released

AV: silly question, why not PEPPER anymore?

ZW: the Sherpa team now use the name PEPPER (TC: the new name of BlockGen),

so ours is now called TREX


ZW: also looking at NLO and virtual contributions now

Next week will be in Lund and will discuss with Rikkert Frederix about this


SH: gave the talk at CHEP, people were interested in the results,

next question is always when can we use it,

and the question after that is when we can go beyond LO

OM: do not completely agree that LO cannot be used,

for instance for BSM, LO is definitely useful

ZW: tomorrow we have a meeting with CMS people to discuss using it


SH: also presented results showing unweighting on the GPU

to further speed up fortran, but we agreed this morning to wait 

for the other patches in the pipeline

OM: actually there are changes that can be done also in the fortran,

they do not need to be done in the GPU, so could do that independently

SH: we get a factor of two from not having to write the events

AV: SH's MR is on madgraph4gpu, are these changes in cudacpp or fortran?

SH: half half, the fortran changes are on one process,

these would need to be ported to code generation

However the implementation is only done in cudacpp,

if we want it also in Fortran we need to write it.

I can point you to what needs to be done in Fortran


SR: worked on two things

First, worked on python machinery for integrating the launch functionality

(touching makefiles and also the extra parameters that are needed). This is in a MR.

Second, generated gridpacks for CMS/Sapta and ATLAS/Zach,

people are playing with this at the moment.

OM: would put some parameters in, to avoid repeating in the input files

what is common to all channels

SR: ok will change the MR


SR: was also at the mcnet workshop (OM was too)

Sapta made a nice advertisement for our work

OM: MLM also made very good and supportive statements


## AV slides, WIP and open issues for the release


In #630, OM checked and saw that if vectorization is on there are issues

OM: blackbox is everything that is getting the scale for merging and matching

For instance ickkw=0 fixes this issue


In #616 maybe Zenny can fix the files manually for this process


## AOB


Next meeting June 6 (OM will probably be absent)


