Tue 27 Aug 2024 - Madgraph meeting with CMS
https://indico.cern.ch/event/1373475/
Present: SR, AV (took minutes), ZW, Jin, Robert, Sapta, OM (after 30 minutes)
Excused: OM (could join only later)
## Zenny / reweighting
ZW: have a framework working for cases with pdf but only a single ME amplitude
Now working on fixing the case with multiple ME amplitudes
Problem is in Zenny's reweighting code, not in cudacpp
For instance e+e- to e+e- or mu+mu- have different Feynman diagrams
## Jin
Jin shows his slides
Slide 6
- JC: only 16 cores with cuda, could we use 32?
AV: we have seen issues with both CPU memory and GPU memory
AV: but true, if you were able to increase nb_core it would probably go faster,
because we know that the bottleneck with CUDA MEs is the CPU non-ME part
- JC: also have many nodes with multiple GPUs, can we use these?
AV: two different problems, one is that cudacpp cannot use many GPUs in the same job, <=== todo
[after the meeting: AV commented in https://github.com/madgraph5/madgraph4gpu/issues/836]
two is that many jobs using one GPU each should each be able to choose which GPU to use. <=== todo
[after the meeting: AV opened https://github.com/madgraph5/madgraph4gpu/issues/989]
AV: the second should be doable with an env variable
SR: set the env variable in the shell before launching the job
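The env-variable approach mentioned above can be sketched as follows. This is a minimal illustration assuming the standard CUDA runtime variable CUDA_VISIBLE_DEVICES; the "job" here is just a subshell echoing what it sees, standing in for a real MG job:

```shell
# Minimal sketch: pin each job to one GPU by setting the standard CUDA
# runtime variable CUDA_VISIBLE_DEVICES in the shell before launching it.
# The job (a subshell here) then only sees the GPU it was given.
for gpu in 0 1; do
  CUDA_VISIBLE_DEVICES=$gpu sh -c 'echo "job sees GPU $CUDA_VISIBLE_DEVICES"'
done
```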
Slide 9
- JC: would be nice if we could use many GPUs in a gridpack production eg for tt+3jets
AV: interesting, this is problem number three, change the python/bash of MG to send jobs to multiple GPUs <=== todo
[after the meeting: AV opened https://github.com/madgraph5/madgraph4gpu/issues/990]
AV: in general, a lot of tuning has to be done by users, but MG must provide some tuning hooks,
this is one example where a tuning hook (use many GPUs for gridpack production) is missing
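In a first approximation, such a tuning hook could be a simple round-robin of subjobs over the GPUs on the node. The sketch below only prints the assignment (hypothetical job and GPU counts, not the actual MG python/bash machinery); a real hook would launch each subjob with the corresponding CUDA_VISIBLE_DEVICES:

```shell
# Hypothetical sketch: round-robin 8 gridpack subjobs over 4 GPUs.
# Only the assignment is printed; a real hook would launch each subjob
# with CUDA_VISIBLE_DEVICES set to its assigned GPU.
ngpu=4
for job in 0 1 2 3 4 5 6 7; do
  gpu=$((job % ngpu))
  echo "subjob $job -> GPU $gpu"
done
```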
Slide 10
- AV: question (due to my ignorance), are you using CMS specific settings that require high precision?
In my tests I have the impression that producing a gridpack takes me 1h in Fortran, not 24h
ZW: maybe check the cuts
JC: may depend on the pdf? AV: using the default, which should be faster than lhapdf
SB/JC: maybe best to compare card by card in the runcards <=== todo
Slide 3
- SR: probably not possible to request specific AVX512 through condor
JC: could go to low level condor and check if avx512
AV: unfortunately, note that AVX512 per se is not enough, should check if it has one or two FMA units
(typically Silver or Gold/Platinum Intel CPUs), and this is not even published in O/S variables,
will follow this up with WLCG anyway <=== todo [after the meeting: AV following up]
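On Linux, the AVX-512 part of the check can be done from the cpuinfo flags (avx512f); as noted above, the number of FMA units is not exposed there. A minimal sketch, using a hard-coded sample flags string (hypothetical CPU) so it runs anywhere; on a real worker node one would read /proc/cpuinfo instead:

```shell
# Sample flags line as it would appear in /proc/cpuinfo (hypothetical CPU);
# on a real node use:  flags=$(grep -m1 '^flags' /proc/cpuinfo)
flags="fpu sse sse2 avx avx2 fma avx512f avx512dq avx512cd"
case " $flags " in
  *" avx512f "*) echo "AVX-512 foundation available" ;;
  *)             echo "no AVX-512" ;;
esac
# Note: whether the CPU has one or two AVX-512 FMA units is NOT in these
# flags; it has to be inferred from the model (e.g. Xeon Gold/Platinum).
```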
Other points
- SR: tuning the number of events produced in a single job is important for GPU, must put it higher
AV: thanks, good point, is this hardcoded now?
OM: no, it should already be available in a runcard, will check <=== todo
Slides 11-13
- AV: very nice, but it would be good to also get results for event generation from DY+3j and tt+3j
JC: problem is that the gridpack production is slow
SR: with NextGen trigger we will have very powerful machines with 200+ cores, but probably not for CHEP
AV: one alternative (for tests! not physics production yet!) could be my suggested multi-backend gridpacks,
you compile/build for all backends, then optimise vegas with cuda (fastest!), but can compare evgen with cpp/fortran
OM: yes technically possible, the language used to run the gridpack is independent of the language used when it was generated
[after the meeting: this is https://github.com/madgraph5/madgraph4gpu/pull/948 but is very much in WIP, no progress]
Slide 14
- JC: it seems that multi-jet is not linearly scalable
OM: very strange
AV: could profile it... anyway seems to go in the right direction, throughput increases as you increase events
Discussion
- AV: so in general it looks like results are better than 2-3 weeks ago?
JC: yes, there seems to be
- SR: is CDR closed or not yet? deadline?
SB: not closed, so we can try to incorporate some of the CHEP results
SB: deadline was last week, but we can go on... we should keep in contact with Daniel
- SR to AV: you did a lot of improvements, would you plan to present them here in a next meeting?
AV: thanks, yes I can if people find it useful, I sent the dev slides to Sapta/Jin already
AV: anyway, we should discuss this afternoon first at the dev meeting and see what will be merged