Madgraph5 GPU development

Name: Madgraph5 GPU development
Start: 2023-01-09T15:00:00+01:00
End: 2023-01-09T16:30:00+01:00
Location: CERN

Monday 9 Jan 2023, 15:00 → 16:30 Europe/Zurich

513/R-070 - Openlab Space (CERN)

Stefan Roiser

stefan.roiser@cern.ch

+41 75 4115334

# Madgraph4gpu dev meeting Mon 09.01.2023

Present: SR, SH, OM, AV, NN, TC, WH

Happy new year!

## Olivier

Working on a new branch where an external file can be used instead of the run card.
It will also be read at build time, but you can still change the default values at runtime.

## Taylor

Started some scalability tests on the Polaris machine, using 500 nodes, each with 4 GPUs.
Used MPI to distribute to the RAM disks of the different nodes.
Seems to scale nicely with the number of events.
Will try a statically built executable, as otherwise all executables
access the same shared library on the central filesystem, causing locks.
OM: note we write a lot to /tmp, so better make sure that is on RAM too.

SH: reading the shared lib should only be at startup, should not be an issue in production
TC: correct
SH: could even do a first run just to warm up the file cache, and do measurements on a second run
OM: there are also some mechanisms in madgraph for distributing processes on many nodes
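
SH's suggestion of a warm-up run followed by a measured run could be sketched roughly as below. This is only an illustration: the helper names `time_command` and `measure_with_warmup` are hypothetical, and the command shown is a trivial stand-in, not the actual madgraph4gpu executable.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def measure_with_warmup(cmd, repeats=3):
    """Run cmd once to warm the file cache, then time `repeats` further runs.

    Returns (warmup_seconds, list_of_measured_seconds). The warm-up time is
    returned separately so that startup costs such as loading a shared
    library from a cold cache can be compared with the later runs.
    """
    warmup = time_command(cmd)
    timings = [time_command(cmd) for _ in range(repeats)]
    return warmup, timings


if __name__ == "__main__":
    # Stand-in command (assumption): replace with the real executable and arguments.
    cmd = [sys.executable, "-c", "pass"]
    warmup, timings = measure_with_warmup(cmd)
    print(f"warm-up run: {warmup:.3f} s")
    print("measured runs:", ", ".join(f"{t:.3f} s" for t in timings))
```

A large gap between the warm-up time and the measured times would point to startup effects rather than to the event processing itself.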

## Walter

Discussed with the ATLAS SUSY conveners: there is one upcoming large production,
with a scan over the masses of two particles.
Tried to reproduce the LHE files for the processes, but got different files
even with the same random seeds (standalone MadGraph and MadGraph in Athena
do not give the same files, although this is all in Fortran)... need to do some debugging.
Got the same physics distributions, but not the same events.
Maybe Athena overrides the random numbers? (Or builds differently? Or uses different physics parameters?)
Will talk to Rae in Argonne (she was at the meeting we had with ATLAS a few weeks ago).

SR: which process is this exactly?
WH: using a process that is not public yet; asked the SUSY conveners, but will share it
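
For the debugging mentioned above, a first check could be to compare two LHE files event by event. The sketch below is a minimal illustration, not the procedure actually used: `read_events` and `compare_events` are hypothetical helpers, and the comparison is on exact text, so files that differ only in floating-point formatting would be reported as different.

```python
import re
from collections import Counter

# Matches the body of each <event> ... </event> block in an LHE file.
EVENT_RE = re.compile(r"<event[^>]*>(.*?)</event>", re.DOTALL)


def read_events(lhe_text):
    """Return each <event> block as a tuple of whitespace-normalized lines."""
    events = []
    for body in EVENT_RE.findall(lhe_text):
        lines = tuple(
            " ".join(line.split())
            for line in body.strip().splitlines()
            if line.strip()
        )
        events.append(lines)
    return events


def compare_events(text_a, text_b):
    """Count events that match at the same position and in any order."""
    a, b = read_events(text_a), read_events(text_b)
    same_position = sum(1 for x, y in zip(a, b) if x == y)
    # Multiset intersection: events present in both files, ignoring order.
    shared_any_order = sum((Counter(a) & Counter(b)).values())
    return {
        "n_a": len(a),
        "n_b": len(b),
        "same_position": same_position,
        "shared_any_order": shared_any_order,
    }
```

If `shared_any_order` is high while `same_position` is low, the two runs produced the same events in a different order; if both are zero, the random number sequences (or the inputs) most likely differ.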

## Nathan

At the last meeting we had discussed using Thrust on CUDA devices with SYCL.
Work in progress, will present results at the next meeting.

Also worked on vectorization using SYCL vector intrinsics.
This uses the host vectorization or emulates it.
Seems promising, but more performance tests are needed.
Will now be able to use different vectorization levels.

SR: did you discuss with Jorgen? He had found that there is a switch to use a single core.
NN: yes, also found this out, will make some detailed plots.
But please tell Jorgen to send me the exact flag to do this.

AV: just to be sure, you are concentrating on SYCL and not Kokkos?
TC: yes

## Stephan

Looking at computing the maximum weight on the GPU.

## Stefan

We have been accepted for one oral and one poster at CHEP.
The oral is on MadGraph and the poster is on Zenny's work.
AV: please follow up with Tim Smith about CERN IT

## Andrea

Andrea shows some slides.

## AOB

Next meeting January 23rd.

- 15:00 → 15:10: News (10m)
- 15:10 → 15:30: Topical discussion (20m)
  - 20230109-MGonGPU-helcol-AV-v002.pdf
  - 20230109-MGonGPU-helcol-AV-v002.pptx
- 15:30 → 15:50: Round table (20m)
- 15:50 → 16:00: AoB (10m)