# MGonGPU dev meeting Wed 08.06.2022
Present: SR, OM, AV (notes), NN, TC, CV, WH
## Taylor - skeleton of slides for ICHEP
Taylor shows a skeleton for the slides for ICHEP,
https://docs.google.com/presentation/d/1TDNqpHY5GBj1FLnYst0lwA9QfbmJ9JbmwF4yWn2ef4w/edit?usp=sharing
AV: would mention upfront that we have two lines of development, cudacpp and abstraction layers,
and would also mention that we plan to integrate with madevent
WH/NN: agree that we should compare with cuda
OM: agree that it is important to mention the madevent integration, just that we do not have enough time
TC: true we have little time but we can touch all of these subjects
AV: agree we can mention the main messages for all these things
OM: the two main messages are
- abstraction layers are performant
- we have a way to integrate
AV: I would also strongly mention that lockstep processing is fit for event generation
WH: have metrics for that?
AV: for lockstep on vectorization and GPUs yes
OM: true, but who do you want to convince?
AV: other generator teams, Sherpa, but also Whizard, KoralW etc...
TC: choice of processes? eemumu and ggtt?
AV: would show one simple process (eemumu? ggtt?) and one complex process (ggttgg? ggttggg?)
NN: for abstraction layers we have all 5 processes, and we see that for eemumu abstraction layers look better,
while for complex processes direct cpp seems better (OpenMP)
AV: would mention several dimensions of speedup, vectorization, multithreading, GPU and multi-architecture port
If we only show the 'maximum' speedup, it is difficult to describe the individual contributions.
More in detail: comparing the max for GPU is ok, but for CPU it is difficult to disentangle MT and vectorization.
TC: agree, but we should give a message to the HEP community about usability of the layers
AV: yes, but we can say we do not understand some things yet (e.g. MT and vectorization on CPU)
TC: ok agree
NN shows some plots for thread scaling on CPU
AV: do not understand the increase above 2^8 (also, check.exe CL arguments should have no impact?)...
better understood is the peak at 2^6 and the subsequent drop when overcommitting via OMP_NUM_THREADS
AV/TC long discussion about vectorization...
TC: do we have in the code a specific use of vectorization?
AV: yes, this is the neppV in the code, but also the -mhaswell build flags
Maybe you can use the build flags in sycl and see if it gives any benefit from autovectorization?
TC another option is that in the talk we only show the GPU results?
Discussion on GPU, we seem to agree, all good.
Discussion on CPU, a bit more complex to decide what to show.
TC: there are two good discussions, usability and ability to exploit fully the hardware.
WH: agree two different discussions, GPU and CPU
SR: we could give two messages here, knowing that WLCG resources are now mainly CPU
For CPU we can say we are able to leverage vectorization
For GPU we can use the abstraction layers
TC: we also have some mpirun plots
NN: ggtt does not run on alpaka
AV: let's use eemumu as the simple process and ggttgg as the complex process
AV: stress that I am not convinced so far that we have any SIMD vectorization in abstraction layers
(at least not in the code we have now)... if you could show that, it would be great
## Andrea
Presents some slides
OM: you can try to use eemumu which does not have color
OM: note that the move to 340 is done, but this will be within the "311 branch".
If you use the 311 branch of github, you see it says "madgraph 340"!
## Round table
OM: progressing on colors
TC/NN/WH: nothing to add
CV: trying to compile sycl code and having some issues
What is a sycl compiler, do we need to build it ourselves? Is clang++ ok?
NN: it must be the sycl build of clang, they have a branch
CV: also submitted a pull request for alpaka
## AOB
Next meeting: Mon 13 June agreed
TC: will try to get some plots in the slides by the end of the week
AV: can I just add some message to the slides?