Madgraph5 Code Generation Workshop

Europe/Zurich
Virtual (Zoom)

# Thu 16.09.2021

Olivier's code generation workshop
https://indico.cern.ch/event/1061524/

Notes from the workshop
Present: OM, SR, AV, DavidS, TaylorC, LaurenceF (partly), WalterH (partly), CarlVuosalo (partly)
Apologies: StephanH

Q/SR: ready for Fortran/C++ interface?
A/OM: for the moment we focus on standalone C++/CUDA

Q/AV: why call it aloha and madgraph, is the plan to package it separately?
A/OM: in the past aloha was packaged separately; now it is all together
Q/AV: why CPPProcess in madgraph and not aloha?
A/OM: also aloha was initially just the replacement of HELAS (which was hardcoded)

Q/AV: flags same for CPPProcess and HelAmps? eg O3 and fastmath
A/OM: O3/fastmath is actually ok for HelAmps, but I feel against it for CPPProcess (and certainly for Parameters)

## Step1 plugin

We create a PLUGIN (which is easier than modifying the madgraph code itself).

Note that all exporters derive from madgraph/iolibs/export_v4.py
The Fortran classes (one for Powheg, one for MadWeight, one for standalone etc) are actually inside V4.
Note that SA (standalone) and MadEvent are different (and there is a third MadEvent variant with some optimizations).

Q/AV: but so for C++/CUDA should we imagine we have different ones for SA and MadEvent eventually?...
A/OM: one difference is the multichannel weight, only needed for MadEvent
A/OM: another difference is tracking information of color flow, only in MadEvent, needed for parton shower
(just because PS is typically not done in the SA version)

Q/JMF: but so SA is what exactly, LO?
A/OM: the goal is really one single ME value (which we have then modified with a loop over MEs)
A/OM: also the SA version is used for instance in pythia if you want to use the pythia framework
(pythia drives the whole process, and then you just compute one ME at a time using madgraph SA code)

Note that ProcessExporterFortran does not have an output (it is essentially just a base class).

Q/TC: for kokkos do you suggest I start from c++?
A/OM: no maybe better start from the GPU version 'epoch2' (ProcessExporterGPU)

C/AV: personally I plan to write the code in c++/cuda, generate, have a diff script, iterate

Q/AV: why a plugin tgz, is the gpu code not in bazaar?
A/OM: it is a template for modifications (something essentially similar exists in ProcessExporterGPU)

Q/AV: do you have progress on 'epoch3' already? or should we start assuming you did not?
A/OM: my 'epoch3' work is very old, there is no vectorization, no other code for code generation
(I only have some code for Fortran phase space integration, might merge it eventually)
(there are some classes etc in the code that were foreseen but are not used)

... try out step1 ourselves ...

C/AV: Note that gitlab epoch2 is already a bit more recent, there are some diffs

## Step2 model handling

This is ALOHAWriterForGPU
And UFOModelConverterGPU

Note a big difference:
- the xxx functions are just copied
- the others are processed through model handling

The ixxx are here: aloha/template_files/gpu/helas.cu
Note that i/o are (incoming/outgoing) fermions, s is scalar, v is vector, t is tensor (spin 2).

Q/AV: there were also the ipzxxx etc, for mass 0, relevant for eemumu: also for ggttgg?
A/OM: no, probably not relevant for LHC, even for light quarks, because the FFV propagator functions are called much more often
(xxx once per external particle, FFV once per propagator)
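
This call pattern looks roughly as follows in the generated code; the sketch below uses hypothetical stub bodies and simplified signatures (only the function names follow the generated eemumu code):

```cpp
// Hypothetical stubs, NOT the real HelAmps signatures: only the function
// names follow the generated eemumu code. The point is the call count:
// one xxxxx call per external particle, one FFV call per vertex/propagator,
// so FFV dominates for larger LHC processes like gg->ttgg.
#include <array>
#include <complex>
using cxtype = std::complex<double>;
using wf = std::array<cxtype, 6>; // one wavefunction

void ixxxxx(const double* p, wf& w) { w.fill(cxtype(p[0], 0.)); } // stub
void oxxxxx(const double* p, wf& w) { w.fill(cxtype(p[0], 0.)); } // stub
void FFV1P0_3(const wf& f1, const wf& f2, cxtype g, wf& v) { v[0] = g * f1[0] * f2[0]; } // stub
void FFV1_0(const wf& f1, const wf& f2, const wf& v, cxtype g, cxtype& amp) { amp = g * f1[0] * f2[0] * v[0]; } // stub

void calculate_wavefunctions(const double p[4][4], cxtype g, cxtype& amp)
{
  wf w[5];
  oxxxxx(p[0], w[0]); // e+  : xxxxx once per external particle (4 calls here)
  ixxxxx(p[1], w[1]); // e-
  ixxxxx(p[2], w[2]); // mu-
  oxxxxx(p[3], w[3]); // mu+
  FFV1P0_3(w[1], w[0], g, w[4]);    // internal photon: FFV once per propagator
  FFV1_0(w[2], w[3], w[4], g, amp); // FFV once per vertex
}
```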

Q/AV: just to be sure, if I wanted to use ipzxxx and not ixxxxx in CPPProcess, can I automate this, or is it not foreseen?
A/OM: (did not want to go that deep, will comment later; this concerns how functions are called rather than how functions are built)

There is a lot of relevant stuff in aloha/aloha_writers.py
This is where you can redefine + and * for instance
C/AV: for vectorization this may not be needed, as + and * are redefined on the types themselves

Q/AV: if we redefine a constant, eg as vector 1 rather than scalar 1, does that go in aloha? or ufo?
A/OM: normally it would be in aloha, not in ufo
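
A minimal sketch of this point, with an illustrative (not actual) fptype_v type: once + and * are overloaded and the constructor broadcasts scalars, the generated expressions compile unchanged for scalar and vector types:

```cpp
// Illustrative SIMD-style type (not the actual code): with + and * overloaded
// and a broadcasting constructor, the ALOHA-generated expressions and their
// scalar constants (eg 1) compile unchanged for both scalar and vector types.
#include <cstddef>
struct fptype_v
{
  static constexpr std::size_t N = 4;
  double v[N];
  fptype_v() = default;
  fptype_v(double x) { for (std::size_t i = 0; i < N; i++) v[i] = x; } // broadcast: 1 -> (1,1,1,1)
};
inline fptype_v operator+(const fptype_v& a, const fptype_v& b)
{ fptype_v r; for (std::size_t i = 0; i < fptype_v::N; i++) r.v[i] = a.v[i] + b.v[i]; return r; }
inline fptype_v operator*(const fptype_v& a, const fptype_v& b)
{ fptype_v r; for (std::size_t i = 0; i < fptype_v::N; i++) r.v[i] = a.v[i] * b.v[i]; return r; }

// A generated expression then works for T=double and T=fptype_v alike:
template <typename T> T denom_like(const T& a, const T& b) { return a * b + 1.; }
```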

Q/SR: the ffv namings come from where?
A/OM: from ufo, normally ffv means fermion fermion vector vertex, should all be fixed

Q/AV: can we mix together the declaration (cxtype V) and the implementation (V = a sum...)?
A/OM: yes, remove from get_declaration and add to define_expression
C/AV: with the compiler this is maybe not relevant, but you never know
C/TC: yes, also with kokkos I prefer to avoid it, to make sure you do not get a kokkos constructor initializing to 0...
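
In terms of the emitted C++, the change looks roughly like this (the names V, a, b are made up for illustration; get_declaration and define_expression are the aloha writer methods mentioned above):

```cpp
// Illustrative shape of the generated code before/after the change:
#include <complex>
using cxtype = std::complex<double>;

cxtype before(const cxtype& a, const cxtype& b)
{
  cxtype V;  // from get_declaration: default-constructed first...
  V = a + b; // ...then assigned in define_expression
  return V;
}

cxtype after(const cxtype& a, const cxtype& b)
{
  cxtype V = a + b; // declaration and expression merged in define_expression
  return V;
}
```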

## Step3 change function call

Q/AV: IF (if) I want to remove the ifdefs, is there an easy way to have a single writer for cpu and gpu?
(we can always have a hierarchy with very small overloads, but maybe something lighter?)
A/OM: maybe try "output standalone_andrea --mode=cpu" to distinguish between cpp and cuda
Q/SR: maybe could use this also for precision for instance?
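
For reference, the ifdefs in question look roughly like this in the generated cudacpp code (simplified signatures; sigmaKin is the real kernel name, the bodies are elided):

```cpp
using fptype = double;
#ifdef __CUDACC__
__global__ void sigmaKin(const fptype* momenta, fptype* MEs)
{ /* one event per GPU thread */ }
#else
void sigmaKin(const fptype* momenta, fptype* MEs, const int nevt)
{ /* loop over nevt events on the CPU */ }
#endif
```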

OM: HelasCallWriter is a bit of a mess also for me, because there is a lot of caching
In particular I hope you do not need to change generate_helas_call, it is very complex...

Q/AV: is this where helicity recycling is done?
A/OM: no it is done outside of this (but it could have been done here)

C/TC: note the cxtype(0,1) are now constants in kokkos
C/AV: also in my cuda/c++ they are constants now
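
A sketch of such a constant (the name cI is illustrative, and the real cuda/c++ code uses its own cxtype rather than std::complex):

```cpp
// Define the imaginary unit once (C++14 constexpr) instead of constructing
// cxtype(0,1) inline at every use in the generated expressions.
#include <complex>
using cxtype = std::complex<double>;
constexpr cxtype cI( 0., 1. );

cxtype multiplyByI( const cxtype& a ) { return cI * a; } // example use
```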

Q/AV to TC: for kokkos could you use fptype and cxtype and redefine them as kokkos types?
A/TC: yes this is something that I could do, I will look at something like this
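
A sketch of that suggestion (the KOKKOS_BACKEND macro is made up here; Kokkos::complex is the real Kokkos type): the generated code only ever sees fptype/cxtype, and one header picks the backend types:

```cpp
// Backend-dependent type aliases; the generated code is written purely in
// terms of fptype and cxtype.
#ifdef KOKKOS_BACKEND
#include <Kokkos_Complex.hpp>
using fptype = double;
using cxtype = Kokkos::complex<fptype>;
#else
#include <complex>
using fptype = double;
using cxtype = std::complex<fptype>;
#endif
```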

## AOB

Q/TC: how do we add madevent etc? in cuda we have main with kernels and data, in kokkos it is hidden
A/AV: am planning something similar for cuda/c++, it cannot be in main, it must be encapsulated
(had also started to do some work on this but not finished)
(and note this is orthogonal to the code generation discussion)
A/OM: also defining the exact inputs and outputs, the interface which Stefan is asking for
C/AV: I would aim first for code generation with c++ main and then in a second step the fortran
A/SR: sounds good
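
A sketch of what such encapsulation could look like (class and method names are hypothetical, not an agreed interface): buffers and kernel launches live behind a class, so a Fortran or C++ driver can call it without owning a main() or any GPU details:

```cpp
#include <vector>
using fptype = double;

class MatrixElementCalculator // hypothetical name, not an agreed interface
{
public:
  explicit MatrixElementCalculator( int nevt )
    : m_nevt( nevt ), m_momenta( nevt * 4 * 4 ) {} // 4 particles x 4-momentum (eemumu-like)
  fptype* momenta() { return m_momenta.data(); }   // the driver fills momenta here
  void computeMEs( fptype* mes )                   // would launch sigmaKin in the CUDA case
  {
    for ( int i = 0; i < m_nevt; i++ ) mes[i] = 0.; // placeholder computation
  }
private:
  int m_nevt;
  std::vector<fptype> m_momenta; // would be a device buffer in the CUDA case
};
```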

Q/SR: can I propose that we try things first, but define a final structure later on?
A/OM: yes get hands dirty, then we decide later
A/AV: yes this is iterative, agree with this

Q/SR: also, should we change the development model: do we change the code generator and then regenerate?
A/AV: suggest epochX, change/commit c++/cuda, backport to codegen, make sure it is the same, iterate
C/SR: ok sounds good, we have both committed code and code gen code

Discussion: code gen code where?
C/SR: have the code gen code only in common (eg shared between cuda/c++ and kokkos)
C/AV: no, INITIALLY have the code gen code in each directory (master branch)
To be discussed.

 
