6–10 Nov 2023
DESY
Europe/Zurich timezone

Masked particle modelling

6 Nov 2023, 17:45
15m
Main Auditorium (DESY)

Speaker

Mr Matthew Leigh (University of Geneva)

Description

The BERT pretraining paradigm has proven highly effective in many domains, including natural language processing, image processing and biology. Applying it requires the data to be described as a set of tokens, each of which needs to be labelled. To date, the BERT paradigm has not been explored in the context of high-energy physics (HEP). The samples that make up HEP data can be described as sets of particles (tokens), where each particle is represented as a continuous vector. We explore different approaches for discretising/labelling particles so that BERT pretraining can be performed, and demonstrate the utility of the resulting pretrained models on common downstream HEP tasks.
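
As a rough illustration of what labelling continuous particles could look like, the sketch below builds a small codebook with plain k-means, assigns each particle the index of its nearest centroid, and hides a random subset of particles. This is only one conceivable discretisation and is not taken from the talk: the choice of k-means, the zero-vector masking, the 15% mask fraction, and the names `fit_codebook`, `tokenize` and `mask_particles` are all assumptions made for illustration.

```python
import numpy as np


def tokenize(particles, centroids):
    """Label each particle with the index of its nearest centroid."""
    # Squared Euclidean distance between every particle and every centroid.
    d2 = ((particles[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)


def fit_codebook(particles, num_tokens, num_iters=10, seed=0):
    """Plain k-means over particle feature vectors (assumed discretiser)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from randomly chosen particles.
    idx = rng.choice(len(particles), num_tokens, replace=False)
    centroids = particles[idx].copy()
    for _ in range(num_iters):
        labels = tokenize(particles, centroids)
        for k in range(num_tokens):
            members = particles[labels == k]
            if len(members) > 0:  # leave empty clusters where they are
                centroids[k] = members.mean(axis=0)
    return centroids


def mask_particles(particles, mask_fraction=0.15, seed=0):
    """Zero out a random subset of particles; return the copy and the mask."""
    rng = np.random.default_rng(seed)
    n = len(particles)
    num_masked = max(1, int(round(mask_fraction * n)))
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(n, num_masked, replace=False)] = True
    masked = particles.copy()
    masked[mask] = 0.0  # stand-in for a learned mask embedding
    return masked, mask


# Toy sample: 20 particles with 3 continuous features each (random numbers).
rng = np.random.default_rng(42)
jet = rng.normal(size=(20, 3))

codebook = fit_codebook(jet, num_tokens=4)
targets = tokenize(jet, codebook)        # discrete label per particle
masked_jet, mask = mask_particles(jet)   # model input with hidden particles
```

In a setup like this, a model would receive `masked_jet` and be trained to predict `targets[mask]`, mirroring masked-token prediction in BERT.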

Authors

Johnny Raine (Universite de Geneve (CH))
Lukas Alexander Heinrich (Technische Universitat Munchen (DE))
Prof. Margarit Osadchy (University of Haifa)
Mr Matthew Leigh (University of Geneva)
Michael Kagan (SLAC National Accelerator Laboratory (US))
Samuel Byrne Klein (Universite de Geneve (CH))
Tobias Golling (Universite de Geneve (CH))
