11–15 Mar 2024
Charles B. Wang Center, Stony Brook University
US/Eastern timezone

Beyond Language: Foundation Models for Collider Physics Data

12 Mar 2024, 12:30
20m
Lecture Hall 2 (Charles B. Wang Center, Stony Brook University)

100 Circle Rd, Stony Brook, NY 11794
Oral Track 2: Data Analysis - Algorithms and Tools

Speaker

Anna Hallin (University of Hamburg)

Description

Foundation models have revolutionized natural language processing, demonstrating exceptional capabilities in handling sequential data. Their ability to generalize across tasks and datasets offers promising applications in high energy physics (HEP). However, unlike language, collider physics data combines continuous and discrete quantities, such as four-vectors, particle IDs, and charges. Additionally, the particles have no inherent ordering, so a representation of them should be permutation invariant, which is fundamentally different from natural language, where word order carries meaning. To address these challenges, we investigate various embedding schemes and techniques that introduce physical biases into the framework. Our findings provide valuable insights into the incorporation of foundation models into the HEP domain.
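
The two constraints named above (mixed continuous and discrete inputs, and permutation invariance) can be illustrated with a small toy sketch. It is not the embedding scheme studied in this contribution, which the abstract does not specify. All names, sizes, weights, and particle values below are hypothetical. The sketch assumes one simple choice: a linear projection of the four-vector is added to a lookup-table vector for the particle ID, and the per-particle embeddings are summed.

```python
import math
import random

# Hypothetical toy setup: none of these names or sizes come from the talk.
EMBED_DIM = 4
PARTICLE_IDS = [11, 13, 22, 211]  # PDG codes: electron, muon, photon, charged pion

rng = random.Random(0)
# Lookup table for the discrete feature (particle ID -> vector).
id_table = {
    pid: [rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for pid in PARTICLE_IDS
}
# Linear projection for the continuous feature (four-vector -> vector).
proj = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(EMBED_DIM)]


def embed_particle(four_vector, pid):
    """Embed one particle as projection(four_vector) + id_table[pid]."""
    continuous = [sum(w * x for w, x in zip(row, four_vector)) for row in proj]
    return [c + d for c, d in zip(continuous, id_table[pid])]


def embed_jet(particles):
    """Sum-pool the per-particle embeddings; a sum ignores input order."""
    embedded = [embed_particle(p4, pid) for p4, pid in particles]
    # math.fsum is correctly rounded, so the result is exactly order independent.
    return [math.fsum(column) for column in zip(*embedded)]


# Toy jet: ((E, px, py, pz), particle ID). The values are made up.
jet = [
    ((50.0, 30.0, 20.0, 34.0), 211),
    ((12.0, 5.0, -3.0, 10.0), 22),
    ((8.0, -2.0, 1.0, 7.5), 11),
]
reordered = list(reversed(jet))

original_embedding = embed_jet(jet)
reordered_embedding = embed_jet(reordered)
```

Here `original_embedding` and `reordered_embedding` agree because summation does not depend on the order of its inputs. A sequence model that reads the particles in a fixed order would generally not have this property without additional design choices.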

Significance

Although foundation models are already widely used in NLP, there is still more research to be done on their application in HEP. Currently, the HEP community is primarily investigating ways to encode collider physics data such that it can serve as a basis for a variety of tasks. We provide studies and insights at this frontier with our work on jet physics.

Primary authors

Anna Hallin (University of Hamburg)
Gregor Kasieczka (Hamburg University (DE))
Joschka Valentin Maria Birk
