Nov 4 – 8, 2024
LPNHE, Paris, France
Europe/Paris timezone

Introducing Aspen Open Jets: a real-world ML-ready dataset for jet physics

Nov 4, 2024, 3:10 PM
20m
LPNHE, Paris, France

Speaker

Ian Pang

Description

We present Aspen Open Jets, a dataset of 170M unlabelled jets derived from the 2016 CMS Open Data. We show how pre-training a foundation model on this dataset can reduce the need for expensive simulated datasets. The dataset includes event information, jet kinematics, jet tagging information, particle kinematics, displacement, charge, particle ID (PID), and PUPPI weights, and will be made available for further use by the community.

Authors

Dr Alexander Mück (RWTH Aachen University)
Anna Maria Cecilia Hallin (University of Hamburg)
Darius Faroughy (University of Zurich)
David Shih
Gregor Kasieczka (Hamburg University (DE))
Dr Humberto Reyes-González (RWTH Aachen)
Ian Pang
Joschka Birk (Hamburg University (DE))
Luca Anzalone
Michael Kramer (Rheinisch Westfaelische Tech. Hoch. (DE))
Oz Amram (Fermi National Accelerator Lab. (US))
