Speaker
Ian Pang
Description
We present Aspen Open Jets, a dataset of 170M unlabelled jets derived from the 2016 CMS Open Data. We show how pre-training a foundation model on this dataset can reduce the need for expensive simulated datasets. The dataset includes event information, jet kinematics, jet tagging information, particle kinematics, displacement, charge, PID and PUPPI weights, and it will be made available for further use by the community.
Authors
Dr Alexander Mück (RWTH Aachen University)
Anna Maria Cecilia Hallin (University of Hamburg)
Darius Faroughy (University of Zurich)
David Shih
Gregor Kasieczka (Hamburg University (DE))
Dr Humberto Reyes-González (RWTH Aachen)
Ian Pang
Joschka Birk (Hamburg University (DE))
Luca Anzalone
Michael Kramer (Rheinisch Westfaelische Tech. Hoch. (DE))
Oz Amram (Fermi National Accelerator Lab. (US))