Description
Analysis of collision data often involves training deep learning classifiers on very specific tasks and in regions of phase space where the training datasets have limited statistics. Models pre-trained on a larger, more generic sample may already have a useful representation of collider data that can be leveraged by many independent downstream analysis tasks. We introduce a class of pre-trained neural network models that can be fine-tuned for specific collider event classification tasks. These models are based on a graph neural network architecture and have been trained on a large dataset of diverse simulated collision events for various classification and regression tasks. Our findings demonstrate that, when fine-tuned for a new analysis task, the pre-trained model can outperform a classification model trained directly for that specific task. The improvement is particularly significant when the training sample for the downstream analysis task has limited statistics. In several tests, the pre-trained model also exhibits faster convergence during training, offering the potential to reduce overall time and energy consumption in scenarios that require repeated model training. Additionally, we present studies on the similarity of representations between the pre-trained model and models trained directly for the final analysis tasks.
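The abstract does not give implementation details, so the following is only a minimal sketch of the pre-train-then-fine-tune workflow it describes, written in plain PyTorch with a hand-rolled message-passing layer. All names, dimensions, learning rates, and the checkpoint file are illustrative assumptions, not taken from the talk.

```python
# Hedged sketch of fine-tuning a pre-trained GNN backbone for a new
# event-classification task; not the authors' code.
import torch
import torch.nn as nn

class SimpleMPLayer(nn.Module):
    """One round of message passing: aggregate neighbour features, update nodes."""
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(2 * dim, dim)
        self.upd = nn.Linear(2 * dim, dim)

    def forward(self, x, edge_index):
        src, dst = edge_index                              # (2, n_edges) index tensors
        m = torch.relu(self.msg(torch.cat([x[src], x[dst]], dim=-1)))
        agg = torch.zeros_like(x).index_add_(0, dst, m)    # sum messages per node
        return torch.relu(self.upd(torch.cat([x, agg], dim=-1)))

class PretrainedBackbone(nn.Module):
    """Generic event representation, assumed pre-trained on a large diverse sample."""
    def __init__(self, in_dim, hidden=64, layers=3):
        super().__init__()
        self.embed = nn.Linear(in_dim, hidden)
        self.mp = nn.ModuleList(SimpleMPLayer(hidden) for _ in range(layers))

    def forward(self, x, edge_index):
        h = torch.relu(self.embed(x))
        for layer in self.mp:
            h = layer(h, edge_index)
        return h.mean(dim=0)                               # event-level embedding

class FineTunedClassifier(nn.Module):
    """Pre-trained backbone plus a fresh head for the downstream analysis task."""
    def __init__(self, backbone, hidden, n_classes):
        super().__init__()
        self.backbone = backbone
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x, edge_index):
        return self.head(self.backbone(x, edge_index))

backbone = PretrainedBackbone(in_dim=6)
# backbone.load_state_dict(torch.load("pretrained_backbone.pt"))  # hypothetical checkpoint
model = FineTunedClassifier(backbone, hidden=64, n_classes=2)

# A common fine-tuning choice (illustrative): smaller learning rate for the
# pre-trained weights than for the newly initialised head.
optimizer = torch.optim.Adam([
    {"params": model.backbone.parameters(), "lr": 1e-4},
    {"params": model.head.parameters(), "lr": 1e-3},
])
```

In this kind of setup the backbone can also be frozen entirely when the downstream sample is very small; whether that or full fine-tuning is preferable is exactly the sort of question the statistics-dependent comparison in the abstract addresses.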
Track: Tagging (Classification)