Speaker
Description
Train foundation models on various tasks using supervised and unsupervised techniques (e.g., the approach followed to train large language models such as ChatGPT):
- Jet level: Allow a jet algorithm to self-discover patterns and physics properties in unlabelled data, yielding a pre-trained model that is not biased by data/Monte Carlo (MC) discrepancies. The final goal is a foundation model for jets that can be fine-tuned for different tasks with minimal data/MC disagreement, allowing better post-calibration performance.
- Event level: Build an event-level model starting from local clusters in the detector, train it on the Particle Flow reconstruction task, and adapt it to various tasks. Derive a set of task-specific algorithms from this main algorithm through a fine-tuning workflow that could run on a local computing cluster.
- Analysis level: Learn a high-level, object-based representation of CMS events to provide a foundation model for data analysis that can be fine-tuned for better event selection or related analysis tasks such as object assignment.
Supervised and unsupervised approaches will be studied.
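
The pre-train-then-fine-tune pattern described above can be illustrated with a minimal sketch. It is not the proposed method: the data are synthetic, a small linear model stands in for a real architecture, and all names (`make_toy_jets`, `pretrain_masked`, `finetune_head`) are hypothetical. The unsupervised step reconstructs randomly masked features without using labels; the supervised step then fits a small classification head on a limited labelled subset while keeping the pre-trained encoder frozen.

```python
import numpy as np

def make_toy_jets(n, n_features=8, n_latent=3, seed=0):
    """Hypothetical stand-in for jet features: correlated columns driven by a
    few latent factors, plus a binary label that pre-training never sees."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n, n_latent))
    mixing = rng.normal(size=(n_latent, n_features))
    x = latent @ mixing + 0.1 * rng.normal(size=(n, n_features))
    x = (x - x.mean(axis=0)) / x.std(axis=0)  # standardise each feature
    y = (latent[:, 0] > 0).astype(float)
    return x, y

def pretrain_masked(x, hidden=4, mask_frac=0.25, steps=500, lr=0.05, seed=0):
    """Unsupervised step: learn an encoder by reconstructing masked features."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w_enc = rng.normal(scale=0.1, size=(d, hidden))
    w_dec = rng.normal(scale=0.1, size=(hidden, d))
    losses = []
    for _ in range(steps):
        mask = rng.random((n, d)) < mask_frac   # True where a feature is hidden
        n_masked = max(int(mask.sum()), 1)
        x_in = np.where(mask, 0.0, x)           # masked entries are zeroed out
        h = x_in @ w_enc
        err = (h @ w_dec - x) * mask            # score only the masked entries
        losses.append(float((err ** 2).sum() / n_masked))
        g_out = 2.0 * err / n_masked            # gradient of the mean squared error
        g_dec = h.T @ g_out
        g_enc = x_in.T @ (g_out @ w_dec.T)
        w_dec -= lr * g_dec
        w_enc -= lr * g_enc
    return w_enc, losses

def finetune_head(x, y, w_enc, steps=500, lr=0.1):
    """Supervised step: logistic-regression head on the frozen encoder output."""
    h = x @ w_enc
    w = np.zeros(h.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(h @ w + b)))
        w -= lr * (h.T @ (p - y)) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b

x, y = make_toy_jets(2000)
w_enc, losses = pretrain_masked(x)              # labels are not used here
w, b = finetune_head(x[:200], y[:200], w_enc)   # small labelled subset only
```

In a realistic setting the pre-training sample would be unlabelled collision data rather than simulation, which is what would decouple the learned representation from data/MC disagreement; the toy generator here only mimics the workflow.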
CERN group / Experiment
EP-CMG
| Working area | Area 2: Optimal AI deployment for Online Data Processing |
|---|---|
| Project goals | Redefine how events are reconstructed in HEP |
| Timeline | 3 years |
| Available person power | 0 |
| Additional person power request | 3 PhD students, 2 fellows |
| Is this an already ongoing activity? | No |
| Indicative hardware resource needs | O(TB) of fast storage (SSD) and O(10-100) H100 or A100 GPUs, with at least 6 months of training time per year |
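
As a rough reading of the hardware request, the compute budget can be expressed in GPU-hours. The sketch below rests on assumptions that are not stated in the table: the O(10-100) range is treated as hard bounds of 10 and 100 GPUs, the 6 months of training per year are taken as continuous use over half of a 365-day year, and the request is assumed to hold for the full 3-year timeline. The helper name `gpu_hours` is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years

def gpu_hours(n_gpus, months_per_year=6, years=3):
    """GPU-hours, assuming continuous use during the training months."""
    return n_gpus * HOURS_PER_YEAR * months_per_year / 12 * years

low, high = gpu_hours(10), gpu_hours(100)
print(low, high)  # 131400.0 1314000.0
```

Under these assumptions the request corresponds to roughly 0.13 to 1.3 million GPU-hours over the project lifetime, or about 44,000 to 438,000 GPU-hours per year.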