Speaker
Description
A lot of attention and care is dedicated to the code and calibrations used in official data-processing campaigns of experiments, such as event generation, simulation, reconstruction, or derivation. The same level of care should be dedicated to trained ML models deployed as part of these data-processing steps. Such models should be easily findable, documented, versioned, and reproducible: Where does this model come from? How was it trained? On which datasets? All these questions ought to be easily answerable to mitigate the risk of jeopardising the data processing, for example by being unable to properly re-train a model critical for an analysis (e.g. tau ID).
We have the opportunity to provide an experiment-agnostic bookkeeping system for trained inference models, combining technologies for code distribution, data management, and generic metadata catalogues.
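To make the questions above concrete, the sketch below shows the kind of metadata record such a catalogue could keep per trained model. This is purely illustrative: the class, field names, and example values are assumptions, not part of any existing experiment system.

```python
from dataclasses import dataclass, asdict
import hashlib
import json

# Hypothetical sketch of a per-model catalogue entry; every field name
# and example value below is illustrative, not an existing schema.
@dataclass
class ModelRecord:
    name: str                # e.g. "tau-id-classifier"
    version: str             # campaign- or semver-based model version
    training_datasets: list  # identifiers of the datasets used for training
    code_repository: str     # URL of the training code
    code_commit: str         # exact commit the model was trained from
    framework: str           # e.g. "PyTorch 2.3"
    artifact_checksum: str   # checksum of the serialized model file

    def catalogue_key(self) -> str:
        """Deterministic key for findability, derived from name/version/commit."""
        payload = f"{self.name}:{self.version}:{self.code_commit}"
        return hashlib.sha256(payload.encode()).hexdigest()[:16]

record = ModelRecord(
    name="tau-id-classifier",
    version="2024.1",
    training_datasets=["mc-campaign-A.12345"],
    code_repository="https://example.org/ml/tau-id",
    code_commit="a1b2c3d",
    framework="PyTorch 2.3",
    artifact_checksum="sha256:0123abcd",
)
print(json.dumps(asdict(record), indent=2))
```

A record like this answers "where does the model come from, how was it trained, and on what data" in one findable, versioned entry; the deterministic key makes re-ingestion of the same model idempotent.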
CERN group / Experiment
EP-SFT
| Working area | Area 5: Infrastructure for AI Deployment |
|---|---|
| Project goals | A common solution for the bookkeeping of ML models and their training metadata should be provided to the LHC experiments, to reinforce the sustainability and reproducibility of their ML-based workflows. |
| Timeline | 3 years. Y1: collect information about existing ML-model and metadata bookkeeping systems in the LHC experiments, and potentially in other initiatives, capturing commonalities. Y2: elaborate a design for the common product, discussing the necessary underlying infrastructure with IT colleagues; provide a first demonstrator and test it by migrating content from the existing systems, while distilling tools to automate the eventual real migration. Y3: manage the migration of any existing model bookkeeping systems, after agreeing on a sensible timeline with the stakeholders. |
| Available person power | 2 staff, 1 graduate |
| Additional person power request | 1 graduate |
| Is this an already ongoing activity? | No |