15–19 Sept 2025
CERN
Europe/Zurich timezone

An experiment-agnostic bookkeeping system for trained inference models

16 Sept 2025, 11:55
5m
40/S2-A01 - Salle Anderson (CERN)

5. Infrastructure for AI Deployment

Speaker

Danilo Piparo (CERN)

Description

Considerable care goes into the code and calibrations used for the official data processing campaigns of experiments, such as event generation, simulation, reconstruction, and derivation. The same level of care should be given to the trained ML models deployed as part of those processing steps. Such models should be easily findable, documented, versioned, and reproducible: where does this model come from? How was it trained? On which datasets? These questions should be easy to answer, in order to mitigate risks to the data processing, for example being unable to properly re-train a model critical for analysis (e.g. tau ID).
We have the opportunity to provide an experiment-agnostic bookkeeping system for trained inference models, combining technologies such as code distribution, data management, and the design of generic metadata catalogues.
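To make the questions above concrete, the following is a minimal sketch of what a catalogue entry for a trained model might record. It is an illustration only, not the design of the proposed system: the names `ModelRecord`, `Catalogue`, `sha256_of`, all field names, and the example values are hypothetical, and an in-memory dictionary stands in for a real metadata catalogue.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical catalogue entry for one version of a trained model."""
    name: str                            # which model is this?
    version: str                         # which version of it?
    weights_sha256: str                  # checksum of the serialised weights
    training_code_ref: str               # how was it trained? (code reference)
    training_datasets: tuple[str, ...]   # on what datasets?
    description: str = ""                # free-text documentation


def sha256_of(payload: bytes) -> str:
    """Return the hex SHA-256 digest of the serialised model."""
    return hashlib.sha256(payload).hexdigest()


class Catalogue:
    """In-memory stand-in for a metadata catalogue (assumption for this sketch)."""

    def __init__(self) -> None:
        self._records: dict[tuple[str, str], ModelRecord] = {}

    def register(self, record: ModelRecord) -> None:
        # A (name, version) pair is immutable once registered.
        key = (record.name, record.version)
        if key in self._records:
            raise ValueError(f"{record.name} {record.version} is already registered")
        self._records[key] = record

    def lookup(self, name: str, version: str) -> ModelRecord:
        return self._records[(name, version)]

    def verify(self, name: str, version: str, payload: bytes) -> bool:
        # Check that the deployed file matches what was registered.
        return self.lookup(name, version).weights_sha256 == sha256_of(payload)


# Usage with placeholder values (all identifiers below are made up).
weights = b"placeholder for serialised model weights"
catalogue = Catalogue()
catalogue.register(
    ModelRecord(
        name="tau_id",
        version="1.0",
        weights_sha256=sha256_of(weights),
        training_code_ref="hypothetical-repository@commit",
        training_datasets=("dataset_A", "dataset_B"),
        description="Placeholder entry for illustration.",
    )
)
print(catalogue.lookup("tau_id", "1.0").training_datasets)
print(catalogue.verify("tau_id", "1.0", weights))
```

Keeping a checksum next to the training code reference and dataset identifiers is one possible way to tie a deployed file back to its provenance; a real system would also need persistent storage and access control, which are outside the scope of this sketch.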

CERN group/Experiment: EP-SFT

Working area: Area 5: Infrastructure for AI Deployment
Project goals: A common solution for the bookkeeping of ML models and information about their training should be provided to LHC experiments, to reinforce the sustainability and reproducibility of their workflows involving ML approaches.
Timeline: 3 years
Y1: Collect information about existing ML models and metadata bookkeeping systems in the LHC experiments, and potentially in other initiatives, capturing commonalities.
Y2: Elaborate a design for the common product, discussing the necessary underlying infrastructure with IT colleagues. Provide a first demonstrator and test it by migrating the content of the existing systems, while distilling tools to automate the future real migration.
Y3: Manage the migration of the existing model bookkeeping systems, if any, after agreeing on a sensible timeline with the stakeholders.
Available person power: 2 staff, 1 grad
Additional person power request: 1 grad
Is this an already ongoing activity? No
