Speaker
Amine Lahouel
(CERN)
Description
Storing and versioning models, especially in large numbers (1k to 10k+), requires specialized services. Traceability of published models back to their training executions and parameterization is essential for trust and reproducibility.
This initiative will build on the ongoing NGT effort to offer a centralized MLflow instance, extend it to the whole CERN community, and automate record keeping. It relies on the existing CERN MLOps platform and tools, extending them with the additional metadata required and integrating tools such as DVC where appropriate for improved version control.
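To make the traceability requirement concrete, the following is a minimal sketch of the kind of record that could link a published model version to its training execution and parameters. It uses only the Python standard library and is not the MLflow, DVC, or CERN MLOps API; the names `ModelRecord`, `make_record`, `verify`, and `to_json`, as well as the field layout, are hypothetical and chosen for illustration only.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical metadata tying a model version to its training run."""
    model_name: str
    version: int
    run_id: str            # identifier of the training execution (assumed format)
    params: dict           # parameterization used for that execution
    artifact_sha256: str   # content hash of the stored model artifact


def make_record(model_name, version, run_id, params, artifact_bytes):
    """Build a record whose hash pins the exact artifact that was published."""
    digest = hashlib.sha256(artifact_bytes).hexdigest()
    return ModelRecord(model_name, version, run_id, dict(params), digest)


def verify(record, artifact_bytes):
    """Check that an artifact is byte-identical to the one the record describes."""
    return hashlib.sha256(artifact_bytes).hexdigest() == record.artifact_sha256


def to_json(record):
    """Serialize the record deterministically so it can be archived or diffed."""
    return json.dumps(asdict(record), sort_keys=True)


if __name__ == "__main__":
    # Illustrative values only; no real model or run is referenced.
    record = make_record("demo-model", 1, "run-0001", {"lr": 0.01}, b"weights")
    print(to_json(record))
    print(verify(record, b"weights"))
```

In a real deployment, a model registry would be expected to hold this information; the sketch only shows why storing the run identifier, the parameters, and an artifact hash together lets a published model be traced back and checked.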
CERN group/ Experiment
IT-CD-PI
| Working area | Area 5: Infrastructure for AI Deployment |
|---|---|
| Project goals | Establish a centralized service for model storage, versioning and AI reproducibility |
| Timeline | 6 |
| Available person power | 0.3 STAFF (limited contract, 3 years NGT) |
| Additional person power request | 0.3 STAFF (extending existing contract and post NGT) |
| Is this an already ongoing activity? | Yes |