Description
This proposal outlines a structured R&D programme to develop a standardised test-bed infrastructure for evaluating heterogeneous hardware solutions for Machine Learning (ML) model deployment in High Energy Physics (HEP) Trigger and Data Acquisition (TDAQ) systems, and to use it to evaluate the available hardware acceleration options. The test-bed will support benchmarking of co-processors, including Graphics Processing Units (GPUs), Field Programmable Gate Arrays (FPGAs) and more exotic accelerators (Analog-AI, AI-Engines, etc.), within containerised environments that abstract platform-specific dependencies and ensure reproducibility. A metadata-driven evaluation framework will be implemented to extract key performance indicators such as latency, power consumption, and physics impact. Integration with HEP experiment simulation and triggering pipelines will allow context-specific performance assessments. The outcomes will provide data-driven guidance for the design and deployment of future TDAQ systems for the High-Luminosity LHC (HL-LHC) and beyond.
CERN group / Experiment
CERN ATLAS Team
| Working area | Area 5: Infrastructure for AI Deployment |
|---|---|
| If Other, please specify | Evaluation of hardware |
| Project goals | - Development of dedicated distributed computing environments (e.g. Docker/Apptainer containers) for accessing different acceleration hardware. - Building a metadata software framework for extracting parameters of interest from ML model development and deployment. - Definition of comparison metrics across the different acceleration methods. - Integration of the environments within the HEP experiment simulation frameworks, for accurate estimation of model performance in terms of physics sensitivity. - Publications evaluating the different hardware options and recommending the best environment for deployment within the experiments (e.g. for triggering or offline processing). |
| Timeline | Year 1: Hardware survey and environment abstraction; Year 2: Metrics framework and hardware prototyping; Year 3: Evaluation, optimisation and recommendation |
| Available person power | 0.4 FTE |
| Additional person power request | 36 GRAP months, 36 TECH months |
| Is this an already ongoing activity? | No |
| Indicative hardware resource needs | Access to distributed computing resources and new hardware acceleration technologies via OpenLab or other sources |
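
As an illustration of the metadata-driven evaluation framework described above, the following is a minimal sketch of how one benchmark run could be recorded and reduced to latency indicators. It is not part of the proposal itself: the names `BenchmarkRecord` and `measure_latency`, the field names, and the example container tag are all hypothetical, and only wall-clock latency is covered. Power consumption and physics impact would need platform-specific and experiment-specific tooling that is not assumed here.

```python
import json
import math
import statistics
import time
from dataclasses import asdict, dataclass, field
from typing import Callable, Dict, List


@dataclass
class BenchmarkRecord:
    """Hypothetical metadata record for one model/hardware benchmark run."""

    model_name: str
    hardware: str          # illustrative label, e.g. "cpu", "gpu", "fpga"
    container_image: str   # container the run was executed in
    latencies_s: List[float] = field(default_factory=list)

    def summary(self) -> Dict[str, float]:
        """Reduce the raw timings to a few latency indicators."""
        if not self.latencies_s:
            raise ValueError("no latency samples recorded")
        ordered = sorted(self.latencies_s)
        # Nearest-rank estimate of the 99th percentile.
        idx = max(0, math.ceil(0.99 * len(ordered)) - 1)
        return {
            "n_runs": len(ordered),
            "median_s": statistics.median(ordered),
            "p99_s": ordered[idx],
        }

    def to_json(self) -> str:
        """Serialise the full record so that runs can be compared later."""
        return json.dumps(asdict(self))


def measure_latency(
    infer: Callable[[], object], n_runs: int = 100, n_warmup: int = 10
) -> List[float]:
    """Time repeated calls to `infer`, discarding warm-up iterations."""
    for _ in range(n_warmup):
        infer()
    timings = []
    for _ in range(n_runs):
        start = time.perf_counter()
        infer()
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Stand-in workload; a real run would call the deployed model instead.
    def toy_inference() -> int:
        return sum(i * i for i in range(1000))

    record = BenchmarkRecord(
        model_name="toy-model",
        hardware="cpu",
        container_image="example/image:tag",  # placeholder, not a real image
        latencies_s=measure_latency(toy_inference, n_runs=50),
    )
    print(record.summary())
```

Keeping the raw timings in the record, rather than only the summary, would allow comparison metrics to be redefined later without repeating the runs.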