May 25 – 29, 2026
Chulalongkorn University
Asia/Bangkok timezone

Leveraging Inference as a Service technology for executing ML models by the Derived AOD production applications of the ATLAS experiment

May 28, 2026, 4:33 PM
18m
Chulalongkorn University

Oral Presentation Track 3 - Offline data processing

Speaker

Vakho Tsulaia (Lawrence Berkeley National Lab. (US))

Description

To address this challenge and prepare for the transition to large, resource-intensive ML models, we propose leveraging AthenaTriton for DAOD production, where these ML models are executed on dedicated computing resources. AthenaTriton is a tool for running ML inference as a service in Athena using the NVIDIA Triton server software.

We discuss different deployment strategies for Triton servers across heterogeneous computing platforms, including WLCG sites and High Performance Computing centers. We present measurements of several performance metrics, including network transfer rate and latency, as well as event processing throughput. Finally, we evaluate the scalability of the AthenaTriton approach as a function of computing resources. This enables data-driven optimization of future DAOD workflows and supports sustainable, efficient large-scale ML inference across the evolving ATLAS computing infrastructure, which will increasingly rely on shared computing resources such as those provided by the American Science Cloud.
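As a rough illustration of how the metrics named above (network latency, transfer rate, and event processing throughput) relate to one another, the following is a minimal sketch of a toy cost model for a single client sending batched requests to a remote inference server. It is not AthenaTriton code and does not reflect any measurement from this contribution: the class `RemoteInferenceModel`, its field names, and every numeric value are hypothetical, and the model assumes that requests are sent back to back with no overlap between transfer and compute.

```python
from dataclasses import dataclass


@dataclass
class RemoteInferenceModel:
    """Toy cost model for one batched inference request (all values hypothetical)."""

    round_trip_latency_s: float     # fixed network round-trip time per request
    transfer_rate_bytes_s: float    # sustained network transfer rate
    bytes_per_event: float          # input plus output payload per event
    server_time_per_event_s: float  # model execution time per event on the server

    def request_time_s(self, batch_size: int) -> float:
        """Wall time for one request carrying `batch_size` events."""
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        transfer = batch_size * self.bytes_per_event / self.transfer_rate_bytes_s
        compute = batch_size * self.server_time_per_event_s
        return self.round_trip_latency_s + transfer + compute

    def throughput_events_s(self, batch_size: int) -> float:
        """Events per second for a single client sending requests back to back."""
        return batch_size / self.request_time_s(batch_size)


if __name__ == "__main__":
    # Illustrative numbers only; they are assumptions, not measured values.
    model = RemoteInferenceModel(
        round_trip_latency_s=0.010,
        transfer_rate_bytes_s=1.0e9,
        bytes_per_event=1.0e5,
        server_time_per_event_s=0.001,
    )
    for batch in (1, 10, 100):
        print(batch, round(model.throughput_events_s(batch), 1))
```

In this simplified model, larger batches amortize the fixed round-trip latency, so throughput rises with batch size and approaches a ceiling set by the per-event transfer and compute costs. A real deployment would also need to account for server-side queuing, concurrent clients, and serialization overhead.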

Authors

Fengping Hu (Purdue University (US))
Dr Giordon Holtsberg Stark (University of California, Santa Cruz (US))
Vakho Tsulaia (Lawrence Berkeley National Lab. (US))
Xiangyang Ju (Lawrence Berkeley National Lab. (US))
