19–25 Oct 2024
Europe/Zurich timezone

AthenaTriton: A Tool for running Machine Learning Inference as a Service in Athena

21 Oct 2024, 13:48
18m
Room 1.B (Medium Hall B)

Room 1.B (Medium Hall B)

Talk Track 3 - Offline Computing Parallel (Track 3)

Speaker

Yuan-Tang Chou (University of Washington (US))

Description

Machine Learning (ML)-based algorithms play increasingly important roles in almost all aspects of the data analyses in ATLAS. Diverse ML models are used in detector simulations, event reconstructions, and data analyses. They are being deployed in the ATLAS software framework, Athena. The primary approach to perform ML inference in Athena is to use the ONNXRuntime. However, some ML models could not be converted to ONNXRuntime because certain ML operations, such as the MultiAggregation in pyG as of writing, are not supported. Furthermore, a scalable inference strategy that maximises the event processing throughput is needed to cope with the ever-increasing simulation and collision data. A key element in that strategy should be enabling these ML algorithms to run on coprocessors like GPUs because not all computing sites have coprocessors. To that end, we introduce AthenaTriton, a tool that runs ML inference as a service based on the NVIDIA Triton Inference Server. With AthenaTriton, we give Athena the capability to act as a Triton client that sends requests to a remote or local server that performs the model inference. We will present the AthenaTriton design and its scalability in running ML-based algorithms. We emphasise that AthenaTriton can be used in both online and offline computing.

Primary authors

Beojan Stanislaus (Lawrence Berkeley National Lab. (US)) Dr Charles Leggett (Lawrence Berkeley National Lab (US)) Haoran Zhao (University of Washington (US)) Julien Esseiva (Lawrence Berkeley National Lab. (US)) Paolo Calafiura (Lawrence Berkeley National Lab. (US)) Shih-Chieh Hsu (University of Washington Seattle (US)) Vakho Tsulaia (Lawrence Berkeley National Lab. (US)) Xiangyang Ju (Lawrence Berkeley National Lab. (US)) Yuan-Tang Chou (University of Washington (US))

Presentation materials