15–18 Oct 2024
Purdue University
America/Indiana/Indianapolis timezone

IceSONIC - Network AI Inference on Coprocessors for IceCube Offline Processing

15 Oct 2024, 15:40
5m
Steward Center 306 (Third floor), Purdue University

128 Memorial Mall Dr, West Lafayette, IN 47907
Lightning 5 min talk + poster (Lightning talks)

Speaker

Benedikt Riedel

Description

An Artificial Intelligence (AI) model will spend “90% of its lifetime in inference.” Fully utilizing coprocessors, such as FPGAs or GPUs, for AI inference requires O(10) CPU cores to feed work to the coprocessors. Traditional data analysis pipelines cannot use the coprocessors effectively and efficiently to their full potential. To allow distributed access to coprocessors for AI inference workloads, the LHC’s Compact Muon Solenoid (CMS) experiment developed the concept of Services for Optimized Network Inference on Coprocessors (SONIC) using NVIDIA’s Triton Inference Servers. We have extended this concept to the IceCube Neutrino Observatory by deploying NVIDIA Triton Inference Servers in local and external Kubernetes clusters, integrating an NVIDIA Triton client with IceCube’s data analysis framework, and deploying an OAuth2-based HTTP authentication service in front of the Triton Inference Servers. We will describe the setup and our experience adding it to IceCube’s offline processing system.
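
The abstract does not include implementation details. As a rough illustration of the client side of this architecture, the Python sketch below shows what a remote inference call through NVIDIA's Triton HTTP client with an OAuth2 bearer token might look like; the server URL, token source, model name, and tensor names are hypothetical placeholders, not IceCube's actual configuration.

    # Hypothetical sketch: a Triton HTTP inference call behind an OAuth2
    # authentication layer, loosely following the setup described above.
    import numpy as np
    import tritonclient.http as httpclient

    # Token acquisition is deployment-specific; assume one is available,
    # e.g. from an OAuth2 client-credentials flow.
    token = "..."

    # Placeholder Triton endpoint.
    client = httpclient.InferenceServerClient(url="triton.example.org:8000")

    # Prepare a single FP32 input tensor (names and shapes are model-specific).
    data = np.random.rand(1, 128).astype(np.float32)
    infer_input = httpclient.InferInput("INPUT__0", list(data.shape), "FP32")
    infer_input.set_data_from_numpy(data)

    # Pass the bearer token as an HTTP header so the authentication
    # service in front of Triton accepts the request.
    result = client.infer(
        model_name="example_model",
        inputs=[infer_input],
        outputs=[httpclient.InferRequestedOutput("OUTPUT__0")],
        headers={"Authorization": f"Bearer {token}"},
    )
    print(result.as_numpy("OUTPUT__0"))

Because the call is plain HTTP, many CPU-only clients can share a small pool of GPU- or FPGA-backed Triton servers, which is the load-balancing idea behind SONIC.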

Focus areas: MMA
