
19–25 Oct 2024
Europe/Zurich timezone

IceSONIC - Network AI Inference on Coprocessors for IceCube Offline Processing

23 Oct 2024, 15:18
57m
Exhibition Hall

Poster | Track 4 - Distributed Computing | Poster session

Speaker

Benedikt Riedel

Description

An Artificial Intelligence (AI) model will spend “90% of its lifetime in inference.” Fully utilizing coprocessors, such as FPGAs or GPUs, for AI inference requires O(10) CPU cores to feed work to the coprocessors, and traditional data analysis pipelines cannot effectively and efficiently use the coprocessors to their full potential. To allow distributed access to coprocessors for AI inference workloads, the LHC’s Compact Muon Solenoid (CMS) experiment developed the concept of Services for Optimized Network Inference on Coprocessors (SONIC) using NVIDIA’s Triton Inference Servers. We have extended this concept to the IceCube Neutrino Observatory by deploying NVIDIA’s Triton Inference Servers in local and external Kubernetes clusters, integrating an NVIDIA Triton client with IceCube’s data analysis framework, and deploying an OAuth2-based HTTP authentication service in front of the Triton Inference Servers. We will describe the setup and our experience adding it to IceCube’s offline processing system.
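
For illustration, a minimal Python sketch of the client side described above: it assumes the tritonclient package, a hypothetical model name and tensor names (icecube_classifier, INPUT__0, OUTPUT__0), a placeholder server URL, and an OAuth2 access token supplied via a TOKEN environment variable. The actual integration with IceCube's data analysis framework may differ.

import os
import numpy as np
import tritonclient.http as httpclient

# Hypothetical values for illustration; the real model, tensor names,
# and server URL are deployment-specific.
SERVER_URL = "triton.example.org"   # Triton behind the OAuth2 auth service
MODEL_NAME = "icecube_classifier"   # hypothetical model name
TOKEN = os.environ["TOKEN"]         # OAuth2 access token (assumed env var)

# The OAuth2-based HTTP authentication service in front of Triton is
# assumed to accept a standard bearer token on every request.
headers = {"Authorization": f"Bearer {TOKEN}"}

client = httpclient.InferenceServerClient(url=SERVER_URL, ssl=True)

# Build a single inference request with one FP32 input tensor.
batch = np.random.rand(1, 128).astype(np.float32)  # placeholder event features
inputs = [httpclient.InferInput("INPUT__0", batch.shape, "FP32")]
inputs[0].set_data_from_numpy(batch)
outputs = [httpclient.InferRequestedOutput("OUTPUT__0")]

# Run inference remotely; the bearer token rides along as an HTTP header.
result = client.infer(MODEL_NAME, inputs, outputs=outputs, headers=headers)
print(result.as_numpy("OUTPUT__0"))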

Primary authors

Alec Sheperd (University of Wisconsin-Madison)
Benedikt Riedel
David Schultz (University of Wisconsin-Madison)

Presentation materials

There are no materials yet.