1–5 Sept 2025
ETH Zurich
Europe/Zurich timezone

SuperSONIC: Cloud-Native Infrastructure for ML Inferencing

2 Sept 2025, 16:40
20m
ETH Zurich

HIT E 51, Siemens Auditorium, ETH Zurich, Hönggerberg campus, 8093 Zurich, Switzerland
Standard Talk Contributed talks

Speaker

Yuan-Tang Chou (University of Washington (US))

Description

The rising computational demands of increasing data rates and complex machine learning (ML) algorithms in large-scale scientific experiments have driven the adoption of the Services for Optimized Network Inference on Coprocessors (SONIC) framework. SONIC accelerates ML inference by offloading tasks to local or remote coprocessors, optimizing resource utilization. Its portability across diverse hardware platforms improves data processing and model deployment efficiency in advanced research domains such as high-energy physics (HEP) and multi-messenger astrophysics (MMA). We developed SuperSONIC, a scalable server infrastructure for SONIC that enables the deployment of computationally intensive inference tasks, such as charged particle reconstruction, on Kubernetes clusters equipped with graphics processing units (GPUs). Leveraging NVIDIA’s Triton Inference Server, SuperSONIC decouples client workflows from server infrastructure, standardizing communication, improving throughput, and enabling robust load balancing and monitoring. SuperSONIC has been successfully deployed in production environments, including the CMS and ATLAS experiments at CERN’s Large Hadron Collider, the IceCube Neutrino Observatory, and the LIGO gravitational-wave observatory. It offers a reusable, configurable framework that addresses cloud-native challenges and enhances the efficiency of accelerator-based inference across diverse scientific and industrial applications.
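
The decoupling of client workflows from server infrastructure described above rests on a standardized request format. As a rough illustration, the sketch below builds (but does not send) an inference request in the KServe v2 style that Triton Inference Server accepts on its HTTP/REST endpoint. It is not the SONIC or SuperSONIC client code: the server URL, model name, tensor names, and the helper `build_infer_request` are all hypothetical, and production clients may well use a different transport or the official Triton client libraries.

```python
import json
from urllib import request


def build_infer_request(base_url, model_name, input_name, shape, values, output_name):
    """Build (but do not send) a KServe-v2-style inference request.

    Hypothetical helper for illustration; not part of SONIC or SuperSONIC.
    """
    # Check that the flat value list matches the declared tensor shape.
    expected = 1
    for dim in shape:
        expected *= dim
    if len(values) != expected:
        raise ValueError(f"shape {shape} needs {expected} values, got {len(values)}")

    # One FP32 input tensor, sent as flat JSON data, and one requested output.
    payload = {
        "inputs": [
            {
                "name": input_name,
                "shape": list(shape),
                "datatype": "FP32",
                "data": [float(v) for v in values],
            }
        ],
        "outputs": [{"name": output_name}],
    }
    url = f"{base_url.rstrip('/')}/v2/models/{model_name}/infer"
    return request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed endpoint, model, and tensor names; replace with real ones.
req = build_infer_request(
    "http://supersonic.example.org:8000",
    "particle_tracker",
    "INPUT__0",
    (1, 4),
    [0.1, 0.2, 0.3, 0.4],
    "OUTPUT__0",
)
# Sending would be request.urlopen(req) against a reachable server.
```

Because the client only needs a URL and a model name, the same request can be routed to a local GPU, a remote cluster, or a load balancer in front of many servers without changing the client workflow.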

Authors

Dmitry Kondratyev (Purdue University (US))
Benedikt Riedel
Yuan-Tang Chou (University of Washington (US))
Miles Cochran-Branson (University of Washington (US))
Noah Paladino (Massachusetts Inst. of Technology (US))
David Schultz (University of Wisconsin-Madison)
Miaoyuan Liu (Purdue University (US))
Javier Mauricio Duarte (Univ. of California San Diego (US))
Philip Coleman Harris (Massachusetts Inst. of Technology (US))
Shih-Chieh Hsu (University of Washington Seattle (US))
