9–13 May 2022
CERN
Europe/Zurich timezone

Optimized Deep Learning Inference on High Level Trigger at the LHC: Computing time and Resource assessment

12 May 2022, 16:20
15m
4/3-006 - TH Conference Room (CERN)

Short talk (Workshop)

Speaker

Syed Anwar Ul Hasan (Universita & INFN Pisa (IT))

Description

We present a study of the latency and resource requirements of deep learning algorithms running on a typical High Level Trigger (HLT) computing farm at a high-pT LHC experiment at CERN. As benchmarks, we consider convolutional and graph autoencoders developed to perform real-time anomaly detection on all events entering the HLT stage. The benchmark dataset consists of synthetic multijet events simulated at a center-of-mass energy of 13 TeV. With a next-generation heterogeneous computing farm equipped with GPUs in mind, we consider both optimized CPU and GPU inference, using hardware-specific optimization tools to meet the constraints of real-time processing at the LHC: ONNX Runtime for CPUs and NVIDIA TensorRT for GPUs. We observe O(ms) latency across different event batch sizes for both CPU- and GPU-based model inference, with the maximal gain at a batch size of 1 (corresponding to the typical use case of event-parallelized HLT farms). We show that these optimized workflows offer significant savings with respect to the native solutions (TensorFlow 2 and Keras), both in processing time and in computing resources.
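The benchmarking pattern described above (per-event latency measured across batch sizes, with batch size 1 as the HLT-relevant case) can be sketched as follows. This is a minimal illustration, not the authors' code: `toy_autoencoder` is a hypothetical stand-in for the exported network, and in the actual workflow the call would go through, e.g., an `onnxruntime.InferenceSession` on CPU or a TensorRT engine on GPU.

```python
import time
import numpy as np

def toy_autoencoder(batch: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for an exported autoencoder.

    In the real workflow this call would be replaced by optimized
    inference, e.g. onnxruntime.InferenceSession("model.onnx").run(...).
    """
    # A linear "encode/decode" pass as a placeholder for the network.
    w = np.ones((batch.shape[1], batch.shape[1]), dtype=np.float32)
    return (batch @ w) @ w.T

def per_event_latency(batch_size: int, n_features: int = 64,
                      n_trials: int = 50) -> float:
    """Average per-event inference latency in seconds at a given batch size."""
    batch = np.random.rand(batch_size, n_features).astype(np.float32)
    toy_autoencoder(batch)  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(n_trials):
        toy_autoencoder(batch)
    elapsed = time.perf_counter() - start
    return elapsed / (n_trials * batch_size)

# Scan batch sizes; batch size 1 matches an event-parallelized HLT farm.
latencies = {bs: per_event_latency(bs) for bs in (1, 16, 256)}
for bs, lat in latencies.items():
    print(f"batch={bs:4d}  per-event latency = {lat * 1e3:.4f} ms")
```

The warm-up call matters in practice: optimized runtimes such as ONNX Runtime and TensorRT typically do lazy initialization or kernel selection on the first invocation, which would otherwise bias the measured latency.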

Primary authors

Ms Kinga Anna Wozniak (CERN and University of Vienna)
Dr Maurizio Pierini (CERN)
Mr Pratik Jawahar (Worcester Polytechnic Institute and CERN)
Syed Anwar Ul Hasan (Universita & INFN Pisa (IT))
Ms Nadezda Chernyavskaya (CERN)
