ACAT 2024

Name: ACAT 2024
Start: 2024-03-11T08:00:00-04:00
End: 2024-03-15T14:30:00-04:00
Location: Charles B. Wang Center, Stony Brook University

US/Eastern timezone

Contact

acat-loc2024@cern.ch

A Mechanism for Asynchronous Offloading in the Multithreaded Gaudi Event Processing Framework

13 Mar 2024, 15:10

20m

Theatre (Charles B. Wang Center, Stony Brook University)

100 Circle Rd, Stony Brook, NY 11794

Oral, Track 1: Computing Technology for Physics Research

Beojan Stanislaus (Lawrence Berkeley National Lab. (US))

High Performance Computing resources are increasingly prominent in the plans of funding agencies, and these resources now tend to rely primarily on accelerators such as GPUs for the majority of their FLOPS. As a result, High Energy Physics experiments must make maximum use of these accelerators in their processing pipelines to use the available resources efficiently.

The ATLAS and LHCb experiments share a common data processing architecture called Gaudi. In Gaudi, data processing workloads are ultimately split into units called Algorithms, and Gaudi uses a smart scheduler (the Avalanche scheduler) to schedule these Algorithms on a fixed pool of CPU threads managed by Intel’s TBB.

This architecture efficiently fills the available CPU capacity, provided the algorithms are primarily CPU-limited. However, when algorithms offload a large portion of their computational work to GPUs, they can be left blocking a CPU thread while they wait for the GPU to finish, wasting precious core-time.
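The blocking problem can be illustrated with a small Python sketch. It is not Gaudi code: `blocking_gpu_algorithm`, `run_on_fixed_pool` and `GPU_WAIT_S` are hypothetical names, a `ThreadPoolExecutor` stands in for the fixed TBB thread pool, and `time.sleep` stands in for a synchronous wait on the GPU. Under those assumptions, four offloading algorithms on a pool of two threads need two full rounds of waiting, even though the CPU threads do no useful work in the meantime.

```python
import time
from concurrent.futures import ThreadPoolExecutor

GPU_WAIT_S = 0.05  # assumed (hypothetical) time the GPU needs per algorithm


def blocking_gpu_algorithm(name):
    """Stand-in for an Algorithm that offloads work and waits synchronously."""
    # time.sleep models a blocking wait for the device to finish;
    # the worker thread can do nothing else for the duration.
    time.sleep(GPU_WAIT_S)
    return name


def run_on_fixed_pool(names, n_threads):
    """Run all algorithms on a fixed-size pool; return (results, wall-clock seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = list(pool.map(blocking_gpu_algorithm, names))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_on_fixed_pool(["A", "B", "C", "D"], n_threads=2)
    # With 2 threads and 4 blocking algorithms, the wall time is at least
    # two GPU waits, although the CPU threads were idle throughout.
    print(results, f"{elapsed:.2f} s")
```

In this toy model the pool size, not the amount of CPU work, sets the throughput, which is the waste of core-time described above.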

Here we present a prototype of an addition to this scheduler, which places such GPU-accelerated algorithms on a separate pool of dedicated threads. By making use of lightweight Boost Fibers, and the ability to suspend these fibers without suspending the underlying OS thread, we can run the GPU workload asynchronously, without blocking the thread. This allows more efficient use of the CPU resources. Where the work offloaded by a single Algorithm doesn’t fill the available GPU resources, it can also improve GPU efficiency by making use of separate CUDA streams.
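The idea of suspending a lightweight execution context instead of the OS thread can be sketched by analogy in Python. This is not the prototype's implementation and does not use Boost Fibers or CUDA: `asyncio` tasks stand in for fibers, `asyncio.sleep` stands in for waiting on GPU work, and `async_gpu_algorithm`, `run_all` and `GPU_WAIT_S` are hypothetical names chosen for illustration.

```python
import asyncio
import threading

GPU_WAIT_S = 0.05  # assumed (hypothetical) time the GPU needs per algorithm


async def async_gpu_algorithm(name, events):
    """Stand-in for a GPU-accelerated Algorithm running in a suspendable context."""
    events.append(("launch", name, threading.get_ident()))
    # Awaiting suspends only this task (the analogue of a fiber);
    # the OS thread stays free to run the other tasks meanwhile.
    await asyncio.sleep(GPU_WAIT_S)
    events.append(("done", name, threading.get_ident()))
    return name


async def run_all(names):
    """Start every algorithm, then collect results in submission order."""
    events = []
    results = await asyncio.gather(*(async_gpu_algorithm(n, events) for n in names))
    return results, events


if __name__ == "__main__":
    results, events = asyncio.run(run_all(["A", "B", "C", "D"]))
    for event in events:
        print(event)
```

In this sketch all four algorithms launch their simulated GPU work before any of them completes, and every event is recorded on the same OS thread. Overlapping launches of this kind are also what would allow independent algorithms to use separate CUDA streams in the real system.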

Significance

This work presents an addition to the Gaudi Avalanche scheduler that enables it to handle GPU-accelerated algorithms in a CPU-efficient manner.

Experiment context, if any: ATLAS, LHCb

Beojan Stanislaus (Lawrence Berkeley National Lab. (US))
Dr Charles Leggett (Lawrence Berkeley National Lab. (US))
Julien Esseiva (Lawrence Berkeley National Lab. (US))
Paolo Calafiura (Lawrence Berkeley National Lab. (US))
Vakho Tsulaia (Lawrence Berkeley National Lab. (US))
Xiangyang Ju (Lawrence Berkeley National Lab. (US))

AsyncAlg.pdf

acc-alg.pdf
