19–25 Oct 2024
Europe/Zurich timezone

HEPCloud Facility Operations at Fermilab—The First Six Years

24 Oct 2024, 17:27
18m
Room 2.B (Conference Room)

Talk Track 4 - Distributed Computing Parallel (Track 4)

Speaker

Nick Smith (Fermi National Accelerator Lab. (US))

Description

The HEPCloud Facility at Fermilab has now been in operation for six years. It provides a unified provisioning gateway to high-performance computing centers, including NERSC, OLCF, and ALCF, to other large supercomputers run by the NSF, and to commercial clouds. HEPCloud delivers hundreds of millions of core-hours per year for CMS and also serves other Fermilab experiments, including DUNE, Mu2e, Muon g-2, and NOvA. In this paper we present the practical considerations of operating a distributed facility such as HEPCloud. We also mention some of the research and development that HEPCloud has been used for, including GPU-based machine learning inference servers and tests of quantum computing.

Primary authors

Kyle Knoepfel (Fermi National Accelerator Laboratory), Nick Smith (Fermi National Accelerator Lab. (US)), Steven Timm (Fermi National Accelerator Lab. (US))
