Description
HammerCloud (HC) is a framework for testing and benchmarking resources of the Worldwide LHC Computing Grid (WLCG). It tests the computing resources and the various components of distributed systems with workloads ranging from very simple functional tests to full-chain experiment workflows. This contribution concentrates on the ATLAS implementation, which makes extensive use of HC for monitoring global resources and which has additionally implemented a mechanism to automatically exclude resources if certain critical tests fail. The auto-exclusion mechanism saves resources by avoiding sending computationally intensive jobs to non-functioning clusters.
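The auto-exclusion idea can be illustrated with a toy sketch. This is not HammerCloud's actual implementation; the class name, the threshold value, and the rule "exclude after N consecutive critical test failures" are all hypothetical assumptions chosen for illustration:

```python
from dataclasses import dataclass

# Hypothetical threshold: consecutive critical-test failures before exclusion.
CRITICAL_FAILURE_THRESHOLD = 3

@dataclass
class SiteStatus:
    name: str
    recent_results: list  # chronological; True = test job succeeded

def should_exclude(site: SiteStatus,
                   threshold: int = CRITICAL_FAILURE_THRESHOLD) -> bool:
    """Exclude a site once its last `threshold` critical test jobs all failed."""
    tail = site.recent_results[-threshold:]
    return len(tail) == threshold and not any(tail)

# A site whose last three critical tests failed would be excluded:
print(should_exclude(SiteStatus("EXAMPLE_SITE", [True, False, False, False])))  # → True
```

In a real system the decision would also weigh which test failed and how recently, but the sketch captures the core trade-off: production jobs stop flowing to a site only after repeated evidence that it is broken.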
However, in some cases errors in central components of the distributed computing system lead to mass exclusions of otherwise well-functioning resources. A new feature improves recovery after such mass-exclusion events. For the auto-exclusion mechanism to be effective and to save resources, test jobs need to be sent at a sufficient frequency, which in turn consumes resources itself. In this contribution, we give an estimate of the overall resource balance of the auto-exclusion system and explore possible optimisations.
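The resource balance mentioned above amounts to comparing the cost of the test jobs against the cost of the failed production jobs they prevent. A minimal sketch of such an estimate, with purely illustrative numbers (none of the parameter values come from the actual ATLAS deployment):

```python
def net_savings(n_sites: int,
                tests_per_site_per_hour: float,
                test_cost_cpu_h: float,
                failed_job_cost_cpu_h: float,
                failures_avoided_per_hour: float) -> float:
    """Net CPU-hours saved per hour of operation:
    wasted production work avoided minus the cost of the test jobs."""
    testing_cost = n_sites * tests_per_site_per_hour * test_cost_cpu_h
    avoided_waste = failures_avoided_per_hour * failed_job_cost_cpu_h
    return avoided_waste - testing_cost

# Illustrative values only: 150 sites tested twice per hour at 0.1 CPU-h per
# test, preventing 50 failed production jobs per hour at 4 CPU-h each.
print(net_savings(150, 2, 0.1, 4.0, 50))  # → 170.0
```

The estimate makes the optimisation question concrete: lowering the test frequency reduces the testing cost but delays exclusion, which increases the number of failed production jobs that slip through.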
Individual services and scripts have been reorganised as part of a general overhaul including containerisation, and the web interface has been given a facelift after more than 10 years of operation. This contribution summarises the work needed to get HC ready for the next decade.