Managing job-slot allocation in a multi-VO environment remains a persistent operational challenge for WLCG sites, particularly when each Virtual Organization (VO) employs distinct workload-management and scheduling behaviors. At the RAL Tier-1 (RAL-LCG2), more than a dozen VOs—including CMS, ATLAS, LHCb, and several smaller communities—compete for heterogeneous resources while relying on subtly different submission patterns, priority models, and pilot-job strategies. Ensuring that each VO receives its guaranteed share, while simultaneously enabling opportunistic exploitation of unused capacity, requires a balancing strategy that extends beyond static fair-share configurations.
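To make concrete what "beyond static fair-share configurations" can mean in an HTCondor pool, the fragment below sketches hierarchical accounting-group quotas with surplus sharing. The group names and quota fractions are purely illustrative assumptions, not RAL-LCG2's production values:

```
# Hypothetical accounting groups for three VOs (names and fractions
# are illustrative, not RAL-LCG2's actual configuration).
GROUP_NAMES = group_atlas, group_cms, group_lhcb

# Pledged fair-share fractions of the pool (dynamic quotas, summing to <= 1).
GROUP_QUOTA_DYNAMIC_group_atlas = 0.45
GROUP_QUOTA_DYNAMIC_group_cms   = 0.35
GROUP_QUOTA_DYNAMIC_group_lhcb  = 0.20

# Allow a group to exceed its quota when other groups leave capacity idle:
# this is what turns a static split into opportunistic backfill.
GROUP_ACCEPT_SURPLUS = TRUE

# Let jobs that exhaust their own group's quota renegotiate against
# unclaimed capacity.
GROUP_AUTOREGROUP = TRUE
```

With `GROUP_ACCEPT_SURPLUS` disabled this reduces to a purely static partition, which is exactly the configuration the abstract argues is insufficient when VO demand fluctuates.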
This contribution examines the complexities associated with harmonising these competing requirements, focusing on the interactions between VO-specific schedulers and site-level controls. We discuss the divergent workload characteristics of the major LHC VOs—such as CMS’s high pilot turnover, ATLAS’s multi-queue depth behaviour, and LHCb’s latency-sensitive submission logic—and how these differences influence both resource contention and backfill opportunities.
We present operational experience from the RAL Tier-1 in tuning HTCondor and ARC-CE parameters to shape multi-VO throughput, including dynamic slot-partitioning strategies, negotiator policy refinements, and queue-level throttling designed to preserve fairness while maximising utilisation. Results illustrate that achieving balanced allocation is non-trivial: naïve configurations can lead to persistent starvation, inefficient backfill, or pathological pilot cycling. The study demonstrates that sustained high efficiency in a multi-VO environment requires continuous calibration of site-level scheduling knobs, coordinated with VO-specific workload characteristics, to deliver both fair-share compliance and robust opportunistic use of spare capacity.
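As a sketch of the kinds of site-level knobs referred to above, the fragment below combines partitionable slots (for dynamic slot-partitioning), a negotiator group-sort expression biased toward under-served groups (to counter starvation), and a concurrency limit as a simple queue-level throttle. All names and numbers are hypothetical examples, not the tuned RAL values discussed in the contribution:

```
# Partitionable slots: one slot per worker node, carved dynamically into
# per-job slots so VOs with different core/memory footprints pack well.
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=100%, mem=100%, disk=100%
SLOT_TYPE_1_PARTITIONABLE = TRUE

# Negotiate first for the group furthest below its quota, so a VO whose
# pilots cycle quickly cannot starve slower-submitting communities.
# (Simplified form of HTCondor's default usage-over-quota ordering.)
GROUP_SORT_EXPR = ifThenElse(GroupQuota > 0, \
                  GroupResourcesInUse / GroupQuota, 3.4e+38)

# Queue-level throttle via a concurrency limit: at most 2000 matched jobs
# may hold the (hypothetical) "cms_pilot" token at once. Pilot jobs opt in
# with `concurrency_limits = cms_pilot` in their submit description.
CMS_PILOT_LIMIT = 2000
```

The interaction of these three mechanisms is where the tuning effort lies: a throttle set too low recreates the starvation it was meant to prevent, while surplus sharing without a sort expression that favours under-quota groups can reward the most aggressive pilot factory.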