Description
The computing platform of the Institute of High Energy Physics (IHEP) comprises isolated grid sites and local clusters. The grid sites run grid jobs from international experiments, including ATLAS, CMS, LHCb, Belle II, and JUNO, while the local cluster concurrently processes data from IHEP-led experiments such as BES, JUNO, and LHAASO. The two kinds of resources are configured differently, for example in their network segments, file systems, and user namespaces.
The local cluster operates at a job slot utilization above 95%, yet it still shows significant queuing. In contrast, grid site utilization is below 80%. To make better use of these resources, we developed a model that enables worker nodes to run both grid and local cluster jobs.
The model preconfigures the local cluster environment with container technology and starts the local cluster's startd on grid nodes through glidein. By dynamically monitoring the grid site job queue, we schedule suitable local cluster jobs onto idle grid job slots. This flexible model efficiently provides additional computing resources for experiments.
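
The scheduling idea can be illustrated with a minimal Python sketch. It is not the production implementation: the names `GridNode`, `LocalJob`, `plan_assignments`, and the `container_ready` flag are hypothetical, and the policy shown (hold back one idle slot per queued grid job, then fill the remaining idle slots first-fit with local jobs that can run in the preconfigured container) is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GridNode:
    """A grid worker node (hypothetical model, not an HTCondor object)."""
    name: str
    total_slots: int
    busy_slots: int

    @property
    def idle_slots(self) -> int:
        return self.total_slots - self.busy_slots


@dataclass
class LocalJob:
    """A queued local cluster job (hypothetical model)."""
    job_id: str
    cores: int
    container_ready: bool  # assumed flag: job can run in the preconfigured container


def plan_assignments(
    nodes: List[GridNode],
    grid_queue_length: int,
    local_jobs: List[LocalJob],
) -> List[Tuple[str, str]]:
    """Return (job_id, node_name) pairs for local jobs placed on idle grid slots.

    Assumed policy: one idle slot is held back for every grid job that is
    already waiting, so that local jobs never displace grid work.
    """
    reserved = grid_queue_length
    pending = [job for job in local_jobs if job.container_ready]
    assignments: List[Tuple[str, str]] = []

    # Visit the nodes with the most idle slots first.
    for node in sorted(nodes, key=lambda n: n.idle_slots, reverse=True):
        free = node.idle_slots

        # Satisfy the reservation for queued grid jobs before anything else.
        held = min(free, reserved)
        reserved -= held
        free -= held

        # First-fit: place every pending local job that still fits on this node.
        still_pending = []
        for job in pending:
            if job.cores <= free:
                assignments.append((job.job_id, node.name))
                free -= job.cores
            else:
                still_pending.append(job)
        pending = still_pending

    return assignments
```

In a real deployment, each planned assignment would correspond to a glidein that starts the local cluster's startd on the chosen grid node; the sketch covers only the matching decision.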