Description
Flexible workload specification and management are critical to the success of the CMS experiment, which utilizes approximately half a million cores across a global grid computing infrastructure for data reprocessing and Monte Carlo production. TaskChain and StepChain specifications, responsible for over 95% of central production activities, employ distinct workflow paradigms: TaskChain executes a single physics payload per grid job, whereas StepChain processes multiple payloads within the same job. As computing resources grow increasingly heterogeneous, encompassing diverse CPU architectures and accelerators, an adaptive workflow specification is essential for efficient scheduling and resource utilization. To address this challenge, we propose a hybrid workflow composition model that dynamically groups tasks based on resource constraints and execution dependencies. This flexible workload construction enhances the adaptability and efficiency of CMS workload management, ensuring optimized resource allocation in an evolving computational landscape.
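As a rough illustration of the grouping idea, the sketch below merges consecutive tasks of a linear chain into one job when their resource requirements match, and starts a new job when they differ. It is only a minimal sketch under stated assumptions, not the CMS implementation: the `Task` fields, the `group_chain` function, the greedy rule, and the resource labels attached to each task are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    arch: str        # hypothetical CPU architecture label
    needs_gpu: bool  # hypothetical accelerator requirement


def _compatible(a, b):
    # Two tasks can share a job only if their resource needs are identical.
    return (a.arch, a.needs_gpu) == (b.arch, b.needs_gpu)


def group_chain(tasks):
    """Greedily merge consecutive tasks with identical resource needs.

    `tasks` is assumed to be a linear chain ordered by dependency:
    each task consumes the output of the previous one.
    """
    groups = []
    for task in tasks:
        if groups and _compatible(groups[-1][-1], task):
            groups[-1].append(task)  # same job (StepChain-like)
        else:
            groups.append([task])    # new job (TaskChain-like boundary)
    return groups


# Illustrative chain; the resource requirements are made up for the example.
chain = [
    Task("GEN", "x86_64", False),
    Task("SIM", "x86_64", False),
    Task("DIGI", "x86_64", True),
    Task("RECO", "x86_64", True),
]
jobs = [[t.name for t in g] for g in group_chain(chain)]
print(jobs)
```

With identical requirements everywhere, this rule collapses the chain into a single multi-payload job; with different requirements at every step, it yields one payload per job. A real scheduler would also have to weigh factors such as intermediate data staging and job runtime limits, which this sketch ignores.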
| Experiment context, if any | CMS experiment |
|---|---|