Description
In this study, we introduce the JIRIAF (JLAB Integrated Research Infrastructure Across Facilities) system, a prototype of an operational, flexible, and widely distributed computing cluster that leverages readily available resources at Department of Energy (DOE) computing facilities. JIRIAF employs a customized Kubernetes orchestration system designed to integrate geographically dispersed resources into a single, elastic distributed cluster, and it requires no additional infrastructure investment from resource providers. Notably, JIRIAF has demonstrated the ability to process data streams at rates of up to 100 Gbps, enabling real-time data-stream processing across long distances.
Furthermore, we developed a digital representation of workflows (a digital twin) using a Bayesian probabilistic graphical model. This model uses a standard joint probability distribution to represent the probabilities associated with the digital state, including relevant quantities and potential rewards, all derived from observed actions and data. These quantities and rewards are determined using queueing theory, focusing on two critical metrics: the workflow input rate and the processing rate. Our results confirm the efficacy of the JIRIAF digital twin in managing and orchestrating highly distributed workflows, showing its potential to significantly improve computational resource utilization and processing efficiency in complex environments.
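To make the role of the two queueing metrics concrete, the following is a minimal sketch of how a workflow input rate and a processing rate can be turned into state quantities. It assumes a single-server M/M/1 queue, which is an illustrative choice and not necessarily the queueing model used in JIRIAF; the names `QueueMetrics` and `mm1_metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueMetrics:
    """Steady-state quantities of an M/M/1 queue (illustrative assumption)."""
    utilization: float          # fraction of time the processor is busy
    mean_in_system: float       # average number of work items queued or in service
    mean_time_in_system: float  # average time an item spends in the system


def mm1_metrics(input_rate: float, processing_rate: float) -> QueueMetrics:
    """Compute M/M/1 steady-state metrics from the two rates.

    input_rate: workflow input (arrival) rate, items per unit time.
    processing_rate: processing (service) rate, items per unit time.
    """
    if input_rate < 0 or processing_rate <= 0:
        raise ValueError("input_rate must be >= 0 and processing_rate must be > 0")
    rho = input_rate / processing_rate
    if rho >= 1.0:
        # The backlog grows without bound, so no steady state exists.
        raise ValueError("unstable queue: input rate must be below processing rate")
    return QueueMetrics(
        utilization=rho,
        mean_in_system=rho / (1.0 - rho),
        mean_time_in_system=1.0 / (processing_rate - input_rate),
    )


if __name__ == "__main__":
    # Hypothetical rates, chosen only for illustration.
    print(mm1_metrics(input_rate=8.0, processing_rate=10.0))
```

Under this assumption, an input rate approaching the processing rate drives the backlog and latency up sharply, which is the kind of signal a digital twin could use when deciding where to place or scale work.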