Help us make Indico better by taking this survey! Aidez-nous à améliorer Indico en répondant à ce sondage !

19–25 Oct 2024
Europe/Zurich timezone

Surrogate Modeling for Scalable Evaluation of Distributed Computing Systems for HEP applications

TUE 02
22 Oct 2024, 15:18
57m
Exhibition Hall

Exhibition Hall

Poster Track 4 - Distributed Computing Poster session

Speaker

Larissa Schmid (KIT - Karlsruhe Institute of Technology (DE))

Description

The sheer volume of data generated by LHC experiments presents a computational challenge, necessitating robust infrastructure for storage, processing, and analysis. The Worldwide LHC Computing Grid (WLCG) addresses this challenge by integrating global computing resources into a cohesive entity. To cope with changes in the infrastructure and increased demands, the compute model needs to be adapted. Simulations of different compute models present a feasible approach for evaluating different design candidates. However, running these simulations incurs a trade-off between accuracy and scalability. For example, while the simulator DCSim can provide accurate results, it falls short on scalability when increasing the size of the simulated platform. Generative Machine Learning as a surrogate is successfully used to overcome these limitations in other domains that exhibit similar trade-offs between scalability and accuracy, such as the simulation of detectors.

In our work, we evaluate the usage of three different machine learning models as surrogate models for the simulation of distributed computing systems and assess their ability to generalize to unseen jobs and platforms. We show that those models can predict the simulated platforms' main observables derived from the execution traces of compute jobs with approximate accuracy. Potential for further improving the predictions lies in using other machine learning models and different encodings of the platform-specific information to achieve better generalizability for unseen platforms.

Primary authors

Larissa Schmid (KIT - Karlsruhe Institute of Technology (DE)) Maximilian Maria Horzela (KIT - Karlsruhe Institute of Technology (DE)) Valerii Zhyla (KIT - Karlsruhe Institute of Technology (DE)) Manuel Giffels (KIT - Karlsruhe Institute of Technology (DE)) Gunter Quast (KIT - Karlsruhe Institute of Technology (DE)) Anne Koziolek (KIT - Karlsruhe Institute of Technology (DE))

Presentation materials