The sheer volume of data generated by the LHC experiments presents a computational challenge, necessitating robust infrastructure for storage, processing, and analysis. The Worldwide LHC Computing Grid (WLCG) addresses this challenge by integrating global computing resources into a cohesive entity. To cope with changes in the infrastructure and with increasing demands, the compute model needs to be adapted. Simulating compute models is a feasible approach for evaluating design candidates. However, such simulations involve a trade-off between accuracy and scalability: for example, while the simulator DCSim provides accurate results, its scalability falls short as the size of the simulated platform grows. Generative machine learning has been used successfully as a surrogate to overcome such limitations in other domains with a similar trade-off between scalability and accuracy, such as the simulation of detectors.
In our work, we evaluate three machine learning models as surrogates for the simulation of distributed computing systems and assess their ability to generalize to unseen jobs and platforms. We show that these models can predict the main observables of the simulated platforms, derived from the execution traces of compute jobs, with approximate accuracy. Potential for further improving the predictions lies in other machine learning models and in different encodings of the platform-specific information, which could yield better generalizability to unseen platforms.
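To make the surrogate idea concrete, the following is a minimal sketch, not the models evaluated in this work: a regression surrogate trained to map per-job descriptors and a simple platform encoding to a simulated observable such as job runtime. All feature names, the categorical platform encoding, and the synthetic data standing in for DCSim execution traces are illustrative assumptions.

```python
# Minimal surrogate sketch (illustrative only): a regressor predicting a
# simulated observable (job runtime) from hypothetical job/platform features.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_jobs = 5_000

# Hypothetical job descriptors plus a categorical platform id; in the real
# setting these would come from simulation inputs and execution traces.
X = np.column_stack([
    rng.uniform(1e9, 1e12, n_jobs),   # input size [bytes]
    rng.uniform(1e10, 1e13, n_jobs),  # compute work [FLOP]
    rng.integers(1, 9, n_jobs),       # requested cores
    rng.integers(0, 3, n_jobs),       # platform id (categorical encoding)
])

# Synthetic target standing in for a trace-derived observable: runtime [s]
# modeled as compute time + I/O time + noise (purely for demonstration).
y = X[:, 1] / (2e9 * X[:, 2]) + X[:, 0] / 1e8 + rng.normal(0, 5, n_jobs)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
surrogate = GradientBoostingRegressor().fit(X_train, y_train)
print("MAE [s]:", mean_absolute_error(y_test, surrogate.predict(X_test)))
```

The categorical platform id here is one possible encoding of platform-specific information; richer encodings would be one avenue toward the better generalizability to unseen platforms mentioned above.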