19–25 Oct 2024
Europe/Zurich timezone

Automation and Job Management for LZ Simulations at NERSC

23 Oct 2024, 15:18
57m
Room 4

Poster Track 4 - Distributed Computing Poster session

Speaker

Jacopo Siniscalco

Description

The LUX-ZEPLIN (LZ) experiment is a world-leading direct dark matter detection experiment built around a dual-phase xenon Time Projection Chamber (TPC). Its success depends on an in-depth characterization of the relevant backgrounds, which in turn implies a heavy simulation burden. In this talk, I will present the infrastructure developed to allocate and manage the simulation workload on Perlmutter, NERSC’s most recent HPC facility. The pipeline includes a system that automatically generates production configurations from requests submitted by the simulations team, along with utilities to monitor job progress and success. A RabbitMQ queue coordinates job dispatch among a set of workers running on specially allocated compute nodes, allowing fine-grained control over the use of the available computational resources.
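
The request-to-jobs and queue-based dispatch pattern described above can be illustrated with a minimal sketch. It is not the LZ implementation: the standard-library `queue.Queue` stands in for the RabbitMQ broker, threads stand in for workers on compute nodes, and every name here (`JobConfig`, `expand_request`, `run_campaign`, the `example_source` label, the field names) is a hypothetical chosen for illustration.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class JobConfig:
    """One simulation job (hypothetical fields, not the LZ schema)."""
    source: str
    seed: int
    n_events: int


def expand_request(source, total_events, events_per_job, first_seed=0):
    """Split one simulation request into per-job configurations."""
    configs = []
    remaining = total_events
    seed = first_seed
    while remaining > 0:
        n = min(events_per_job, remaining)
        configs.append(JobConfig(source, seed, n))
        remaining -= n
        seed += 1
    return configs


def worker(name, jobs, results, lock):
    """Pull jobs until a None sentinel arrives, recording each outcome."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work for this worker
            jobs.task_done()
            break
        # Placeholder for launching the real simulation executable.
        outcome = (job.seed, name, "done")
        with lock:
            results.append(outcome)
        jobs.task_done()


def run_campaign(configs, n_workers=4):
    """Dispatch configs to n_workers threads through one shared queue."""
    jobs = queue.Queue()  # stand-in for the message broker
    results = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}", jobs, results, lock))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for cfg in configs:
        jobs.put(cfg)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    jobs.join()
    for t in threads:
        t.join()
    return sorted(results)


if __name__ == "__main__":
    configs = expand_request("example_source", total_events=2500, events_per_job=1000)
    for seed, name, status in run_campaign(configs, n_workers=2):
        print(seed, name, status)
```

Because each worker takes the next job only when it is free, the number of workers bounds how many jobs run at once. This is the kind of control over resource use that a shared queue provides. A real deployment would replace the in-process queue with a broker connection and would add acknowledgements and retries for failed jobs.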

Primary author

Jacopo Siniscalco
