24th International Conference on Computing in High Energy & Nuclear Physics

Name: 24th International Conference on Computing in High Energy & Nuclear Physics
Start: 2019-11-04T08:00:00+10:30
End: 2019-11-08T13:00:00+10:30
Location: Adelaide Convention Centre

Australia/Adelaide timezone

Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's by the ATLAS experiment

7 Nov 2019, 11:30

15m

Riverbank R1 (Adelaide Convention Centre)

Oral Track 9 – Exascale Science

Doug Benjamin (Argonne National Laboratory (US))

The ATLAS experiment uses large High Performance Computers (HPCs) and fine-grained simulation workflows (the Event Service) to produce fully simulated events efficiently. ATLAS has developed a new software component, Harvester, which provides resource provisioning and workload shaping. To run effectively on the largest HPC machines, ATLAS developed the Yoda-Droid software to orchestrate the MPI communication between Harvester and the simulation payload running on over 1000 nodes simultaneously. In this way, over 130,000 cores can simultaneously produce simulated Monte Carlo events for ATLAS. The PanDA system also had to be changed to produce "jumbo jobs" capable of simulating over 1 million events per submission to the local HPC scheduling systems.
This presentation will describe in detail the changes made to PanDA to enable jumbo jobs, as well as the Yoda-Droid software. Scaling and efficiency measurements will be presented. Results from the deployment, integration and operation of the new software at the Titan, Cori and Theta HPC machines will be shown.
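
As a rough illustration of the fine-grained idea described above, the sketch below cuts one large job into small event ranges and hands them out to workers one at a time. It is not the Yoda-Droid or PanDA code and uses no MPI; the function names, the range size of 100 events and the worker count of 1000 are assumptions chosen only to match the orders of magnitude quoted in the abstract.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

# (first_event, last_event), both inclusive. Hypothetical representation.
EventRange = Tuple[int, int]


def split_into_ranges(total_events: int, events_per_range: int) -> List[EventRange]:
    """Cut one large job into small contiguous event ranges."""
    ranges = []
    for first in range(0, total_events, events_per_range):
        last = min(first + events_per_range, total_events) - 1
        ranges.append((first, last))
    return ranges


def dispatch(ranges: List[EventRange], n_workers: int) -> Dict[int, List[EventRange]]:
    """Hand out ranges one at a time from a shared queue.

    Workers are served in turn here for simplicity; in a real run each
    worker would ask for a new range whenever it finishes the previous one.
    """
    queue: Deque[EventRange] = deque(ranges)
    assigned: Dict[int, List[EventRange]] = {w: [] for w in range(n_workers)}
    worker = 0
    while queue:
        assigned[worker].append(queue.popleft())
        worker = (worker + 1) % n_workers
    return assigned


# Illustrative numbers only: 1 million events, 100 events per range, 1000 workers.
ranges = split_into_ranges(total_events=1_000_000, events_per_range=100)
assigned = dispatch(ranges, n_workers=1000)
print(len(ranges), len(assigned[0]))
```

Keeping the unit of work small means that a worker which stops early loses at most one range, which is one common motivation for fine-grained dispatch on large machines.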

Consider for promotion	Yes

Doug Benjamin (Argonne National Laboratory (US))
Wen Guan (University of Wisconsin (US))
Tadashi Maeno (Brookhaven National Laboratory (US))
Nicolo Magini (Iowa State University (US))
Paul Nilsson (Brookhaven National Laboratory (US))
Danila Oleynik (Joint Institute for Nuclear Research (RU))
Vakho Tsulaia (Lawrence Berkeley National Lab. (US))
Taylor Childers (Argonne National Laboratory (US))
Martina Javurkova (Albert Ludwigs Universitaet Freiburg (DE))

ATL-COM-SOFT-2019-103.pdf

ATL-COM-SOFT-2019-103.pptx