21–25 May 2012
New York City, NY, USA
US/Eastern timezone

Recent Improvements in the ATLAS PanDA Pilot

22 May 2012, 13:30
4h 45m
Rosenthal Pavilion (10th floor) (Kimmel Center)

Poster
Distributed Processing and Analysis on Grids and Clouds (track 3)
Poster Session

Speaker

Paul Nilsson (University of Texas at Arlington (US))

Description

The Production and Distributed Analysis system (PanDA) in the ATLAS experiment uses pilots to execute submitted jobs on the worker nodes. The pilots are designed to deal with different runtime conditions and failure scenarios, and support many storage systems. This talk will give a brief overview of the PanDA pilot system and will present major features and recent improvements including CernVM File System integration, file transfers with Globus Online, the job retry mechanism, advanced job monitoring including JEM technology, and validation of new pilot code using the HammerCloud stress-testing system. PanDA is used for all ATLAS distributed production and is the primary system for distributed analysis. It is currently used at over 100 sites worldwide. We analyze the performance of the pilot system in processing LHC data on the OSG, LCG and NorduGrid infrastructures used by ATLAS, and describe plans for its further evolution.

Author

Co-authors

Dr Alden Stradling (University of Texas at Arlington (US))
Carlos Contreras (Departamento de Fisica-Univ. Tecnica Federico Santa Maria (UTFSM))
Dr Jose Caballero Bejar (Brookhaven National Laboratory (US))
Kaushik De (University of Texas at Arlington (US))
Maxim Potekhin (Brookhaven National Laboratory (US))
Paul Nilsson (University of Texas at Arlington (US))
Tadashi Maeno (Brookhaven National Laboratory (US))
Tim Dos Santos (Bergische Universitaet Wuppertal (DE))
Dr Torre Wenaus (Brookhaven National Laboratory (US))
