21–25 Aug 2017
University of Washington, Seattle
US/Pacific timezone

ATLAS BigPanDA Monitoring

22 Aug 2017, 16:00
45m
The Commons (Alder Hall)

Poster Track 1: Computing Technology for Physics Research Poster Session

Speaker

Siarhei Padolski (BNL)

Description

BigPanDA monitoring is a web-based application that processes and presents the states of objects in the Production and Distributed Analysis (PanDA) system. Analyzing hundreds of millions of computation entities, such as events and jobs, BigPanDA monitoring builds reports at different scales and levels of abstraction in real time. The information provided allows users to drill down into the cause of a specific event failure or to observe the bigger picture of the system, such as the performance of the computation nuclei and satellites or the progress of a whole production campaign. The PanDA system was originally developed for the ATLAS experiment and today effectively manages more than 2 million jobs per day distributed over 170 computing centers worldwide. BigPanDA monitoring is a core component of PanDA; it was commissioned in mid-2014 and is now the primary source of information for ATLAS users about the state of their computations, as well as the source of decision-support information for shifters, operators and managers. In this work we describe the evolution of the architecture, the current status and the development plans of BigPanDA monitoring.

Primary authors

Torre Wenaus (Brookhaven National Laboratory (US))
Siarhei Padolski (BNL)
Tatiana Korchuganova (National Research Tomsk Polytechnic University (RU))
Alexei Klimentov (Brookhaven National Laboratory (US))

Co-author

Aleksandr Alekseev (National Research Tomsk Polytechnic University (RU))
