21–25 May 2012
New York City, NY, USA
US/Eastern timezone

Distributed monitoring infrastructure for Worldwide LHC Computing Grid

22 May 2012, 13:30
4h 45m
Rosenthal Pavilion (10th floor) (Kimmel Center)

Rosenthal Pavilion (10th floor)

Kimmel Center

Poster Distributed Processing and Analysis on Grids and Clouds (track 3) Poster Session

Speaker

Wojciech Lapka (CERN)

Description

The journey of a monitoring probe from its development phase to the moment its execution result is presented in an availability report is a complex process. It goes through multiple phases such as development, testing, integration, release, deployment, execution, data aggregation, computation, and reporting. Further, it involves people with different roles (developers, site managers, VO managers, service managers, management), from different middleware providers (ARC, dCache, gLite, UNICORE and VDT), consortiums (WLCG, EMI, EGI, OSG), and operational teams (GOC, OMB, OTAG, CSIRT). The seamless harmonization of these distributed actors is in daily use for monitoring of the WLCG infrastructure. In this paper we describe the monitoring of the WLCG infrastructure from the operational perspective. We explain the complexity of the journey of a monitoring probe from its execution on a grid node to the visualization on the MyWLCG portal where it is exposed to other clients. This monitoring workflow profits from the interoperability established between the SAM and RSV frameworks. We show how these two distributed structures are capable of uniting technologies and hiding the complexity around them, making them easy to be used by the community. Finally, the different supported deployment strategies, tailored not only for monitoring the entire infrastructure but also for monitoring sites and virtual organizations, are presented and the associated operational benefits highlighted.

Primary author

Co-authors

Presentation materials

There are no materials yet.