Deploying a new realtime XRootD-v5 based monitoring framework for GridPP

19 May 2021, 10:50
13m
Short Talk
Distributed Computing, Data Management and Facilities
Facilities and Networks

Speaker

Dr Robert Andrew Currie (The University of Edinburgh (GB))

Description

Optimising the performance of distributed computing requires small, lightweight storage caches that integrate with existing grid computing workflows. An XRootD proxy cache is a good way to provide such caches. To support lightweight XRootD proxy services distributed across GridPP, we have developed a centralised monitoring framework.

With the v5 release of XRootD, it is possible to build a monitoring framework that collects caching metadata broadcast from multiple distributed sites. To provide the best support for these distributed caches, we have built a centralised monitoring service for XRootD storage instances within GridPP. This monitoring solution builds on the experience presented by CMS in setting up a similar service as part of their AAA system. The new framework is designed to provide remote monitoring of the behaviour, performance, and reliability of distributed XRootD services across the UK. Effort has been made to simplify deployment for remote site administrators.
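As a rough illustration of what collecting monitoring data broadcast from multiple sites can involve, the sketch below receives UDP packets and tallies them per sending site and record type. It is not the framework described in this talk. The function names (`parse_header`, `summarise`, `listen`) and the port number are hypothetical, and the 8-byte header layout (record code, sequence number, packet length, server start time, in network byte order) is an assumption based on the XRootD monitoring documentation that should be checked against the version deployed.

```python
import socket
import struct
from collections import Counter

# Assumed layout of the XRootD monitoring packet header (verify against the
# XRootD monitoring documentation): 1-byte record code, 1-byte sequence
# number, 2-byte packet length, 4-byte server start time, network byte order.
HEADER = struct.Struct("!cBHi")


def parse_header(packet: bytes):
    """Return the header fields as a dict, or None if the packet is too short."""
    if len(packet) < HEADER.size:
        return None
    code, pseq, plen, stod = HEADER.unpack_from(packet)
    return {"code": code.decode("latin-1"), "pseq": pseq, "plen": plen, "stod": stod}


def summarise(packets):
    """Count packets per (site, record code); short packets count as 'malformed'."""
    counts = Counter()
    for site, payload in packets:
        header = parse_header(payload)
        key = "malformed" if header is None else header["code"]
        counts[(site, key)] += 1
    return counts


def listen(host="0.0.0.0", port=9930, max_packets=100):
    """Collect up to max_packets UDP datagrams; the port is a hypothetical choice."""
    received = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while len(received) < max_packets:
            payload, (addr, _port) = sock.recvfrom(65535)
            # The sender address stands in for the site name in this sketch.
            received.append((addr, payload))
    return received
```

Calling `summarise(listen())` would give a per-site count of each record type seen. A production collector would also decode the packet bodies and forward them to a time-series store or dashboard backend, which this sketch leaves out.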

The result of this work is an interactive dashboard system that gives administrators access to real-time metrics on the performance of their lightweight storage systems. The monitoring framework is intended to supplement existing functionality and availability testing metrics by providing detailed information and logging from a site perspective.

Primary author

Dr Robert Andrew Currie (The University of Edinburgh (GB))

Co-author

Wenlong Yuan (The University of Edinburgh (GB))
