CHEP 07

Name: CHEP 07
Start: 2007-09-02T08:00:00+02:00
End: 2007-09-09T12:00:00+02:00
Location: Victoria, Canada

Europe/Zurich timezone

Please book accommodation as soon as possible.

Support

chep07-support@triumf.ca

Grid reliability

3 Sept 2007, 14:00

20m

Carson Hall C (Victoria, Canada)

Oral presentation: Grid middleware and tools

Pablo Saiz (CERN)

Thanks to the grid, users have access to computing resources distributed all over the world. The grid hides the complexity and the differences of its heterogeneous components. For this to work, it is vital that all the elements are set up properly and that they can interact with each other. It is also very important that errors are detected as soon as possible, and that the procedure to solve them is well established. Our goal is to improve the performance of the grid. To do this, we studied two of its main elements: the workload and the data management systems. We developed the tools needed to investigate the efficiency of the different centres. Furthermore, our tools can be used to categorize the most common error messages and to measure how their frequency evolves over time. One common reason for job failures is site misconfiguration. Detecting such a misconfiguration as soon as possible helps in several ways: first, it minimizes the time needed to bring the site back to a normal state; moreover, debugging is easier, since the problem happened in the recent past. This can be especially helpful for new centres, since the tools provide the material needed to get a better understanding of the grid's complexity. In this contribution we will describe the tools that we have developed to monitor grid efficiency. These tools are currently used by the four LHC experiments. We will also describe the results and benefits that the tools have provided.

Benjamin Gaidioz (CERN), Gerhild Maier (CERN), Juha Herrala (CERN), Julia Andreeva (CERN), Pablo Saiz (CERN), Ricardo Rocha (CERN), Catalin Cirstoiu (CERN)

Paper

grid_reliability.pdf

Slides

GR_CHEP07.pdf

GR_CHEP07.ppt
