21–25 May 2012
New York City, NY, USA
US/Eastern timezone

Experience in Grid Site Testing for ATLAS, CMS and LHCb with HammerCloud

22 May 2012, 16:35
25m
Room 914 (Kimmel Center)

Parallel: Distributed Processing and Analysis on Grids and Clouds (track 3)

Speaker

Daniel Colin Van Der Ster (CERN)

Description

Frequent validation and stress testing of the network, storage and CPU resources of a grid site are essential to achieving high performance and reliability. HammerCloud was previously introduced with the goal of enabling VO and site administrators to run such tests in an automated or on-demand manner. The ATLAS, CMS and LHCb experiments have all developed VO plugins for the service and have successfully integrated it into their grid operations infrastructures. This work presents the experience of running HammerCloud at full scale for more than three years, along with solutions to the scalability issues faced by the service. First, we will show the particular challenges faced when integrating with CMS and LHCb offline computing, including customized dashboards that show site validation reports for the VOs and a new API that integrates tightly with the LHCbDIRAC Resource Status System. Next, a study of the automatic site exclusion component used by ATLAS will be presented, along with results from tuning the exclusion policies. We will then present a study of the historical test results for ATLAS, CMS and LHCb, including comparisons between the experiments' grid availabilities and a search for site-based or temporal failure correlations. Finally, we will look at future plans that will allow users to gain new insights into the test results; these include developments to allow increased testing concurrency, an increased number of metrics recorded per test job (up to hundreds), and a larger volume of historical job information (up to many millions of jobs per VO).

Primary author

Co-authors

Dr Andrea Sciaba (CERN)
Federica Legger (Ludwig-Maximilians-Univ. Muenchen (DE))
Johannes Elmsheuser (Ludwig-Maximilians-Univ. Muenchen (DE))
Mario Ubeda Garcia (CERN)
Ramon Medrano Llamas (Universidad de Oviedo (ES))