14–18 Oct 2013
Amsterdam, Beurs van Berlage
Europe/Amsterdam timezone

CMS experience of running glideinWMS in High Availability mode

14 Oct 2013, 15:00
45m
Grote zaal (Amsterdam, Beurs van Berlage)

Poster presentation
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Poster presentations

Speaker

Mr Igor Sfiligoi (University of California San Diego)

Description

The CMS experiment at the Large Hadron Collider relies on the HTCondor-based glideinWMS batch system to handle most of its distributed computing needs. To minimize the risk of disruptions due to software and hardware problems, and to simplify maintenance procedures, CMS has set up its glideinWMS instance to use most of the attainable High Availability (HA) features. The setup involves running services distributed over multiple nodes, which in turn are located in several physical locations, including Geneva, Switzerland; Chicago, Illinois; and San Diego, California. This paper describes the setup used by CMS, the HA limits of this setup, and the actual operational experience spanning many months.

Primary author

Mr Igor Sfiligoi (University of California San Diego)

Co-author

Ian Fisk (Fermi National Accelerator Lab. (US))
