27 September 2004 to 1 October 2004
Interlaken, Switzerland
Europe/Zurich timezone

Use of Condor and GLOW for CMS Simulation Production

27 Sept 2004, 17:30
20m
Ballsaal (Interlaken, Switzerland)


Oral presentation, Track 5 - Distributed Computing Systems and Experiences

Speaker

S. Dasu (UNIVERSITY OF WISCONSIN)

Description

The University of Wisconsin distributed computing research groups developed a software system called Condor for high throughput computing using commodity hardware. An adaptation of this software, Condor-G, is part of the Globus grid computing toolkit. However, the original Condor has additional features that allow building an enterprise-level grid. Several UW departments have Condor computing pools that are integrated in such a way as to flock jobs from one pool to another as resources become available. An interdisciplinary team of UW researchers recently built a new distributed computing facility, the Grid Laboratory of Wisconsin (GLOW). In total, the Condor pools at UW have about 2000 Intel CPUs (P-III and Xeon) available for scientific computation. By exploiting special features of Condor such as checkpointing and remote I/O, we have generated over 10 million fully simulated CMS events. We were able to harness about 260 CPU-days per day for a period of 2 months when we were operational in late fall. We have scaled to using 500 CPUs concurrently when there was an opportunity to exploit unused resources in laboratories on our campus. We have built a scalable job submission and tracking system called Jug, using Python and MySQL, which enabled us to run hundreds of jobs simultaneously. Jug also ensured that the generated data was transferred to the US Tier-I center at Fermilab. We have also built a portal to our resources and participated in the Grid2003 project. We are currently adapting our environment to provide analysis resources. In this paper we will discuss our experience and observations regarding the use of opportunistic resources, and generalize them to a wider grid computing context.

Primary authors

D. Bradley (UNIVERSITY OF WISCONSIN)
M. Livny (UNIVERSITY OF WISCONSIN)
S. Dasu (UNIVERSITY OF WISCONSIN)
S. Rader (UNIVERSITY OF WISCONSIN)
V. Puttabuddhi (UNIVERSITY OF WISCONSIN)
W. Smith (UNIVERSITY OF WISCONSIN)
