21-27 March 2009
Prague

Interoperability and Scalability within glideinWMS

26 Mar 2009, 08:00
1h

Prague Congress Centre, 5. května 65, 140 00 Prague 4, Czech Republic
Board: Thursday 025
Type: poster
Track: Grid Middleware and Networking Technologies
Session: Poster session

Speaker

Daniel Bradley (University of Wisconsin)

Description

Physicists have access to thousands of CPUs in grid federations such as OSG and EGEE. With the start-up of the LHC, it is essential for individual users and groups to combine available resources from multiple sites across multiple grids under a higher-level, user-controlled layer that presents a homogeneous pool of resources. One such system is glideinWMS, which is based on the Condor batch system. A general discussion of glideinWMS can be found elsewhere. Here, we focus on recent advances in extending its reach: scalability and the integration of heterogeneous compute elements. We demonstrate that the new developments achieve the design goal of over 10,000 simultaneously running jobs under a single Condor schedd, using strong security protocols across global networks and sustaining a steady-state job completion rate of a few Hz. We also show interoperability across heterogeneous compute elements, achieved using client-side methods. We discuss this technique and the challenges of direct access to NorduGrid and CREAM compute elements, in addition to Globus-based systems.

Primary authors

Daniel Bradley (University of Wisconsin), Igor Sfiligoi (Fermilab), Jaime Frey (University of Wisconsin), Sanjay Padhi (University of California, San Diego), Todd Tannenbaum (University of Wisconsin)
