Joint EGEE and OSG Workshop on VO Management in Production Grids
HPDC 2008, Boston, MA, USA
Erwin Laure (CERN/EGEE), Miron Livny (Univ. of Wisconsin, Madison)
With the establishment of large-scale multidisciplinary production Grid infrastructures such as EGEE, OSG, DEISA, TeraGrid, and NAREGI, the concept of Virtual Organizations (VO) has been constantly refined, and efficient management of VOs and their policies is becoming one of the central topics for these infrastructures.
This workshop is the third in the series on topics in production Grids that was initiated at HPDC-15 with the workshop on Management of Rights in Production Grids and followed by the workshop on Data Handling in Production Grids at HPDC-16.
This workshop will bring together practitioners and researchers on all aspects of VO management to discuss the capabilities of existing technologies, identify areas where new functionality is needed, and explore how the latest research results can be integrated into the software stack and procedures of production Grids.
Topics include, but are not restricted to:
Scalable VO management systems, establishment and closure of VOs
Mapping VO policies to the underlying resources
Managed vs. opportunistic resource usage, usage across VOs
Erwin Laure (CERN), Miron Livny (Univ. of Wisconsin, Madison)
The Sociology of the Grid (1h)
(Joint work with Carl Kesselman, USC/ISI)
The term virtual organization (VO), when used to denote a dynamic collection of individuals, institutions, and resources united by some common interest or task, has emerged as a popular, and presumably useful, organizing principle in distributed systems. It is common to see systems being deployed to support one or more VOs, policies may be expressed in terms of VOs, and services are required to support the creation and evolution of VOs.
Unfortunately, the popularity of the term has led to a lack of clarity in its meaning: at the limit, a VO could variously denote a multi-decadal scientific collaboration, a commercial outsourcing relationship, a weblog, or an email exchange between two individuals. Yet presumably these different scenarios vary greatly in their requirements for IT infrastructure support, security, reliability, performance, cost, and so on, and may benefit from different technical solutions. This lack of clarity hinders both communication and the identification of required tools.
Thus, we seek in this talk to clarify the VO concept and its implications for distributed system implementation--to define a "sociology of the grid."
VO Management in Production Grids
User management in DEISA (30m)
DEISA started in 2004 as an infrastructure project of European HPC centres and currently continues as DEISA2. The objective is to enhance HPC opportunities for user communities by providing easy access to all systems. The vision is to have single sign-on functionality and easy-to-understand authorisation policies. This presentation will discuss how the DEISA user administration is set up and which policies are implemented for authentication and authorization. It will also discuss the possibilities for interoperability with other infrastructures from the authentication and authorization perspective.
Virtual Organization Management in EGEE (30m)
EGEE (Enabling Grids for E-sciencE) operates a large-scale production Grid infrastructure federating over 250 sites from 48 countries world-wide, providing over 45000 CPUs and about 15 PB of disk storage to a wide variety of scientific applications. Users in EGEE are organized in over 200 Virtual Organizations (VOs), which allow them to collectively gain access to resources allocated to the VO. In this talk we discuss how EGEE defines the VO concept, including different classes of VO based on their geographic reach, and explain how VOs can be set up, how resource usage is negotiated, and what implications this has for sites.
The OSG – a VO Centric VO (30m)
The blueprint for the Open Science Grid (OSG) that was developed four years ago states that “The OSG architecture is Virtual Organization based”. A VO is considered a party to contracts between Resource Providers and VOs, which govern resource usage and policies, and may consist of sub-VOs which operate under the contracts of the parent. We will discuss how this VO-centric view has influenced the organizational structure of the OSG, the way we interact with other grids, the services we provide, and our day-to-day operational procedures.
Miron Livny (Univ. of Wisconsin, Madison)
TeraGrid Science Gateways (30m)
The TeraGrid, formed in 2003, is one of the world's largest open academic
high performance computing grids. The Science Gateway program,
developed in 2005, is a fundamentally new approach to providing high
performance computing, storage and visualization capabilities to
researchers. The program enables entire communities of users associated
with a common scientific goal to use national resources through a common
interface. Science gateways are enabled by a community allocation whose
goal is to delegate account management, accounting, certificate
management, and user support to the gateway developers. This talk will
describe the Gateway approach to VO management.
Tools and Techniques
VOMS is a VO management tool responsible for categorizing users into different groups, and assigning them attributes to be used for the authorization process.
Dr Vincenzo Ciaschini (INFN CNAF)
The Virtual Organization Management Registration Service (VOMRS), developed at Fermilab, provides a comprehensive set of services that facilitates management of VO membership and privileges. VOMRS is compliant with all major JSPG requirements. It implements a registration workflow, VO usage policy acceptance, and event notification. VOMRS allows multiple grid certificates per member and supports membership and certificate status. It is capable of interfacing with third-party systems (such as the CERN Human Resource database, Fermilab CNAS, etc.) to pull relevant member information from them. VOMRS can be synchronized with VOMS. VOMRS is deployed and used in production at Fermilab, CERN, BNL, Texas Tech, DESY and APAC.
We are currently discussing with the VOMS team the possibility of converging on a single virtual organization management solution that addresses the use cases implemented by VOMRS and VOMS. We will talk about the advantages of this approach and present the revised requirements for the VO Registration Service and the features required in a new VOMS implementation.
VO Authorization in EGEE (30m)
Access to resources offered through the EGEE infrastructure is governed by VO membership. This coarse-grained authorization enables basic usage; however, larger VOs need more fine-grained authorization mechanisms based on groups and roles within a VO. In this talk we give some motivating application examples, discuss short-term solutions being put in place on the EGEE infrastructure, and give an outlook on a revised authorization service that is currently being designed.
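As a rough illustration of the difference between coarse-grained (VO membership only) and fine-grained (group and role) authorization, the following minimal sketch checks a user's attributes against a required attribute. It is not the EGEE authorization service. The attribute strings follow the path-like "/vo/group/Role=name" style commonly associated with VOMS attributes, but the exact syntax, the matching rule (a subgroup member also counts as a member of the parent group), and all names such as `examplevo`, `parse_fqan` and `is_authorized` are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fqan:
    """A parsed attribute of the assumed form /vo/group/.../Role=name."""
    groups: tuple          # e.g. ("examplevo", "production")
    role: Optional[str]    # None when no role (or Role=NULL) is present


def parse_fqan(text: str) -> Fqan:
    """Split an attribute string into its group path and optional role."""
    groups, role = [], None
    for part in (p for p in text.split("/") if p):
        if part.startswith("Role="):
            value = part[len("Role="):]
            role = None if value == "NULL" else value
        elif part.startswith("Capability="):
            continue  # ignored in this sketch
        else:
            groups.append(part)
    return Fqan(tuple(groups), role)


def is_authorized(user_fqans, required: str) -> bool:
    """Return True if any user attribute satisfies the required one.

    Assumed policy: the user's group path must start with the required
    group path, and a required role (if any) must match exactly.
    """
    need = parse_fqan(required)
    for text in user_fqans:
        have = parse_fqan(text)
        in_group = have.groups[:len(need.groups)] == need.groups
        role_ok = need.role is None or have.role == need.role
        if in_group and role_ok:
            return True
    return False


# Hypothetical user: plain VO member plus manager of a "production" group.
user = [
    "/examplevo/Role=NULL/Capability=NULL",
    "/examplevo/production/Role=manager",
]
coarse = is_authorized(user, "/examplevo")                       # VO membership only
fine = is_authorized(user, "/examplevo/production/Role=manager")  # group and role
```

A resource that only requires "/examplevo" corresponds to the coarse-grained case described above, while requiring a group path and role restricts access to a subset of the VO.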
GlideinWMS is a pilot-based Workload Management System built on top of the Condor batch system. The glideinWMS extensions are responsible for configuring the pilot jobs and sending them to the appropriate Grid sites using Condor-G. The user jobs are handled by the Condor batch system, and the pilot itself is just a properly configured Condor process.
The glideinWMS is sponsored by USCMS and developed at Fermilab. It is currently being used by CMS and MINOS. CDF is currently using a slightly different flavor of a glidein system, but is planning to switch to the glideinWMS.
Security and VO management enhancements in Panda (30m)
Panda is a comprehensive Workload Management System that performs aggregation of job requests, allocation of Grid resources to requests according to pre-defined criteria, and tracking of job execution. Central to the concept of Panda is the use of pilot jobs, which probe the environment on the remote worker node before pulling down the payload job from the server and executing it. This design allows for improved logging and monitoring capabilities, made possible by the pilot being a "smart wrapper" for the payload job. Recently, we enhanced the overall security of the Panda system by optionally allowing the pilot job to change UID using the glexec wrapper. The change of UID is based on the credentials of the end user (requestor) of the payload job, which are deposited on a caching service (MyProxy) and retrieved by the pilot prior to payload execution. This brings our product into alignment with policies agreed upon by both OSG and EGEE. In addition, work is under way to implement a resource allocation mechanism within Panda, based on user-VO affiliation and the resources allocated by individual sites to VOs.
Dr Maxim Potekhin (Brookhaven National Laboratory)