10–14 Oct 2016
San Francisco Marriott Marquis
America/Los_Angeles timezone

Developing the Traceability Model to meet the Requirements of an Evolving Distributed Computing Infrastructure

11 Oct 2016, 15:30
1h 15m
San Francisco Marriott Marquis

Poster Track 8: Security, Policy and Outreach Posters A / Break

Speaker

Ian Collier (STFC - Rutherford Appleton Lab. (GB))

Description

The growing use of private and public clouds and of volunteer computing is driving significant changes in the way large parts of the distributed computing for our communities are carried out. Traditionally, HEP workloads within WLCG were run almost exclusively via grid computing at sites where site administrators are responsible for, and have full sight of, the execution environment. The experiment virtual organisations (VOs) are increasingly taking more control of those execution environments. In addition, the development of container and control group technologies offers new possibilities for isolating processes and workloads.

The absolute requirement remains for detailed information allowing incident response teams to answer the basic questions of who did what, when and where. But we can no longer rely on central logging from within the execution environment at resource providers' sites to provide all of that information. Certainly, in the case of commercial public cloud providers, that information is unlikely to be accessible at all. Shifting the focus to the externally observable behaviour of processes (including virtual machines and containers) and looking to the VO workflow management systems for user identification would be one approach to ensuring the required traceability.

The newly created WLCG Traceability & Isolation Working Group is investigating the feasibility of the technologies involved and developing technical requirements, both for gathering traceability information from VO workflow management systems and for technologies that separate and isolate processes, so protecting users and their data from one another.

We discuss the technical requirements as well as the policy issues raised and make some initial proposals for solutions to these problems.

Primary Keyword: Security and policies
Secondary Keyword: Distributed workload management

Primary author

Ian Collier (STFC - Rutherford Appleton Lab. (GB))

Co-authors

Andrew David Lahiff (STFC - Rutherford Appleton Lab. (GB))
Dave Kelsey (STFC - Rutherford Appleton Lab. (GB))
Romain Wartel (CERN)
Vincent Brillault (CERN)
