pre-GDB [WLCG Auth & IAM users Meeting] *remote-only*

Europe/Zurich
31/S-028 (CERN)

Thomas Dack
Description

This month's pre-GDB will focus on Authentication and Authorization work done by both the WLCG AuthZ WG and the wider community. The meeting will be split into two afternoons, with two general themes: an INDIGO IAM Users Meeting and a WLCG AuthZ meeting.

Monthly meeting of the WLCG Grid Deployment Board. See also the Twiki GDB area for actions and summaries.

Participants
  • Adrian Crenshaw
  • Alastair Basden
  • Alessandra Forti
  • Alison Packer
  • Alvaro Fernandez Casani
  • Andrea Ceccanti
  • Andrea Rendina
  • Andrew McNab
  • Andrzej Olszewski
  • Baptiste Grenier
  • Benjamin Jacobs
  • Daniele Lattanzio
  • Darren Moore
  • David Crooks
  • David Kelsey
  • Dennis van Dok
  • Diana Gudu
  • Dirk Jahnke-Zumbusch
  • Doina Duma
  • Dominik František Bučík
  • Doug Benjamin
  • Emmanouil Vamvakopoulos
  • Enrico Vianello
  • Eric Fede
  • Eric Yen
  • Federica Agostini
  • Federico Fornari
  • Greg Corbett
  • Hannah Short
  • Ian Collier
  • Ian Johnson
  • James Walder
  • Jens Jensen
  • Joao Antonio Tomasio Pina
  • John Good
  • John Steven De Stefano Jr
  • Jose Caballero Bejar
  • Jose Flix Molina
  • Joshua Drake
  • Kyle Pidgeon
  • Linda Ann Cornwall
  • Lucia Morganti
  • Maarten Litmaath
  • Maiken Pedersen
  • Marcelo Vilaça Pinheiro Soares
  • Martin Barisits
  • Mary Hester
  • Matt Snyder
  • Michel Jouvin
  • Mischa Sallé
  • Muhammad Aleem Sarwar
  • Nick Evangelou
  • Nicolas Liampotis
  • Ofer Rind
  • Oxana Smirnova
  • Patrick Fuhrmann
  • Paul Millar
  • Peter van der Reest
  • Raja Nandakumar
  • Renato Santana
  • Riccardo Di Maria
  • Rizart Dona
  • Rob Appleyard
  • Roberta Miccoli
  • Rohini Joshi
  • Rose Cooper
  • Sam Glendenning
  • Samuel Cadellin Skipsey
  • Stefano Dal Pra
  • Sven Gabriel
  • Thomas Birkett
  • Thomas Dack
  • Thomas Hartmann
  • Timothy John Noble
  • Valeria Ardizzone
  • Will Furnell
Zoom Meeting ID
63063740502
Host
Hannah Short
    • Day 1: INDIGO IAM Users Meeting
      • 2:00 PM
        Welcome & Introductions
      • 1
        INDIGO IAM: status and development roadmap
        Speaker: Andrea Ceccanti (Universita e INFN, Bologna (IT))
      • 2
        Multi-Factor Auth for the IAM
        Speaker: Sam Glendenning (Science and Technology Facilities Council)
      • 3:05 PM
        Break
      • 3
        Command Line OIDC Authentication with PAM
        Speaker: Jens Jensen (STFC)
      • 4
        Federated access for SSH with OIDC
        Speaker: Diana Gudu
      • 5
        Tokens in dCache
        Speaker: Paul Millar
      • 4:20 PM
        Break
      • 6
        Federated access for SSH with MFA in ELIXIR AAI
        Speaker: Dominik František Bučík (ICS MUNI)
      • 7
        CERN IAM Deployment Update (WLCG VOs)
        Speaker: Hannah Short (CERN)
      • 8
        Discussion Session: Policy for IAM

        Planned Topics
        - The role of IGTF in Token Issuer governance
        - Security evaluation of token based infrastructures

        Speaker: Hannah Short (CERN)

        Notes Policy IAM pre-GDB

        • DavidC: could imagine that a small number of trusted issuers means policy is not so essential at the moment. But what if the number grows?
        • Emmanouil: In the new system we are putting more distance between token issuer and infrastructure. Credential closer to the institute than the grid.
        • Douglas: IGTF is a clearing house where rules are followed. We are going to have to deal with multiple issuers; having an IGTF equivalent makes life a lot easier. What about when tokens are coming from e.g. Google?
          • DavidC: New Working Group is good place to have this discussion about non-IGTF certs
          • Alessandra: IGTF needs to evolve to include commercial clouds, and consider what happens to national CAs when no longer really needed for user certs (still worth it for host certs?)
        • Maarten: agree with Emmanouil, shift towards trusting VOs more. No longer the distributed aspect. Previously there were no checks that VOMS services were run properly (Hannah: to be checked). IAM is taking more responsibility. Sites should have some concern about this. We need, at the very least, a set of good practices for running a token issuer
        • Andrea: CAs used to issue auth credentials, correct comparison would be the originating IdP (e.g. CERN SSO). IAM is more similar to VOMS server. Must ensure that the infrastructure knows how to handle tokens, e.g. validates them, checks LoA. Policy at the site/service level. We will need to build on JWT profile to cover policy aspects that would be standardised in e.g. IGTF. Policies must support modern deployment models.
        • Oxana: we will not be given money for hardware, we will have no choice about who we trust as a host. Small sites will eventually disappear. The sites will be more reluctant to trust us, e.g. HPC sites may not trust all CERN Users. We will need to engage with the external community, particularly the resource providers.
          • Alessandra: given that the resources will have more power, does IGTF still make sense? All infrastructure was based on issuing thousands of certificates.
        • Maarten: several ongoing projects to see how WLCG can work better with HPC centres. These resource providers may not collaborate on security investigations. The resources trust task force will be looking at these issues.
        • Douglas: how are we going to ensure interoperability with multiple token issuers? E.g. DUNE is non-WLCG, will this work?
          • Andrea: there is a test suite, first implementation exists
          • Maarten: next JWT schema will be less WLCG-centric to aid interoperability
        • Paul: we should set up a conformance test suite for token issuers
          • Andrea: some work is ongoing in the data lake project, but it is minimal
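
        The point raised in the discussion that services must know how to handle tokens (validate them, apply site-level policy on trusted issuers) can be illustrated with a minimal sketch. It is not an implementation of the WLCG JWT profile. The issuer URL and the function names are hypothetical, and the sketch deliberately skips signature verification, which a real service must perform against the issuer's published keys before trusting any claim.

```python
import base64
import json
import time

# Hypothetical site policy: the set of token issuers this service trusts.
TRUSTED_ISSUERS = {"https://iam.example.org/"}


def decode_claims_unverified(token):
    """Decode the payload of a compact JWT WITHOUT verifying its signature.

    Illustration only: a real service must verify the signature first.
    """
    payload_b64 = token.split(".")[1]
    # base64url payloads usually omit padding, so restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def passes_site_policy(claims, now=None):
    """Accept only unexpired tokens from an issuer on the trusted list."""
    now = time.time() if now is None else now
    return claims.get("iss") in TRUSTED_ISSUERS and claims.get("exp", 0) > now
```

        Keeping the trusted issuer list as explicit site configuration mirrors the notes above: the policy decision stays at the site or service level rather than inside the token issuer.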
      • 5:55 PM
        Open/further discussion & Wrap Up
    • Day 2: AuthZ Discussion Sessions
      • 2:00 PM
        Welcome & Introductions
      • 9
        Session 1: Generation of Gridmap Files using IAM
        Speakers: David Crooks (UKRI STFC), Hannah Short (CERN)

        Notes grid map file

        • Do we need to be tracking gridmap file usage at the moment?
        • We need to test during this transition period where we have IAM and VOMS available in parallel.
        • Two main use cases, CASTOR and EOS
        • Can no longer just have open access to this information due to GDPR
          • Already in VOMS, special privileges are needed
          • Should anyone be able to see who is in the CMS VO just because they have a certificate? You can browse the list of DNs but cannot get any more information at the moment
          • In IAM must have a registered, authorised client
        • IAM API has been used by ESCAPE and EOS will use soon
        • Will want to do similar with token subjects
        • Mapping to local accounts is done per service, VOMS releases DNs and groups/roles. Logic has to be done by service
        • Does IAM have a `role` available that a site could use to select a subset of users?
        • Roles (i.e. optional groups) and principal group exist in IAM (backwards compatible)
        • At the last hackathon there was agreement that it would be good to define an interface similar to LCMAPS
        • What are we missing?
          • IAM does already support generating the same data for DNs
            • Will need better interfaces for client registration and management
          • We are missing an agreement on how the mapping will look for token based use cases
          • Petr wants support for user directories in token based model, not clear what storage should provide. Does the API provide what is needed? Need to provide some guidance on home directory use
          • Support for capabilities is more complex
        • IAM can be extended as we need but we need to be careful not to put service level logic into IAM
        • Next steps
          • Doc with
            • Current spec (Maarten?)
            • IAM backwards compatible spec (Andrea)
            • IAM token spec
            • Any edge cases we need to consider
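
        As a rough illustration of the DN-based use case discussed above, the sketch below turns a list of user records into grid-mapfile lines of the form `"<DN>" <account>`. The record shape (`dns`, `groups`), the rule list and all names are assumptions made for this example. They do not describe the IAM API or any agreed spec. The mapping rules live on the service side, in line with the note that service-level logic should not be put into IAM.

```python
def gridmap_lines(users, rules):
    """Yield grid-mapfile lines: "<certificate DN>" <local account>.

    users: list of dicts with hypothetical keys "dns" and "groups".
    rules: ordered list of (group, account) pairs; the first match wins.
    """
    for user in users:
        # Service-side mapping decision: pick the first rule the user satisfies.
        account = next(
            (acct for group, acct in rules if group in user["groups"]),
            None,
        )
        if account is None:
            continue  # no rule matches, so this user is not mapped
        for dn in user["dns"]:
            yield f'"{dn}" {account}'


if __name__ == "__main__":
    # Hypothetical example data, not real users or accounts.
    users = [
        {"dns": ["/DC=org/DC=example/CN=Alice Example"], "groups": ["cms", "cms/production"]},
        {"dns": ["/DC=org/DC=example/CN=Bob Example"], "groups": ["atlas"]},
    ]
    rules = [("cms/production", "cmsprod"), ("cms", "cms001")]
    for line in gridmap_lines(users, rules):
        print(line)
```

        Ordering the rules from most to least specific is one simple way to let an optional group or role select a subset of users. How the equivalent mapping should look for token subjects remains open, as noted above.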

        Actions

        • Maarten to go through comments on timeline
        • Hannah to start a doc and send it around to the list
      • 3:05 PM
        Break
      • 10
        Session 2: Traceability & Suspension
        Speaker: David Crooks (UKRI STFC)

        Notes traceability and suspension

        • Particularly interested in site perspectives
        • What does central suspension mean? Overlap with how workflows are implemented, e.g. jobs should not just be checked on entry
        • We need to define what should be blockable
        • Bar should be sufficiently high but not overly expensive
        • We need to be reactive, reachability as well as traceability. Must define “reasonable time scale”
        • Need tools to be alerted very quickly as well
        • Better to block first and ask questions later
        • Pilot jobs mean that we have to resort to wide blocking
        • Focus on person to person processes rather than technical measures (to a reasonable extent)
          • Would rather spend lots of time here than create tools that don’t work
          • Need to get these contacts clear and available
        • Because we have more of a gap between the grid and the end IdP blocking might take longer, though can block quickly within the grid domain
        • Traceability working group was working on this topic and is now on hiatus
        • Next steps
          • David to start a doc with
            • understanding of what needs to be blocked
            • schematic of person to person contacts and processes
            • Ensure traceability WG outcomes are considered
      • 4:15 PM
        Break
      • 11
        Session 3: ARC needs for tokens
        Speaker: Hannah Short (CERN)

        Notes ARC CE

        • Used like any other CE: payloads are submitted with no data handling. VOs handle file transfers and credentials.
        • But also can offer more features, used particularly by HPC. e.g. NDGF. Have to play by rules of local sites i.e. no pilots. Worker nodes may be offline. ARC CE does pre-fetching of data (normally cached). Also output data handling (authorisation required) and cache cleanup.
        • File name list input with job description, states what needs to exist at the site for the job to run. Output files may be generated by the job, not always known in advance.
        • ARC CE workflow
          • Downloads files if they don’t already exist (delegated authZ with VOMS roles)
          • When all inputs in place, job goes to batch system. Executed off-line, doesn’t normally need access to credentials
          • Job leaves Batch system, then ARC CE takes over and uploads files to destination. At this point credential renewal is often needed
          • Credentials may be needed for accounting, cache clean up or cancellation etc.
        • Different kind of jobs; production, monitoring etc but share general workflow
        • We need people from experiment frameworks to participate in this discussion
        • AuthZ WG should come out with some recommendations
          • Some are already out there since e.g. HTCondor has implemented them
        • Hackathon might be appropriate to make progress (last CE hackathon https://indico.cern.ch/event/1032742/) see GlideinWMS slides
        • We started defining capabilities for compute and storage, some assumption that different tokens would be used
        • It will be down to the VOs for how they decide to fill the tokens
        • Steep learning curve when coming in to OAuth
          • Need very clear recommendations with few choices
        • Pilot case largely solved for ARC, how?
          • Pilot infrastructure takes care of token success, control etc. Fewer questions.
          • For pilot jobs, the pilots do the file movements
        • VO frameworks important, how can we get them involved?
          • Panda (need to find out what’s been going on)
          • Dirac (hopefully covered by Andrii)
          • GlideinWMS (CMS, covered by Brian)
        • HTCondor plans to shut down X.509 support by autumn next year
        • Timeline https://docs.google.com/document/d/11fcZU8fEsfjDiSkjh95nVr4tNXLPCA_xwr2SwriBpiw/edit#heading=h.s0j7quda1urv
        • Grid community plans to keep Globus alive during run 3


        Questions:

        1. How can a token be acquired?
          1. For programmatic access use standard OAuth protocols to get tokens from token issuer, e.g. client credentials flow
            1. Useful for e.g. pilot submission where the token is not issued to an end user
            2. (Let’s drop pilot discussion since that’s clear, let’s focus on the user token flows)
          2. Also some tools for users to get user tokens
            1. Can we assume that the token can be issued and refreshed transparently?
            2. Do we need to know what’s in the tokens?
            3. Would expect to get a single token for a job
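
        For the programmatic case in question 1, a minimal sketch of an OAuth 2.0 client credentials token request (RFC 6749, section 4.4) is given here. The token endpoint URL, client identifier, secret and scope value are placeholders chosen for illustration and are not taken from the meeting. The function only builds the HTTP request so that it can be inspected. Sending it and parsing the JSON response are left out, and the sketch assumes the client ID and secret contain no characters that need extra encoding.

```python
import base64
import urllib.parse
import urllib.request

# Hypothetical endpoint for illustration only; a real deployment publishes its own.
TOKEN_ENDPOINT = "https://iam.example.org/token"


def build_client_credentials_request(token_endpoint, client_id, client_secret, scope):
    """Build (but do not send) an OAuth 2.0 client credentials token request."""
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": scope}
    ).encode("ascii")
    # The client authenticates itself with HTTP Basic authentication.
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        token_endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


if __name__ == "__main__":
    # Placeholder client and scope values, not real credentials.
    req = build_client_credentials_request(
        TOKEN_ENDPOINT, "pilot-factory", "not-a-real-secret", "compute.create"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("ascii"))
```

        No end user is involved in this flow, which is why it suits cases such as pilot submission. The user token flows raised in question 2 need a different grant type and are not covered by this sketch.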


        Actions

        • WLCG AuthZ WG to come up with recommendations and guidelines
          • Currently looking at this for FTS and Rucio (experiment developers needed)
          • Will try and extract more generic recommendations afterwards
      • 12
        Open/further discussion & Wrap Up