Petr, Andrei, Alexandre, Andrew, Angela, Berk, Dave, David, Dimitrios, Douglas, Federica, Linda, Mischa, Roberta, Sven, Hannah, Maarten
- Petr: IAM Issues
- Hannah: access token length & revocation
- The recommended access token lifetime is 20 minutes (access tokens are not revocable, whereas refresh tokens are). This is so that blocked users are kicked out within a reasonable time.
- CMS and ATLAS have configured 4 day tokens
- Concern that if IAM goes down then everything stops so preferable to have long tokens
- If IAM does go down we need a way to publish the signing keys anyway
- From the user perspective, an X.509 proxy effectively lasts a working day (or 24 hours). These proxies cannot be revoked (the user certificate could be revoked, but this would not stop the job). In reality this hasn't been tried many times.
- Fermilab and LIGO have decided to go for 3 hours. The logic here is that it's expected that a critical service could be alerted by monitoring and be fixed within 3 hours.
- Dave: it's important that tokens sent to jobs be as short-lived as possible, although tokens on submit machines could conceivably be allowed to live slightly longer, for example one day. However, we have been working on the assumption that tokens will be short-lived, so a mechanism for automatic renewal is required. Currently Fermilab has two tools that always get a new access token by running htgettoken on every client command: condor job submission and a locally written file transfer tool. This creates a dependency on htgettoken that can't reasonably be extended to all tools that use access tokens. Two other general options have been proposed but not yet decided upon: (1) Have a background process that automatically refreshes the token until the process that started it exits (and that also handles duplicate refresh requests reasonably). (2) Have WLCG create a bearer token renewal standard for all tools, based on WLCG bearer token discovery. For example, require tools that consume access tokens to execute "$BEARER_TOKEN_RENEWER $BEARER_TOKEN_FILE.renewdata" (where $BEARER_TOKEN_FILE is shorthand for wherever bearer token discovery finds the token) if the .renewdata file exists. For htgettoken, BEARER_TOKEN_RENEWER could then be set to "htgettoken --optfile"; htgettoken would read the file for options telling it to renew the access token, and would also generate the file when the token is first created.
- Sven: we recently had an incident with CMS (compromised client secrets for 2 clients). There were 4 days when potentially malicious access tokens were still valid.
- Andrew: we may consider a token revocation list that is cheap to query. Alternatively, could a token introspection endpoint be run locally? (With standard OAuth token introspection, you go back to the issuer each time you receive a token.) Is there a possibility to add something into our middleware? Must bear in mind that being able to stick to standard OAuth is a big advantage.
- Petr: If IAM was sufficiently stable then would consider going down to 1 day tokens as a first step. Would still need overlapping tokens to avoid impact of IAM downtime. Agree that the aim is to move towards shorter times.
- Andrei: in automated environments like pilot factories, tokens are created on the fly and cached locally; they are reused for sending pilots with no need for human intervention. In an interactive environment we should really try to make things as user-friendly as possible, e.g. htgettoken with renewal in the background, or oidc-agent. With refresh tokens the user is not involved, so renewal is transparent.
- Mischa: maybe we want to revisit doing everything offline (big players seem to use introspection). At the time we thought it would decrease load and help with scaling but maybe this is not true. Maybe we can provide 2 sets of guidelines, short tokens offline or long tokens with introspection. Feeling that working towards e.g. 6 hour upper limit would be good.
- 4-day tokens are a temporary measure. All agreed that the aim is to decrease token lifetimes.
- Need automatic renewal for flows with end users.
- Some feeling that we should revisit whether there is a case for introspection.
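
For illustration, option (2) from Dave's proposal above could look roughly like the following on the tool side. This is only a sketch: the renewal hook is a proposal from this discussion, not an agreed standard. The function names `discover_token_file` and `renew_if_configured` are hypothetical, the discovery logic is a simplified reading of the file-based steps of WLCG bearer token discovery, and "htgettoken --optfile" is simply the example value quoted in the discussion.

```python
import os
import shlex
import subprocess
from pathlib import Path
from typing import Optional


def discover_token_file(env=None, uid=None) -> Optional[Path]:
    """Locate the token file (simplified file-based discovery steps).

    A token passed directly in the BEARER_TOKEN variable has no file,
    so the proposed .renewdata hook would not apply to it.
    """
    env = os.environ if env is None else env
    uid = os.geteuid() if uid is None else uid
    explicit = env.get("BEARER_TOKEN_FILE")
    if explicit:
        path = Path(explicit)
    elif env.get("XDG_RUNTIME_DIR"):
        path = Path(env["XDG_RUNTIME_DIR"]) / f"bt_u{uid}"
    else:
        path = Path("/tmp") / f"bt_u{uid}"
    return path if path.is_file() else None


def renew_if_configured(token_file: Path, env=None, run=subprocess.run) -> bool:
    """Run "$BEARER_TOKEN_RENEWER <token file>.renewdata" if both exist.

    Returns True if the renewer was invoked, False otherwise.
    """
    env = os.environ if env is None else env
    renewer = env.get("BEARER_TOKEN_RENEWER")
    renewdata = Path(str(token_file) + ".renewdata")
    if not renewer or not renewdata.is_file():
        return False  # nothing configured: use the token as it is
    # The renewer value may contain arguments (assumption taken from the
    # example in the minutes), so split it before appending the file name.
    cmd = shlex.split(renewer) + [str(renewdata)]
    run(cmd, check=True)
    return True
```

A tool would call `renew_if_configured(discover_token_file())` (after checking that a file was found) before reading the token, so that any renewer, not only htgettoken, can be plugged in through the environment.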
- Alexandre: https://github.com/WLCG-AuthZ-WG/common-jwt-profile/issues/24
- Issue that client credentials tokens can't include groups so don't work at any site that expects group based AuthZ
- Suggestion from Petr that relevant clients be made to understand scope based capabilities (this is the only way to submit to HTCondor)
- Did robot certs (extensions) have groups? Yes
- Do we want groups in client access tokens? (Currently not supported by IAM). May be able to fake it by adding a default client scope that looks like a group.
- We didn't think about this when writing the text, although it refers to "end user". This was not forbidden on purpose :)
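
To make the groups-versus-capabilities point concrete, here is a minimal sketch of an authorization check that accepts either a group or a scope-based capability. The `wlcg.groups` claim and scopes such as `compute.create` or `storage.read:/path` follow the WLCG common JWT profile; the function names and the simplified path matching are assumptions for illustration and do not describe how any particular middleware behaves.

```python
from typing import Any, Mapping


def has_group(claims: Mapping[str, Any], group: str) -> bool:
    """True if the token carries the given group in wlcg.groups."""
    return group in claims.get("wlcg.groups", [])


def has_capability(claims: Mapping[str, Any], capability: str, path: str = "/") -> bool:
    """True if a scope such as 'storage.read:/cms' covers the requested path.

    Simplified matching: a scope path authorizes itself and everything
    below it; a scope without a path is treated as covering '/'.
    """
    for scope in claims.get("scope", "").split():
        name, _, scope_path = scope.partition(":")
        if name != capability:
            continue
        scope_path = scope_path or "/"
        prefix = scope_path.rstrip("/") + "/"
        if (path.rstrip("/") + "/").startswith(prefix):
            return True
    return False


def authorize(claims: Mapping[str, Any], group: str, capability: str, path: str = "/") -> bool:
    """Accept either group-based or capability-based authorization."""
    return has_group(claims, group) or has_capability(claims, capability, path)


# A client credentials token has no wlcg.groups claim, so a site that only
# checks groups rejects it, while a capability-aware check can accept it.
client_token = {"sub": "client-id", "scope": "compute.create storage.read:/cms"}
user_token = {"sub": "user-id", "wlcg.groups": ["/cms"], "scope": "openid"}
```

With these example tokens, `has_group(client_token, "/cms")` is false, whereas `authorize(client_token, "/cms", "compute.create")` succeeds through the scope.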
- Federica: IAM 1.8.1 is released
- Refresh token default lifetime max is changed to 30 days, relevant for oidc-agent. Users would need to run a command to get a new refresh token (for new clients with this default). Existing clients are not affected.
- Can start deploying on WLCG instance at INFN.
- Once verified we can propose release for WLCG IAMs at CERN. Propose in operations meeting on Mondays and go ahead if no objections. Can also send explicit mails (particularly AMBER).
- Maarten: first ALICE jobs with tokens submitted this afternoon!
- Ask CILogon how they do token verification - offline or introspection (Jim or Jeff)
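
As background for the offline-versus-introspection question, the sketch below shows what an offline check combined with a locally cached revocation list (as raised by Andrew) might look like. It is deliberately incomplete: signature verification against the issuer's published keys is omitted and would be mandatory in any real service. The function names and the idea of keying the revocation list on the `jti` claim are assumptions for illustration, not an agreed design.

```python
import base64
import json
import time
from typing import Optional, Set


def decode_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def offline_check(token: str, revoked_jtis: Set[str], now: Optional[float] = None) -> bool:
    """Reject expired tokens and tokens whose jti is on the local list.

    Assumption: the revocation list is a set of jti values refreshed
    out of band, so the check needs no call back to the issuer.
    """
    claims = decode_payload(token)
    now = time.time() if now is None else now
    if claims.get("exp", 0) <= now:
        return False  # expired (or no exp claim at all)
    if claims.get("jti") in revoked_jtis:
        return False  # explicitly revoked
    return True
```

Without such a list, the only bound on the exposure window after a compromise is the token lifetime itself, which is why the 4-day tokens discussed above were a concern.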