Previous Actions:
Proposed agenda:
Zoom meeting:
Link below, in the videoconference section. Please ensure you are signed in to Indico to see the meeting password!
Next Meeting:
Present: Angela, Berk, Dave D, Dimitrios, Hannah, John, Julie, Linda, Maarten (notes), Mischa, Petr, Stephan
Apologies: CNAF team, Tom
Notes:
Berk reports that since Wednesday afternoon, a number of small VOs have had their IAM instances running in HA mode on K8s: there are 3 different clusters, and when one of them goes down, the others keep providing the services. The next VO to try out this deployment could be ALICE, which has the least IAM activity of the big experiments, followed by the others at some point. Stephan points out that the K8s instances are not yet being used in production and would therefore not see much activity anyway. Maarten concludes that we would be switching from OpenShift straight to the HA setups in K8s, instead of to the simple setups we have there now, but that we need to move to HA mode at some point anyway and might as well do that in one go. Furthermore, the OpenShift instance for an experiment will remain available for a while, should there be some unexpected problem with its K8s HA setup. We do not expect any, because the setup plans have been studied for a few months and they imitate to some extent what has been done for various other services for a few years already. (Added after the meeting: the HA deployment for IAM is tricky due to the application requirements.)
Stephan asks about the status of the campaign for sites to support the K8s instances. Maarten answers that about 60 tickets are still open. After some discussion it is agreed that the remaining sites should be reminded in early September with a new deadline 2 weeks later, after which the SAM tests could be switched per experiment to its K8s instance. Hopefully production workflows can start depending on the K8s instances around early October. Stephan fears we may need more time because of various issues affecting the ARC CE tests in particular, which may only get fixed after the ETF containers have been moved to AL9. (Added after the meeting: those matters are independent of the token issuer.)
Maarten adds that when an experiment has fully switched to IAM on K8s, its OpenShift instance will be decoupled, viz. given its own DB, to allow it to be used in data management stress tests that we foresee doing this autumn, to check the sustainability of new arrangements between Rucio, DIRAC, FTS and IAM.
Next, Berk reports on testing the 1.10.0 and 1.10.1 pre-releases of IAM, which were originally planned for July but had to be postponed a bit. The former was found to have a bug that has been fixed in the latter. The new version brings various improvements, in particular e-mail notifications concerning AUP reminders, user suspensions and restores, as well as the automatic subscription to default groups. The plan is to upgrade the IAM instances to the official 1.10.x release in the last week of August, when Berk is back from a holiday break.
Maarten adds that we are steadily running out of high-priority issues concerning VO management (which is a good thing) and that we will need to define a new set of high-priority issues, in particular in view of the data management stress tests this autumn. For example, IAM needs to stop storing access tokens in the DB. Stephan proposes creating a new dashboard that lists just those issues. Maarten points out that the IAM devs also want to start focusing on IAM v2, which is based on more modern, well-supported underlying frameworks, instead of patching too many things still in v1. Petr would like the devs to propose a timeline for v2.
Next, Petr asks whether CMS are already using tokens for ARC everywhere. Stephan replies that they are used for about half of the ARC CEs and that further progress depends on various ETF issues concerning ARC CE jobs getting resolved in the near future. Petr suggests there might still be work needed in porting from Python 2 to Python 3 as well. Stephan answers that for CMS that work has been done and that the remaining issues are with the container used by the ETF, for example the HTCondor version used therein.
Finally, Petr asks whether there is news about the EGI document on token lifetimes. Mischa provides the link to that document and answers that it has been discussed to some extent in AARC meetings. Maarten adds that the link was already posted in the Token Trust and Traceability WG, which is better suited for discussing best practices and policies. See the agenda of its meeting on July 23.