BiLD-Dev
Bi-Weekly "Loyal" DIRAC developers meeting, followed by the LHCbDIRAC developers meeting.
Zoom: BiLD
https://cern.zoom.us/j/62504856418?pwd=TU1kb01SOFFpSDBJeWVBdU9qemVXQT09
Meeting ID: 62504856418
Passcode: 12345678
BiLD (Bi-weekly DIRAC Development meeting) – 27/07/2023
At CERN:
On Zoom: Federico, Andrei, Bertrand, Cedric, Daniela, Ewoud, Michel, Simon F, Ueda, Hideki, Janusz, Lorenzo, Simon M, Christophe, Christopher, Dhiraj
Apologies:
Follow-up from previous meetings
- Last “standard” BiLD 5 weeks ago
- Trying to catch all updates below
- Last DIRAC certification hackathon on July 6th
- Did not get much done because Trello was unavailable for most participants
- DiracX hackathon on July 4th and 5th: https://indico.cern.ch/event/1292289/
- 9 attendees in total
- Mostly worked in pairs, with Christophe running around (while Chris B was in hospital waiting for Olivia! – who was born July 4th, and all are well!)
- Daniela & Simon send their congratulations :-)
- Progressed on several points, but for most attendees it was mainly a first hands-on experience with DiracX development.
- Documentation partly updated
- BildX meeting on July 29th
- did not go through all the bullets in the uploaded slides
- recording added to the agenda
DIRAC communities roundtable
LHCb:
Federico+Simon+Christophe+Christopher
- Running on latest v8.0 and latest DIRACOS2 versions
- Using tokens for submitting to all HTCondor CEs
- IAM for LHCb set up, but effectively only for the above purpose
- Using AREX CE instead of ARC/6
EGI
Andrei+Bertrand
- Running on 8.0.24
Juno
Andrei
- Updated to 8.0.24 – rather smooth
- Might duplicate the installation: one for Juno and a second for other VOs as well
- Own IAM instance – pilot submission with tokens set up.
CLIC/ILC/Calice
André
- Migrated to v7r3 (py3)
- Setting up v8 testing (Lorenzo)
Belle2
Hideki, Michel, Ueda
- v7r3 py2
- DIRACOS2 migrated to py3.11, showed incompatibility with one of the external services.
- openssl handshake failure – will open an issue, maybe there’s a workaround
GridPP:
Daniela, Simon, Janusz
- Running 8.0.23 in production due to ARC meltdown in 8.0.24 (addressed here: #7095 and here: #7104)
- Advising users to install diracos 2.33 and 8.0.21 to avoid the confusing cfg file not found issue: #7088 and mangled prompt: https://github.com/DIRACGrid/DIRACOS2/issues/109 - while technically harmless, we prefer our users to report real errors.
- Federico: DIRACOS2 issue was solved in version 2.35
- Added TransformationSystem to production server. Simon not happy with security model, being addressed here: https://github.com/DIRACGrid/DIRAC/pull/7113 (which now has also spawned Federico’s #7124 to address userDN -> username transition)
- Trying to configure one of our preprod instances (https://diracdev.grid.hep.ph.ic.ac.uk:8443/DIRAC/ visible only if you are a member of the gridpp VO) to use tokens. It’s not going well: #7123 (also: #7126)
- Andrei: Maybe a temporary solution is retrying with certificates
- We managed to get one job for one VO through using a token for the pilot late yesterday afternoon. However, the token tag seems to be CE specific. For us there is no guarantee that all VOs on a CE will be token ready at the same time. Is there a plan to handle this?
It would really help if we had a working example somewhere (certification server? – then at least all the admins could see it), including a description of a working test. Any help is appreciated.
- Related: RAL just had an IAM hackathon which Simon and I attended, and we were hoping to have this up and running by then, but this unfortunately failed. The IAM developers are now however aware of our multi-VO everywhere dilemmas, so there might be an opening to get some multi-VO changes through.
Topics from GitHub/Discussions
only un-answered topics with discussion updates:
- NTR
DIRAC releases
- v7r3
- Only few draft PRs still in, almost certainly will be moved to v8.0.
- Need a new release because of https://github.com/DIRACGrid/DIRAC/pull/7118
- v8r0
- v8r1
- NTR
DIRAC projects
DIRAC:
Issues by milestone:
v8.0:
- 15+ open issues
- File with no replicas causes infinite loop in replicate and register process
- Uncaught exception in Tornado
- Needs more info
- Spurious error message
- Seems to need a higher priority
- OptimizationMind exception
- Introducing ‘Scouting’ Status in WMS state transitions
- Done the v7r3 part (introduce the status)
v8.1:
- 15+ open issues
- Federico: closed some of them, as they are not going to be implemented anymore (DiracX taking over)
PRs discussed:
- For Janusz & Federico: Can we have a quick check on how far off the remote pilot logging is from release? #6208
- Federico: merged yesterday, ready to be tested in next week’s DIRAC certification hackathon
WebApp:
- Few PRs
Pilot:
- [devel] fully removed dirac-install and python2 DIRAC client installations
- OK to merge? (devel branch)
- Setting up preinstalled DIRAC in a pilot
- Andrei: PR updated
- Federico: should be tested in Jenkins (patch needed), and checked that it works for LHCb’s case, before being merged
DIRACOS:
- A few fixes/workarounds; version 2.35 was created 2 days ago
Documentation:
OAuth2:
- from previous meeting
- Andrei: request from EGI to demonstrate that one VO can run with tokens only
- Check In is progressing: compute scopes are available, and they are accepting the idea of using client access tokens (possibility to associate a client to a given VO). They would probably not accept the same client dealing with both client and user access tokens (security concerns with the scopes available in the clients).
- WLCG timeline document: https://zenodo.org/record/7014668#.YyLag9JBwQ9
- Still pending the test for ARC7 and CheckIN tokens
- Multiple clients per IdP might be effectively needed
tornado/HTTPs
- from previous meeting: issue https://github.com/DIRACGrid/DIRAC/issues/6495 CLOSED as won’t proceed further
management
- from previous meeting: 3 issues left, still valid?
diraccfg
- version 1.0? still tbd
COMDIRAC
Daniela
- Abandoned until we fix tokens and TransformationSystem.
DB12
Alexandre
- Ewoud is benchmarking LHCb sites with DB12 and HEPScore in order to compare the two.
- Andrei: what would be the future of DB12? Keep adding factors?
- not clear
- Christopher: Python itself is getting faster, so that’s going to be a factor to consider. Depending on that is not ideal.
Rucio
- NTR
Tests
- NTR
DiracX
- Called a 2nd DiracX hackathon: https://indico.cern.ch/event/1304626/ for 4-5 September. Again at CERN, please do register if you plan to attend.
- Also called a BildX meeting for August 31st (the week before the hackathon)
Release planning, tests and certification
Certification machines
- lbcertifdirac70 machine:
- Currently usable
- from previous meeting, Federico: no rush, but should we move to an Alma9 box?
- Outside of CERN would be better, in CC probably
- Andrei: machine is already there, need to decide how to set this one up
- We could also use the new box to test the installation procedure
Next hackathon(s)
- in 1 week, “standard” v8.1.0aX one.
AOB
Next hackathon: On August 3rd
Next BiLD: On August 10th
Half-published the timetable of the DIRAC and Rucio WS: https://indico.cern.ch/event/1252369/timetable/#20231016.detailed
LHCbDIRAC
- v11.0: deploy board in https://trello.com/b/Ep0PAkbv/deploy-110
- Singularity CE everywhere
- LHCbDIRAC hackathon?