BiLD-Dev
Bi-Weekly "Loyal" DIRAC developers meeting, followed by the LHCbDIRAC developers meeting.
Zoom: BiLD
https://cern.zoom.us/j/62504856418?pwd=TU1kb01SOFFpSDBJeWVBdU9qemVXQT09
Meeting ID: 62504856418
Passcode: 12345678
BiLD – 07/03/2024
At CERN: Christopher, Christophe
On Zoom: Federico, Alexandre, Janusz, Daniela, Simon, Andrei, Hideki, Vladimir, Alexey
Apologies: André
Follow-up from previous meetings
- Last BiLD on Feb 15th
- Last DIRAC certification hackathon 1 week ago:
- still had issues
- Last BiLDx 5 weeks ago
- the one scheduled 2 weeks ago was cancelled
DIRAC communities roundtable
LHCb:
Federico+Alexandre+Christophe+Christopher
- Running on latest DIRAC v8 patch
- EL9: started moving hosts to Alma9
Belle2
Cedric, Hideki
- 2 issues:
- bug in Belle2 for setting up DIRAC from CVMFS
- 1024 bit proxies.
- Now pilot works OK
- How did LHCb handle changing proxies?
- we always uploaded proxies (for a while) with a hotfix
ILC/Calice/FCC
André
- NTR
EGI
Andrei
- Users submitting long jobs, and found out that the proxy renewal mechanism was not working – fixed in latest v8.0.39 patch
Juno/BES3
Xiaomei
- In WebApp, should specify extension version when upgrading.
GridPP:
Simon, Janusz, Daniela
- No news.
Topics from GitHub/Discussions
only un-answered topics with discussion updates:
DIRAC releases
- v8r0
- v8.0.39
- “just” fixes, most notably https://github.com/DIRACGrid/DIRAC/pull/7479
- v9r0
- NTR
DIRAC projects
DIRAC:
Issues by milestone:
- v8.0:
- 10+ open issues, usual reminder for closing/moving old ones
- dirac-dms-find-lfns misbehaves if given an invalid path
- v9.0:
- NTR
- [8.0] Implementation of metadata methods into RucioFileCatalogClient closes issue #7382
PRs discussed:
- [8.0] Elastic: index creation only at indexing time
- To avoid indices with 0 documents
- [9.0] remove notification db
- Federico started simplification, ended up removing the whole system
- question: has anyone ever used this DB? – seems not
- Will be merged now if no objections
- [9.0] feat: split JobWrapper.execute() into 3 submethods
- Only refactoring. First of several PRs
- Federico would like to use this for integration testing workflows
- https://github.com/DIRACGrid/DIRAC/pull/7492#discussion_r1514680164
- Christophe: I can’t remember all the details, but we should not contact the CS too much, especially not at handshaking time
WebApp:
- NTR
Pilot:
- Merge of devel to master: we are actually ready. Federico will do it on Monday in LHCb, and on Tuesday for everyone else, if no last-minute issues are found
- from previous meeting: Janusz has some doc to write
- Federico: we do a lot of mangling to support older Python versions. If CVMFS is available (this seems to be a requirement anyway) we can use the Python (3.11) coming from CVMFS, and remove lots of legacy code.
- Chris²+Andrei: cannot do that
- We cannot even declare py2 “dead” when CentOS7 reaches EOL; let’s talk again in ~1 year’s time
DIRACOS:
- from previous meeting: made a new release with libxml2 downgraded. Issue opened to gfal2 for a proper fix
- FTS/gfal developers are going to look at this now
- We might not be able to update globus again (OpenSSL version)
- SRM still depends on it
Documentation:
- NTR
OAuth2:
- NTR
management
- Always upload releases to CVMFS
- Andrei: made a few updates for the ARM release. In contact with the administrator; the ball is in his court.
- from previous meeting
- Also the upload of pilot files to CVMFS could be done same way
- Right now the cron-job runs hourly
- Few issues discovered (the releases were not properly propagated – CVMFS issue at RAL)
- from Stratum0 to Stratum1
- fixed by hand from time to time
- https://cvmfs-monitor-frontend.web.cern.ch/ might be needed
diraccfg
- NTR
DB12
Alexandre
- from previous meeting
- Ewoud opened PR for py3.11 (in progress)
- missing the Intel vs AMD
- Federico: can we have DB12 run on ARM?
- André: I have access to two “real” ARM machines through OpenStack that can be used for the tests
- Federico: passed to Ewoud
- hepscore and ARM: https://ggus.eu/index.php?mode=ticket_info&ticket_id=164939 (Imperial’s ARM is still very much in the “setting it up” stage)
Rucio
- NTR
Tests
- Federico: repo for running “nightly” system tests targeting the DIRAC cert setup: still did not find the time to do it
Release planning, tests and certification
- Trello: up to 10 collaborators:
- Christopher proposes to use GitHub Projects:
- https://github.com/orgs/DIRACGrid/projects/9/
- GitHub has a concept of templates so we could use that. Or we could use the script I used to import it to generate a new board for each hackathon from a YAML spec: https://gist.github.com/chrisburr/0c4f48421e02e01286696453710c3028
- Looks appropriate
- Certification machines
- lbcertifdirac70 machine:
- we should move to EL9 box in Q1 2024 (CC7 EOL June 2024)
- As said, this should be done outside of CERN
- Volunteers?
- 2 options:
- completely new setup (new DBs). Federico: I would prefer this one
- keep the current CERN DBs (MySQL, OpenSearch), + Grafana.
- Next hackathon(s)
- in 2 weeks (using GitHub “projects”)
AOB
Next BiLDx: March 14th
Next hackathon: March 21st
Next BiLD: after Easter (April 4th?)
- Next DIRAC Users’ workshop in Lyon https://indico.cern.ch/e/duw10 June 19th-21st
- most probably, no fee requested
- registrations can open now

- Next DIRAC+X hackathon: https://indico.cern.ch/event/1376672/
- 9-10 April
- as usual, do register to participate
LHCbDIRAC
- v11.0: deploy board in https://trello.com/b/Ep0PAkbv/deploy-110
- Let’s quickly review what remains here…
- https://gitlab.cern.ch/lhcb-dirac/LHCbDIRAC/-/issues
- Moving Job finalization step from the workflow to the JobWrapper: …
- Federico: the workflow modules are delicate and unit tests are not enough. There are integration tests (“real jobs” run in “real” pilots), which used to be more up to date. They should be restored, maybe in a different manner.
- I propose that our production workflow be streamlined to consist solely of a sequence of GaudiApplication modules.
- Run only GaudiApplication module in the step
- CWL: what are the boundaries between DIRAC and DiracX?
- Bookkeeping:
- Extending the bookkeeping to support SMOG2
- Should be done, but of course not exercised in production
- Regarding the bullet points in https://gitlab.cern.ch/lhcb-dirac/LHCbDIRAC/-/issues/23#note_7633197 : we can test this with an Analysis Production
- VOMS backend will be shut down in June 24
- from previous meeting: we need to make sure that all links will be updated
- Christophe will do this adiabatically over the coming weeks
- OpenSearch migration?
- no date yet
- Alma9
- Puppet ready
- lbvobox300 migrated, 1 new will be done today
- 2 machines (lbvobox900 and lbvobox901) created by Joel
- what to do with these ones?
- https://github.com/DIRACGrid/DIRAC/issues/7368