Dops (unusual date)
The monthly Dops meeting (Dirac(X) operations). This time without a Ddev following it.
Dops – 20/05/2026
At CERN: Federico, Christophe, Alexandre, Ryunosuke, Cedric, Yan
On Zoom: Andrei, Igor, Luisa, Loris, Todor, Mazen, Volodimir, Simon, Daniela, Heloise
Apologies:
this meeting is being recorded
Previous meetings + follow-ups
- Dops about 5 weeks ago. Follow-ups
- Jobs’ match-making (matching) mechanism for DiracX: issues and plans
- “epic”: https://github.com/DIRACGrid/diracx/issues/843
- match-making logic was in Python; it is now being done with Redis and Lua, and stress-tested with Locust
- CWL is coming to DiracX, and the new “hints”: https://codimd.web.cern.ch/SllN13jAQNSG25MjHB8Swg?both
- Ryun ran a few jobs in Dirac(X) certification via CWL
- “Large” PR is ready to be merged.
Communities issues and requests : roundtable
LHCb:
Federico+Christopher+Christophe+Alexandre+Ryun
- Moved to version of DiracX that includes tasks. This is already used for cleaning up the sandboxes.
- Our pods now (the authdb cleanup is done through a helm cron-job):

Belle2
Cedric
- Migration to v9 put on hold, as certification of new BelleDIRAC release is in progress.
- hopefully in production in June; after that we will restart the migration to v9
Juno+BES3:
Xiaomei (report by Andrei)
- Large-ish re-processing ongoing, production system rather heavily loaded.
- CS, FC, ProxyManager overloaded. Some new vo-boxes installed to swallow the load
- Rescheduled jobs instead of re-submitting tasks, “usual” problems
- Failover not fully set up; this should be done
- fromPreviousMeeting monitorFiles not working correctly for production transformations. Maybe buggy.
- Federico+Chris: in LHCb a different mechanism is used
- Luisa: in CTAO also a different solution
- 20th May Andrei found a bug, hot-fixed, will commit if tested OK
EGI+IN2P3
Andrei, Mazen
- EGI services moved from one set to a new one (Alma9), plus various bumps (MariaDB, Elastic, etc)
- encountered the MySQL SPAM issue which was fixed in v9 (should be backported)
JINR
Igor
- BM@N collected 1.9 PB of experimental data. DIRAC sustained this successfully.
- DIRAC updated to 8.0.78
- Moved from VOMS to IAM-VOMS.
- P.S. Sorry for missing meetings. I am busy on Thursdays.
CTAO
Luisa, Nattan, Loris, Stella
- v9+X running on certification instance, also tested the submission of CWL jobs
- fromPreviousMeeting Use case of possibly 10k short transformations (out of 1 production request)
CLIC
André
- NTR
GridPP:
Daniela, Simon
- Pre-prod:
- Mainly used for testing security fixes found by TheClaw^{TM}🦞:
- Simon notes that most (all) of these could possibly have been found by automated non-AI tests as well. Maybe something for the developer meeting?
- We run all patches on pre-prod (v8/9.0.20) and production (v8.0.78), but we run e.g. no transformation/production system
- They are also installed (on integration) on the certification server, if someone at CERN wants to have a look.
- Documentation:
- DiracX in a container documentation finally merged: https://github.com/DIRACGrid/diracx/pull/851 If you have any questions, please ask Simon and me.
Releases announcements and reviews
DiracOS
- OK to remove the CentOS7 support (follow up from https://github.com/DIRACGrid/DIRACOS2/pull/181#issuecomment-4312989818): ++new issue/PR
- 2.61 is the last version
- nothing new since previous DOps
- fromPreviousMeeting 2 issues opened by Daniela
- dependencies https://github.com/DIRACGrid/DIRACOS2/issues/173
- include latest htcondor: https://github.com/DIRACGrid/DIRACOS2/issues/169 (not urgent, however I was hoping for the latest long term support release --Daniela)
DIRAC
- v9.1.9 (+ v9.1.8, v9.1.7)
- No new features, only fixes
DiracX
- v0.1.0
- Tasks fully operational
New documentation:
- deploy in containers merged: https://github.com/DIRACGrid/diracx/pull/851
Dirac-CWL
- Ryun ran a CWL job!
DiracX-web
- first non-alpha release ?
- not there yet
Pilot
- NTR
Agentic and AI developments
Feature requests, and developers’ issues: inputs and prioritizations from communities
Request for a long term support release:
The case for:
- DIRAC has abandoned its previous model (point releases with bug-fix backports) in favour of a rolling-release strategy that shifts maintenance effort onto administrators.
- This change forces administrators, in order to fix a bug they have discovered in their installation (or to apply a security fix), to upgrade to releases introducing new features and therefore potentially new bugs. Automated testing does not find all issues (see e.g. the GridPP report from the previous meeting).
- Virtually all similarly complex projects rely on stable long term releases for reliability.
- Federico: There is a request for a “golden release”, at least for moving to DIRAC v9.
Background and concerns about the current rolling-release model (in use since 2026), expressed in the diracproject-admins mailing list:
- the change was announced in advance, but the consequences were not fully foreseen until now
- not everyone feels like updating versions on an “almost weekly” basis
- new versions fix previous bugs but might introduce new ones. At least for updating to v9, a fixed target would be a better fit
Federico offers to cherry-pick selected commits in v9.0.X to create a fixed target for the v9 update, but would not go back to point releases.
Andrei: LTS releases?
Christophe: DIRAC or DiracX?
All: probably both, at least eventually
We should also sort out the problem with DIRAC<->DiracX versioning, i.e. https://github.com/DIRACGrid/diracx/issues/767
Alexandre: I doubt LTS would help here because:
- the bugs mentioned were introduced while fixing other bugs (patch release): they would have broken LTS versions the same way
- we don’t respect the initial contract which was:
- bug fixes should be transparent
- new features should be transparent
For me the underlying issue is a lack of investment in certification: new features or potentially weird fixes should be certified, and we should potentially have different certification environments. As Daniela said, to err is human, so when we find a problem we didn’t spot in certification, we should invest effort in hardening the certification process.
Federico: We try to respect the contract as much as possible. I think for the moment we can aim at the “golden v9” release. Once others have moved we should evaluate whether LTS releases are really needed.
Andrei: Let’s try, but we might end up with the same issue.
Discussion will continue in the ML, or if needed in the next DOps.
Monitoring and accounting system for DiracX
- Federico sent out a proposal for a system based on a data lake (Duck Lake). No comments received. The usual breakdown in tasks will follow ASAP.
- Christophe: I am not convinced everything is solvable
- Federico: Will restart looking into this, but some tests are needed.
Transformation/Production “system” in DiracX
Federico: I will send out a Google form to collect requirements this afternoon. This will be a rather large one; collection might take longer than the usual “DOps cycle”.
Prioritized backlog: communities input
https://github.com/orgs/DIRACGrid/projects/30/views/3 contains the prioritized backlog.
- objections?
- something from https://github.com/orgs/DIRACGrid/projects/30/views/7 ?
AOB
- CMS and DiracX:
- Federico traveled to Fermilab last week. Several discussions; generally it looked like a good reception
- 2 new “admins” added to the diracproject-admins ML: Alan Malta and Andrea Piccinelli
- 6 new “Users/Devs” added to the diracproject-users ML
- Federico also met with FCC representatives just yesterday
- 2 new “admins” added to the diracproject-admins ML: Juraj Smiesko and David Lange
- 3 new “Users/Devs” added to the diracproject-users ML
- Certification machines
- The diracx-cert setup on OpenShift needs to be updated: tasks!
- DIRAC is now officially an “HSF affiliated project” : https://hepsoftwarefoundation.org/projects/projects.html
- CHEP is next week!
- rehearsal for the 2 parallel talks (from Alexandre and Ryunosuke) happened yesterday: https://indico.cern.ch/event/1688416/
- Chris Burr will give a LHCb+DIRAC plenary presentation on Analysis Productions
- Other participants:
- Dhiraj, Ruslan from Belle2
Next appointments
- Next meetings:
- DDev is tomorrow (usual time)
- DDev will also be next week, even if it’s CHEP (Federico will host)
- Next DOps(+DDev) on June 18th.
- WS/hackathons/conferences:
- DiracX hackathon: 1st and 2nd of July
- registrations are open, and already several registered. Large representation from CMS.
- effectively a full room. Capacity extended to 17 (15 registered now)
- will also organize a social dinner this time, on the 1st of July. Or maybe a picnic near the lake.
- 12th DUW: 13th-16th October
- registrations open! Free of charge, thanks to a local sponsor