Dops (unusual date)

Europe/Zurich
2/R-014 (CERN)

Description

The monthly Dops meeting (Dirac(X) operations). This time without a Ddev following it.

Zoom Meeting ID
62504856418
Host
Federico Stagni

Dops – 20/05/2026

At CERN: Federico, Christophe, Alexandre, Ryunosuke, Cedric, Yan
On Zoom: Andrei, Igor, Luisa, Loris, Todor, Mazen, Volodimir, Simon, Daniela, Heloise
Apologies:


this meeting is being recorded

Previous meetings + follow-ups


Communities issues and requests : roundtable

LHCb:

 Federico+Christopher+Christophe+Alexandre+Ryun

  • Moved to version of DiracX that includes tasks. This is already used for cleaning up the sandboxes.
  • Our pods now (the authdb cleanup is done through a helm cron-job): 

Belle2

 Cedric

  • Migration to v9 put on hold, as certification of new BelleDIRAC release is in progress.
    • hopefully in production in June; after that we will restart the migration to v9

Juno+BES3:

 Xiaomei (report by Andrei)

  • Large-ish reprocessing ongoing, production system rather heavily loaded.
    • CS, FC, ProxyManager overloaded. Some new vo-boxes installed to swallow the load
    • Rescheduled jobs instead of re-submitting tasks, “usual” problems
    • Failover not fully set up; this should be done
  • (from previous meeting) monitorFiles not working correctly for production transformations. Maybe buggy.
    • Federico+Chris: in LHCb a different mechanism is used
    • Luisa: in CTAO also a different solution
    • 20th May, Andrei: found a bug, hot-fixed, will commit if tested OK

EGI+IN2P3

 Andrei, Mazen

  • EGI services moved from one set to a new one (Alma9), plus various bumps (MariaDB, Elastic, etc)
    • encountered the MySQL SPAM issue which was fixed in v9 (should be backported)

JINR

 Igor

  • BM@N collected 1.9 PB of experimental data. DIRAC successfully sustained.
  • DIRAC updated to 8.0.78
  • Moved from VOMS to IAM-VOMS.
  • P.S. Sorry for missing on meetings. I am busy on Thursdays.

CTAO

 Luisa, Nattan, Loris, Stella

  • v9+X running on certification instance, also tested the submission of CWL jobs
  • (from previous meeting) Use case of possibly 10k short transformations (out of 1 production request)

CLIC

 André

  • NTR

GridPP:

 Daniela, Simon

  • Pre-prod:

  • Mainly used for testing security fixes found by TheClaw^{TM}🦞:

    • Simon notes that most (all) of these could possibly have been found by automated non-AI tests as well. Maybe something for the developer meeting?
    • We run all patches on pre-prod (v8/9.0.20) and production (v8.0.78), but we run e.g. no transformation/production system
    • They are also installed (on integration) on the certification server, if someone at CERN wants to have a look.
  • Documentation:


Releases announcements and reviews

DiracOS

DIRAC

  • v9.1.9 (+ v9.1.8, v9.1.7)
    • No new features, only fixes

DiracX

  • v0.1.0
    • Tasks fully operational

New documentation:

Dirac-CWL

  •  Ryun ran a CWL job!

DiracX-web

  • first non-alpha release?
    • not there yet

Pilot

  • NTR

Agenting and AI developments


Feature requests, and developers’ issues: inputs and prioritizations from communities

Request for a long term support release:

The case for:

  • DIRAC has replaced its previous model of point releases with bug-fix backports by a rolling-release strategy, which shifts maintenance effort onto administrators

  • To fix a bug discovered in their installation (or to apply a security fix), administrators are now forced to upgrade to releases that introduce new features and therefore potentially new bugs. Automated testing does not find all issues (see e.g. the GridPP report from the previous meeting)

  • Virtually all similarly complex projects rely on stable long term releases for reliability.

  •  Federico: there is a request for a “golden release”, at least for moving to DIRAC v9.

Background and concerns of current rolling release model (adopted since 2026), expressed in the diracproject-admins mailing list:

  • the change was announced in advance, but the consequences were not fully foreseen until now
  • not everyone feels like updating versions on an “almost weekly” basis
  • new versions fix previous bugs but might introduce new ones. At least for updating to v9, a fixed target would be a better fit

 Federico: offers to cherry-pick selected commits in v9.0.X to create a fixed target for the v9 update. But I would not go back to point releases.
 Andrei: LTS releases?
 Christophe: DIRAC or DiracX?
 all: probably both, at least eventually
We should also sort out the problem with DIRAC<->DiracX versioning, i.e. https://github.com/DIRACGrid/diracx/issues/767
 Alexandre: I doubt LTS would help here because:

  • the bugs mentioned were introduced while fixing other bugs (patch release): they would have broken LTS versions the same way
  • we don’t respect the initial contract which was:
    • bug fixes should be transparent
    • new features should be transparent
      For me the underlying issue is a lack of investment in certification: new features or potentially weird fixes should be certified, and we should potentially have different certification environments. As Daniela said, error is human, so when we find a problem we didn’t spot in certification, we should invest effort in hardening the certification process.
       Federico: We try to respect the contract as much as possible. I think for the moment we can aim at the “golden v9” release. Once others have moved we should evaluate whether LTS releases are really needed.
       Andrei: Let’s try, but we might end up with the same issue.

Discussion will continue in the ML, or if needed in the next DOps.

Monitoring and accounting system for DiracX

  •  Federico sent out a proposal for a system based on a data lake (Duck Lake). No comments received. The usual breakdown in tasks will follow ASAP.
    •  Christophe: I am not convinced everything is solvable
    •  Federico: will restart looking into this, but some tests are needed.

Transformation/Production “system” in DiracX

 Federico: I will send out a Google form for collection of requirements this afternoon. This will be a rather large one; collection might take longer than the usual “DOps cycle”.

Prioritized backlog: communities input

https://github.com/orgs/DIRACGrid/projects/30/views/3 contains the prioritized backlog.


AOB

  • CMS and DiracX:
    •  Federico traveled to Fermilab last week. Several discussions; overall the reception looked good
    • 2 new “admins” added to the diracproject-admins ML: Alan Malta and Andrea Piccinelli
    • 6 new “Users/Devs” added to the diracproject-users ML
  •  Federico also met with FCC representatives just yesterday
    • 2 new “admins” added to the diracproject-admins ML: Juraj Smiesko and David Lange
    • 3 new “Users/Devs” added to the diracproject-users ML
  • Certification machines
    • The diracx-cert setup on OpenShift needs to be updated: tasks!
  • DIRAC is now officially an “HSF affiliated project” : https://hepsoftwarefoundation.org/projects/projects.html
  • CHEP is next week!
    • rehearsal for the 2 parallel talks (from Alexandre and Ryunosuke) happened yesterday: https://indico.cern.ch/event/1688416/
    • Chris Burr will give an LHCb+DIRAC plenary presentation on Analysis Productions
    • Other participants:
      • Dhiraj, Ruslan from Belle2

Next appointments

  • Next meetings:

    • DDev is tomorrow (usual time)
    • DDev will also be next week, even if it’s CHEP (Federico will host)
    • Next DOps(+DDev) on June 18th.
  • WS/hackathons/conferences:

    • DiracX hackathon: 1st and 2nd of July
      • registrations are open, and already several registered. Large representation from CMS.
      • effectively a full room. Capacity extended to 17 (15 registered now)
      • will also organize a social dinner this time, on the 1st of July. Or maybe a picnic near the lake.
    • 12th DUW: 13th-16th October
      • registrations open! Free of charge, thanks to a local sponsor
    • 10:00 11:00
      Dirac(X) operations (Dops)
      Convener: Federico Stagni (CERN)