BiLD-Dev
Bi-weekly "Loyal" DIRAC developers meeting, followed by the LHCbDIRAC developers meeting.
Zoom: BiLD
https://cern.zoom.us/j/62504856418?pwd=TU1kb01SOFFpSDBJeWVBdU9qemVXQT09
Meeting ID: 62504856418
Passcode: 12345678
BiLD – 17/07/2025
At CERN: Federico, Theau, Robin, Cedric, Alexandre, Andrè, Christopher, Ryunosuke
On Zoom: Andrei, Daniela, Simon, Janusz, Jorge, Xiaomei, Hideki, Ueda, Natthan, Alexei, Dhiraj
Apologies:
Previous meetings
- Last BiLD was 3 weeks ago
DIRAC communities roundtable
LHCb:
Federico+Christophe+Alexandre+Vladimir+Robin+Theau
- We found several million jobs in JobDB that were part of Transformations in status “Archived” or “Cleaned”. The source is probably the TransformationCleaningAgent [TBC] – PR with fixes: https://github.com/DIRACGrid/DIRAC/pull/8244
- scripts to
- try yourself on your instance: https://gist.github.com/fstagni/2b1df7ac7578bb866321cb02a30bf9de
- find out which jobs should be deleted: https://gist.github.com/fstagni/ab1dd229bbf2210208e0be2785e83de7
- mark those jobs as “DELETED” (the JobCleaningAgent will then take care of them): https://gist.github.com/fstagni/535013d2184978978dec896f762550df
- (run them one after the other)
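The selection step described above (finding jobs whose Transformation is already “Archived” or “Cleaned”) can be sketched roughly as follows. This is only an illustration of the logic, not the content of the linked gists: the function name and the two input mappings are hypothetical stand-ins for data that would come from JobDB and TransformationDB, and no DIRAC API is used.

```python
from typing import Dict, List

# Transformation statuses named in the minutes.
FINAL_STATUSES = {"Archived", "Cleaned"}


def jobs_to_mark_deleted(
    job_to_transformation: Dict[int, int],
    transformation_status: Dict[int, str],
) -> List[int]:
    """Return the IDs of jobs whose transformation is Archived or Cleaned.

    Both arguments are hypothetical in-memory mappings:
    job ID -> transformation ID, and transformation ID -> status.
    """
    return sorted(
        job_id
        for job_id, transformation_id in job_to_transformation.items()
        if transformation_status.get(transformation_id) in FINAL_STATUSES
    )


# Made-up example data, for illustration only.
jobs = {101: 1, 102: 2, 103: 3, 104: 2}
statuses = {1: "Active", 2: "Archived", 3: "Cleaned"}
print(jobs_to_mark_deleted(jobs, statuses))  # → [102, 103, 104]
```

The returned job IDs would then be the candidates to mark as “DELETED”, leaving the actual removal to the JobCleaningAgent as noted above.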
Belle2
Cedric, Hideki, Ueda, Dhiraj
- WebApp was down due to excessive memory consumption (possibly caused by the extension, TBC)
- BelleRAWDIRAC migrating to v9
CLIC
Andrè
- NTR
IHEP
Xiaomei
- from previous meeting Started using a chatbot for DIRAC, based on llamaindex
- setting up a prototype, for the moment just answering questions about DIRAC (documentation)
- with a DB.
- One student working on it
- Alexandre independently started building a small MCP server for diracx
- 26th June Junau showed 2 slides on the prototype they have built. The work is mostly about documentation.
- Federico: the two should be integrated with each other
- Alexandre: let’s discuss on Mattermost, I also have a few ideas
- 17th July ready to combine the work; the student will try to push to Alexandre’s repository (not yet done)
EGI:
Andrei
- Started work on the agent to send “energy/CO2” records (from DIRAC, but not only) to the GreenDIGIT MetricsDB service. Working together with Henryk.
- From the ops point of view, the usual struggle with changing the certificates
CTA:
Natthan
- NTR
GridPP:
Daniela, Simon, Janusz
- Nothing to report in production. Submitted some minor patches for “GridPP” use cases, which should not affect anyone else.
Topics from GitHub discussions and bots
- only un-answered DIRAC and DiracX topics with discussion updates:
- Filename for DFC has limit of 128 characters
- “extend freely” from the DIRAC point of view, but it’s a slippery slope, and eventually you will hit issues with the storage (e.g. EOS, where there’s a 1024-character limit).
- putting metadata in the filename is in general wrong
- FTS also imposes a limit (at least in DIRAC)
- Andrè: Linux has a 256 limit
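To make the limits in the discussion above concrete, here is a minimal sketch that checks an LFN's file name against them. The numbers are the ones quoted in these minutes (the Linux value is as stated in the meeting; the usual Linux NAME_MAX is 255 bytes). The function name, the example LFNs, and the assumption that every limit applies to the file name (rather than to the full path) and counts characters (rather than bytes) are illustrative only.

```python
import posixpath
from typing import Dict, List

# Limits as quoted in the discussion (in characters). Whether each applies to
# the file name or to the full path is an assumption of this sketch.
NAME_LIMITS: Dict[str, int] = {"DFC": 128, "Linux": 256, "EOS": 1024}


def exceeded_limits(lfn: str, limits: Dict[str, int] = NAME_LIMITS) -> List[str]:
    """Return the systems whose quoted limit the LFN's file name exceeds."""
    file_name = posixpath.basename(lfn)
    return [system for system, limit in limits.items() if len(file_name) > limit]


# Made-up LFNs, for illustration only.
long_lfn = "/vo/user/" + "a" * 200 + ".root"
print(exceeded_limits(long_lfn))           # → ['DFC']
print(exceeded_limits("/vo/user/f.root"))  # → []
```

A check of this kind at registration time would flag names that fit the catalogue after an extension of the DFC limit but may still fail further down the chain.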
Releases
DIRAC
- v8.0.76
- ConfigurationSystem
- FIX: (#8255) Stops VOMS2CSSynchronizer crashing on robot DNs that don’t follow the CERN pattern.
- Resources
- CHANGE: (#8214) SLURM plugin now supports the WholeNode options
- ResourceStatusSystem
- CHANGE: (#8213) Added a token expiry option to dirac-rss-set-status and dirac-admin-allow/ban-site commands.
- v9
- Last pre-release contains several changes to the RPCs, and a few bugs too. It now needs:
- new DIRACOS release to fix the version of httpx
- DIRAC PR https://github.com/DIRACGrid/DIRAC/pull/8267
diracx
- v0.0.1a46 created 2 days ago
- We should remember to generate a new release every time there’s a change in the (generated) client
Release planning, tests and certification
The flowcharts below detail the interactions and constraints between some of the existing PRs (in DIRAC, diracx, and Pilot) for the v9+0.1 and v9.1+0.2 versions.
[Flowcharts not reproduced here. Recoverable labels: release pairs 9.0/0.1 and 9.1/0.2, and a node for “the new DB table for pilot auth”.]
We only looked at the v9+0.1 chart. Most of the PRs involved are from Robin.
Certification machines
- NTR
Next hackathon(s)
- As soon as we get “proper” releases
DIRAC projects
DIRAC:
Issues by milestone:
- from previous meeting Replacement for BDII2CSAgent #8194
- “nice” discussion
- Daniela is going to check if (as she suspects) GOCDB is actually underwritten by GridPP. Maybe (can’t promise) “finances” might be an angle.
- 26th June no updates in ticket in the last 2 or 3 weeks
- 17th July
- recent answers from AP in the GGUS ticket above seem to point in the right direction
- nevertheless, Federico is trying to find out whether we can build a “CEs” crawler ourselves. ARC CEs seem to provide the necessary information; the issues are with HTCondorCEs
- Last message from Federico:
For those of you who are running a HTCondor CE, would you mind investigating if the content of auth-map + accounting-map files could be made public? Maybe we are lucky and it’s a trivial thing to be done.
- Daniela will check the above
PRs discussed:
- add scitag support
- draft PR waiting for Christophe’s review. More comments in the PR
- several VOs (including LHCb, Belle2, ilc) can already provide scitags (recognized with the APIs)
- from previous meeting PoolCE and RAM (issue raised in https://github.com/DIRACGrid/DIRAC/issues/7853#issuecomment-2948565279): https://github.com/DIRACGrid/DIRAC/pull/8232
- 17th July PR updated
WebApp:
- New alpha version created (nothing important)
Pilot:
- from previous meeting Pilot migration
- diracx pilot route
- from previous meeting PR feat: Adding JWT support alongside X509 auth
- the new Pilot command can call the route directly, no need to use the CLI
- the integration tests for this will be set up once diracx is updated with the connected diracx PR
DIRACOS:
- New release (https://github.com/DIRACGrid/DIRACOS2/releases/tag/2.53) to avoid picking up httpx dev versions
- dropped PPC (seems “dead”)
Documentation:
- web.diracgrid.org
- for the moment deployed on GitLab in Lyon. Will be moved to the standard DIRACGrid GitHub repo at some point
- the suggestion is to do it right away
- Andrei: is it OK to deploy (the landing pages) on GitHub Pages?
- from previous meeting tasks: https://github.com/DIRACGrid/DIRACx/issues?q=is%3Aissue state%3Aopen label%3Adocumentation
- from previous meeting on diracx.io: ChrisB registered it under his name more than a year ago. We do anyway have diracgrid.org (that is also “correctly billed”) so it makes sense to use only that domain. Chris is therefore proposing to move to:
- diracx.diracgrid.org (diracx docs, currently diracx.io)
- charts.diracgrid.org (for the helm chart, would have been destined to be charts.diracx.io if we’d leave it as is)
- then we’d need common access for managing the DNS on diracgrid.org to set the CNAME records.
management
- from previous meeting new /cvmfs/dirac.cern.ch repository created – CERN ticket
- question on whether it could be /cvmfs/diracgrid.org – potentially, but that would not be automounted
- action on @cburr to populate it (using LHCb “machinery”)
- Christophe: I will propose some ideas on the structure in one or two BiLD meetings
- Federico: you do not have full flexibility, as the LHCb structure has been copied to DIRAC and, IIRC, Belle2 too
DB12
- NTR
Rucio
- NTR
Tests
- Robin improved integration_tests.py to add DiracX service(s), basically for testing the legacy adaptors – almost done
- from previous meeting Federico started adding Rucio to DIRAC integration tests
- --> to Janusz
DiracX:
- Road Map : https://github.com/DIRACGrid/diracx/blob/caf66076bc8b623b0282a2f3d7d723b7c45be1a6/docs/roadmap.md?plain=1
- the only thing left to complete before v0.1 is documentation
- dependabot alert for web
- NTR
Issues
- Architecture design principles:
- Issue “Add Architecture Design Records (ADR) to the documentation” created following a comment (which was not only about documentation)
- the idea is accepted
- from previous meeting Federico wrote down an “epic” with 5 sub-tasks: https://github.com/DIRACGrid/DIRACx/issues/562 for the DiracX accounting/monitoring after consulting with CERN experts
- 17th July We asked Ewoud (CERN IT) to try this out on 6 months of popularity data (from LHCb). Current blockers:
- Grafana does not yet support plotting of OpenSearch rolled-up data (it seems to be OK with data from ElasticSearch though, so this will probably be sorted out eventually)
- More worryingly, rollup jobs do not seem to work on DataStreams (they work only on standard indices)
- Comments to https://github.com/DIRACGrid/diracx/issues/585#issuecomment-3051642916
- Federico: DELETE Jobs should not be there – mark as JobStatus.DELETED instead
- Andrei: is the fact that we are not RESTful going to have repercussions on the tools?
- we do not know exactly
- Christopher: probably worth reading https://florian-kraemer.net//software-architecture/2025/07/07/Most-RESTful-APIs-are-not-really-RESTful.html
PRs discussed:
- Alembic and triggers
- asked Cedric to review it
- from previous meeting Add pilot management: create/delete/patch and query #570
- PR looks OK-ish to merge (should not affect existing running code)
- from previous meeting feat: deploy gubbins images #527
- waiting for https://github.com/DIRACGrid/diracx-charts/pull/158 to be approved and merged
- 17th July [MERGED]
- from previous meeting Is fix: #448 and smarter datetimes #454 ready?
- Ryun will try to finalize ASAP
DiracX-charts:
DiracX-web:
- security advisory https://github.com/DIRACGrid/diracx-web/security/advisories/GHSA-hfj7-542q-8fvv#event-474163
- Merged search bar (will be used at least for the JobMonitor), which mimics what is done for GitLab. The web components for searching can be re-used for other apps, e.g. they are used for the lhcbdiracx-web bookkeeping app.
- from previous meeting Web user-facing documentation: https://github.com/DIRACGrid/diracx-web/pull/367#pullrequestreview-2950696755
- Conclusion: leave the user documentation in /docs, read it from the code and present it to the users
- from previous meeting Theau implemented a way to share app states (dump to JSON, reload through web)
- follow-up issue: https://github.com/DIRACGrid/diracx-web/issues/348
- 5th June the new proposal is to have a DB in which to store snippets of the state. Basically, like the UserProfileDB in DIRAC.
Next appointments
- Meetings:
- BiLD: September 4th (tentative)
- WS/hackathons/conferences:
- DIRAC Users’ Workshop
📅 17–20 September 2025 | 📍 IHEP, Beijing, China
🔗 indico.cern.ch/e/duw11
* Connect with the DIRAC community
* Meet the developers
* Discover DiracX
* Share your insights
The workshop will feature talks from users, administrators, and developers, plus a hands-on DiracX Hackathon.
Poster at https://cernbox.cern.ch/remote.php/dav/public-files/1egpqkseV0GvqlO/DIRAC_Poster.pdf – make use of it!
AOB
- from previous meeting DIRAC as an “HSF affiliated project” : https://hepsoftwarefoundation.org/projects/affiliated.html
- No news
LHCbDIRAC
- Release:
- yesterday’s release patched in place (pip install httpx==$the_version) + other minor hotfixes
- needs new stack and new deploy
- lhcbdiracx-web also missing latest version of https://www.npmjs.com/package/@dirac-grid/diracx-web-components
- ChrisB will take care of the release next week
- Bookkeeping MRs:
- add materialized view for processing paths optimization
- this should be merged first, as the next ones use it
- Alexey: it does not work for me, see comment in MR
- reworked addProcessing
- reworked getproductionporcpassname
- Removing registration of LOG files in bookkeeping
- from previous meeting StorageReport app available in https://lhcbdiracx-cert.app.cern.ch/
- few minor issues still to be sorted out before accepting the PR
- 26th June pending some code re-organization