BiLD-Dev

Europe/Zurich
2/R-014 (CERN)

Federico Stagni (CERN)
Description

Bi-Weekly "Loyal" DIRAC developers meeting, followed by the LHCbDIRAC developers meeting.

Zoom: BiLD
https://cern.zoom.us/j/62504856418?pwd=TU1kb01SOFFpSDBJeWVBdU9qemVXQT09

Meeting ID: 62504856418
Passcode: 12345678
 

 
 

BiLD – 31/10/2024

At CERN: Federico, André, Christopher, Alexandre, Ryunosuke, Cedric
On Zoom: Simon, Daniela, Hideki, Andrei, Vladimir, Xiaomei, Janusz
Apologies:


Follow-up from previous meetings

  • Last BiLD was October 3rd
  • Last DIRAC certification hackathon on October 10th
  • CHEP 2024 19th-25th Oct, Krakow
    • Federico: points I retained:
      • There is a WLCG Token Trust and Traceability WG that provides the same recommendations that we have always heard for proxies (Daniela: Well, it tried to solve the same problems…)
      • IAM talk. Daniela: Development still driven by WLCG, technical debt, MFA now implemented. It looks like the developers would like to have a proper multi-VO IAM as they clearly get pressured by their own immediate employer to do so :-)
      • Alice is doing interesting stuff with their full-node submission
      • Alice implemented cgroups for subdivision of WNs (we should do the same)
      • “The correlation between HS23 and DB12 (whole) is “almost” acceptable”
      • CMS is overloading
        • Daniela would like to note that LHCb should not be overloading at the Tier2s. If you look at the actual numbers, CMS efficiency even with overloading rarely tops 75% at the Tier2s, while e.g. here at Imperial LHCb hovers around 98%
          • Federico: don’t worry, LHCb was not thinking about doing the same!
      • I found only one talk from DUNE
        • Yes, JustIN, by LHCb export A McNab. (Comment by Daniela)
      • Daniela: Apart from Federico’s plenary, DIRAC was also mentioned in Xiaomei’s JUNO plenary, and twice during parallel sessions (CTA & HERD)

DIRAC communities roundtable

LHCb:

Federico+Alexandre+Christopher+Vladimir

  • Stressing the Transformation System, applying patches
    • having thousands of Transformations, some of them having 10M+ jobs
    • TransformationAgents seem “fine”, WorkflowTaskAgent performance needs to improve
  • Stressing also the WMS, some jobs running twice (same jobID picked up by 2 different pilots)
    • Federico will re-check
  • Sending jobs to the message broker blocks everything else, so issues arise if the broker is down. A timeout was added, but the single lock around logging is problematic
    • patched in latest release
  • error message coming from M2Crypto, sending negative data
    • also patched in latest release
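
As an illustration of the logging-lock point above (not the patch that went into the release): in Python's standard `logging`, each handler serialises its `emit` calls behind a lock, so a handler that talks to a slow broker holds up every thread that logs through it. A minimal sketch, assuming a hypothetical `SlowBrokerHandler` as a stand-in for the broker backend and a hypothetical helper `build_logger`, uses the standard-library `QueueHandler`/`QueueListener` pair to move the slow send onto a separate thread:

```python
import logging
import logging.handlers
import queue
import time


class SlowBrokerHandler(logging.Handler):
    """Hypothetical stand-in for a handler that ships records to a message broker."""

    def __init__(self, delay=0.2):
        super().__init__()
        self.delay = delay
        self.sent = []

    def emit(self, record):
        time.sleep(self.delay)  # simulates a slow or unreachable broker
        self.sent.append(record.getMessage())


def build_logger(name, broker_handler):
    """Return (logger, listener); only the listener thread calls the slow handler."""
    log_queue = queue.Queue(-1)  # unbounded queue between callers and the listener
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    # Callers only pay for putting the record on the queue.
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    listener = logging.handlers.QueueListener(log_queue, broker_handler)
    listener.start()
    return logger, listener


if __name__ == "__main__":
    broker = SlowBrokerHandler(delay=0.2)
    logger, listener = build_logger("bild.sketch", broker)
    start = time.monotonic()
    for i in range(5):
        logger.info("job %d submitted", i)
    elapsed = time.monotonic() - start
    listener.stop()  # waits for the queued records to be handled
    print(f"logging calls took {elapsed:.3f}s, {len(broker.sent)} records delivered")
```

The trade-off in this sketch is that records are delivered asynchronously: if the broker stays down the queue keeps growing, so a real setup would still need a send timeout or a bounded queue.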

ILC/Calice/FCC

André

  • NTR

EGI

Andrei

  • Setting up token based pilot submission with the production instance of Check-In (biomed)
  • Setting up the DiracX test system with K3S in parallel with DIRAC 9.0. Struggling with the basic setup (cert-manager, DEX, …)

Belle2

Hideki, Cedric

  • Completed migration to EL9, and to OpenSearch (still v1)
    • will use apptainer

GridPP:

Daniela, Simon, Janusz

  • We would like to upgrade to the latest v8 version, but we have too many problems with unrenewed proxies. Not every VO went down the WLCG route of 7-day proxies, so we tend to see them more often: https://github.com/DIRACGrid/DIRAC/discussions/7842
    • Andrei: the current CEs are not renewing the pilot proxy, so at the moment the only way is to have long pilot proxies
      • Alexandre: ARC is doing it (upgrading the delegation); I do not think HTCondorCE does it. Anyway, this would not help with the bundled proxy that we are using now
      • proxy renewing itself? does not seem “correct”
      • Andrei: proxy delegation thread in SiteDirector?
      • or maybe a DIRAC-only internal token/proxy (without VOMS) only for renewing the user proxy
        • maybe re-using the same solution used for the CloudCE. Simon will have a look

Topics from GitHub/Discussions

only un-answered topics with discussion updates:


DIRAC releases


DIRAC projects

DIRAC:

Issues by milestone:

Other issues:

PRs discussed:

WebApp:

  • Sencha request (how it developed)
    • Sencha made an official request “to EGI” asking for an explanation of the license. After some exchange, Sencha was convinced that a paid license is not needed
  • Upgrade ExtJS to 6.6? (right now we are using 6.2)
    • …maybe NOT! (and no volunteers!)

Pilot:

  • from previous meeting: Janusz has some documentation to write

DIRACOS:

Documentation:

  • from previous meeting: need to decide on a strategy for DiracX documentation – André to take care?
    • MkDocs ?

OAuth2:

  • NTR

management

  • from previous meeting: always upload releases to CVMFS
    • still not working (did not work for 8.0.53)
    • Andrei created a new script, so a PR is needed
  • Daniela: We just had a problem for a different VO (comet) where CVMFS would not distribute newly created directories, but happily updated content in old ones. Not sure if this could be related?
    • not related

diraccfg

  • NTR

DB12

  • Igor made studies

Rucio

  • NTR

Tests

  • NTR

DiracX:

Issues

PRs discussed:

DiracX-charts:

  • NTR

DiracX-web:

  • Alexandre
    • Improvements to JobMonitoring app in a large-ish
    • Added the user’s details at login
    • Some tests for the extension need to be added
  • asked Ryunosuke to be a reviewer for the DiracX-web PRs

Release planning, tests and certification

  • Certification machines

    • No updates, using the old setup at CERN:
      • lbcertifdirac70 for DIRAC code
      • DBs from CERN
      • DiracX running on paas.cern.ch (OpenShift)
  • Next hackathon(s)

    • next week

Next appointments

AOB

  • Projects for ISIMA students
    • Deadline: mid-November

LHCbDIRAC

  • BKK:
    • on :fire:
      • Several queries taking way longer than what they should
    • Chris “won’t survive another year” in the current conditions
    • ConsistencyChecks performance improved
  • Ryun: possibility of running on a subset of a BkQuery for prescaling
    • “Merged”
  • Ryun’s DAG graphs: we can run “DAGs” in productions (sequentially)
    • https://gitlab.cern.ch/-/snippets/3303
    • CWL, one day
  • from previous meeting: some runs without luminosity. Total luminosity is there.
    • Frédéric Hemmer to check it
 
    • 10:00 10:10
      Items from Previous BiLD-Dev 10m
    • 10:10 10:20
      DIRAC Communities roundtable 10m
    • 10:20 10:30
      DIRAC releases 10m
    • 10:30 10:55
      DIRAC projects 25m
      • DIRAC
      • WebApp
      • Pilot
      • DIRACOS2
      • VMDIRAC
      • Documentation
      • OAuth2
      • DiracX
      • other externals (include Rucio)
    • 10:55 11:00
      Release planning, tests and certification 5m
    • 11:00 11:15
      Weekly development(s) focus 15m
    • 11:15 11:25
      AOB
      Convener: Federico Stagni (CERN)
    • 11:25 11:40
      LHCbDIRAC 15m