Rucio Meeting

Europe/Zurich
Martin Barisits (CERN)
    • 15:00 - 15:05
      News 5m
      • DC24 Rucio changes
        • Optimisation for Submitter #6505
          • Submission of requests whose rule is already expired (or about to expire)
          • Creates unnecessary contention, since the request needs to be cancelled right after
          • Makes sense post DC24 as well
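
The Submitter optimisation above can be illustrated with a minimal sketch. It is not the change made in #6505: the `TransferRequest` class, the `filter_submittable` helper, and the 30-minute grace period are all hypothetical names and values chosen for illustration. The idea is only that requests whose rule has expired, or will expire within a short window, are held back instead of being submitted and cancelled right after.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple


@dataclass
class TransferRequest:
    # Hypothetical stand-in for a queued transfer request; not a Rucio class.
    request_id: str
    rule_expires_at: Optional[datetime]  # None means the rule never expires


def filter_submittable(
    requests: List[TransferRequest],
    now: Optional[datetime] = None,
    grace: timedelta = timedelta(minutes=30),  # assumed value, for illustration
) -> Tuple[List[TransferRequest], List[TransferRequest]]:
    """Split requests into (submit, skip) based on the expiry of their rule."""
    now = now or datetime.now(timezone.utc)
    submit, skip = [], []
    for req in requests:
        expires = req.rule_expires_at
        # Skip rules that are already expired or expire within the grace window.
        if expires is not None and expires <= now + grace:
            skip.append(req)
        else:
            submit.append(req)
    return submit, skip
```

A caller would submit only the first list and leave the second to be cleaned up by the normal rule expiration path.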
    • 15:05 - 15:25
      Community News & DevOps roundtable 20m
      • ATLAS
        • DC24
          • On Rucio side things work mostly as expected
            • Some optimisations identified
            • Tokens disabled for ATLAS for the last few days to focus on throughput
        • scitags issue identified
          • Rucio does not cache the scitag json, thus it fetches it repeatedly
          • Not a blocking issue, but wasteful
          • Will be fixed in next release #6501
      • CMS
        • #6434 to be discussed now that Eric is back :-)
        • DC24
          • Running as well as can be expected
          • 50% of throughput with Tokens
          • Injection issue with rules
            • PK Constraint hit with add_rule and the replicas being in the process of being deleted
              • Possibly caused by slow deletions (being tracked)
            • Bugfix for injection script to skip the "problematic" datasets where rule creation fails
              • To just fill the pipe with other datasets
      • Fermilab RUBIN, DUNE, etc
        • ICARUS
          • Submit transfers to cern-fts3
            • Works for DUNE, but not for ICARUS
            • Delegation expiration errors also seen in ATLAS
              • Needs to be followed up by FTS (probably only one machine or so affected)
              • CMS saw similar problems and identified race conditions (multiple proxy delegations at the same time)
        • DUNE
          • Participating in DC24
            • From a performance view it runs very well
            • Data management: Misunderstanding of what expirations mean/do
        • RUBIN
          • Working on automatix
          • Integrating metadata system into Rucio
          • Setting up custom software on tape RSEs
      • ESCAPE
      • DUNE/Edinburgh
        • DUNE specific test-suite
          • Duplicating integration tests to get some dune specific containers in there (metacat)
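
As an illustration of the scitags issue reported by ATLAS in this roundtable (the JSON being fetched repeatedly because it is not cached), a time-based cache could look roughly like the sketch below. This is not the fix planned under #6501: the `CachedJSON` class, the one-hour default lifetime, and the injected `fetch` callable are assumptions made for the example. The fetch function is passed in so that no URL or HTTP client is implied.

```python
import time
from typing import Any, Callable, Optional


class CachedJSON:
    """Keep the result of `fetch` for `ttl` seconds before fetching again.

    Hypothetical helper for illustration; not part of Rucio.
    """

    def __init__(
        self,
        fetch: Callable[[], Any],
        ttl: float = 3600.0,  # assumed lifetime in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._value: Any = None
        self._fetched_at: Optional[float] = None

    def get(self) -> Any:
        now = self._clock()
        # Fetch on first use, or once the cached copy is older than the TTL.
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

With such a wrapper, repeated lookups within the lifetime reuse the stored document and only the first call (or the first call after expiry) reaches the remote source.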
    • 15:25 - 15:55
      Developers roundtable 30m
      • Rucio 34 "Donkey Potter and the Data Cache" roadmap
        • In Progress
          • foreign key error on deleting dids in reaper #5733 [Alex]
            • Mostly a conceptual discussion -> Discussion with Martin
          • factorize duplicate messaging code into a common module or class #6423 [Alex]
          • Deployment and Release Workflow #401 [Mayank, Eraldo]
            • X509 blocker resolved 
            • New container release procedure to make individual webui releases
          • Missing WebUI Release 33 page tracker #301 [Mayank, Eraldo]
          • Migrate Dashboard to Clean Architecture #158 [Mayank, Eraldo]
          • Unable to Delete File DID via Undertaker #5154 [Riccardo]
            • Refactoring of daemons first, review being addressed now
              • Should daemons catch a CTRL+C?
                • Yes
          • Type annotate the code #6454 [Riccardo]
            • Pushed first PR for this
            • In review discussion
              • Once resolved, then more PRs to come for this release
          • Update extension for v32 (and higher) compatibility #25 [Francesc, Enrique]
            • Minor documentation issues being worked on
            • Some progress, not done yet
        • In Review
          • Continue migration to SQLAlchemy 2.0 syntax #6057 [Erling]
            • Split into 5 PRs due to size (two are submitted upstream, the other three can be pushed)
          • Refactor policy package algorithm code #6382 [James]
          • Metadata for tape co-location and transfer priority #6398 [Maggie]
          • Update/Re-design core.meta module #5224 [Maggie, Rob]
        • Todo
          • bridge the gap between running rucio in demo env and full production deployment #187 [Radu, Enrique]
            • Needs somebody else on this, now that Radu has left
        • Done
          • Add Token based TPC tests to the CI #6451 [Radu]
        • Delayed
      • Documentation corner
        • Documentation and dev guidelines for Mypy type annotations #116 [Mayank, Martin]
        • Document environmental variables affecting the client #171 [Dimitrios]
        • Improve documentation on rucio.cfg vs configuration table #183 [Radu]
        • Add an FAQ-style entry aimed at users for STUCK rules #184 [Fabio]
        • Add instruction about DB partitioning #185 [Martin]
        • bridge the gap between running rucio in demo env and full production deployment #187 [Radu]
        • Introduce documentation on subscriptions #190 [Cedric]
        • WebUI: Improve Docs #255 [Eraldo]
        • Add instructions for Mac Apple Silicon in the developer section #261 [Eraldo]
          • Under Review - Comments posted, needs iteration
        • Add Rucio QoS RSE description and instructions #268 [Matt]
          • Under Review - Comments posted, needs iteration
        • Document how to set up command line argument completion #275 [Bouwe]
        • Formatting / style guide #287 [???]
        • Document how deletion occurs. #288 [???]
      • Other topics
        • ca_cert bundle for webui
          • Should be the same as for server/daemons
    • 15:55 - 16:00
      AOB 5m