Rucio Meeting
Europe/Zurich
Martin Barisits (CERN)
Zoom Meeting ID: 413496641
Host: Martin Barisits
Alternative hosts: Mario Lassnig, Cedric Serfon, Dimitrios Christidis
Passcode: 28849311
- 15:00 → 15:05
- 15:05 → 15:15 Community News & DevOps roundtable 10m
- ATLAS
- Issue with Throttler
- Requests remain in WAITING state forever --> #4979
- Way to manually run the throttler to unblock it
- Heartbeats
- With frequent Kubernetes pod restarts, outdated heartbeats are often found
- Ideally, pod shutdown should also trigger a heartbeat removal, but this does not seem to happen (#4988)
- CMS
- CTA multihop transfer
- More space on EOS to leave space for multihop
- --> More data (and thus jobs) sent to EOS
- Possibly based on freespace weights on rules
- Freespace weight could be adapted
- ATLAS investigating if using relative freespace weights instead of absolute could be beneficial
- Fermilab/DUNE/ICARUS/RUBIN
- Staging failures on tape system
- Number of files in RUBIN
- Large number of files, might be an issue for transfers/tapes
- Belle II
- mod_gridsite issue
- Writes into /var/cache/gridsite
- Many files written there
- More work on metadata
- DUNE
- Work on policy packages for the client
- After that, going back to lightweight Rucio clients
- Multi-VO
- Conveyor submitter/poller
- After discussion on Slack it works much better now
- Radu and Tim implemented a fix which should be able to select the right certificate for the right VO
- ESCAPE
- DAC21 (Data and Analysis Challenge 21) next week
- Currently two issues:
- Hermes increases memory consumption until it crashes?
- Kubernetes metrics vs Prometheus memory consumption do not add up?
- Reaper greedy deletion
- Found a small unrelated bug, which is fixed in the Helm chart
- LSST use cases --> want to reach 60k deletions/h
- SKAO
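
Illustration for the freespace weight item under CMS above: a minimal sketch of how absolute and relative free-space weights could rank two storages differently. This is not Rucio's implementation. The function `selection_weights`, the storage names, and the numbers are hypothetical and only meant to show the arithmetic.

```python
def selection_weights(storages, relative=False):
    """Return normalised selection weights per storage.

    storages maps a (hypothetical) storage name to (free, total) in the
    same unit. With relative=False the weight is the free space itself;
    with relative=True it is the free fraction free / total.
    """
    raw = {}
    for name, (free, total) in storages.items():
        raw[name] = free / total if relative else float(free)
    norm = sum(raw.values())
    if norm == 0:
        raise ValueError("no free space on any storage")
    # Normalise so that the weights sum to 1.
    return {name: weight / norm for name, weight in raw.items()}


# Hypothetical example: a large, nearly full storage and a small, half-empty one.
storages = {"BIG": (100, 1000), "SMALL": (50, 100)}
print(selection_weights(storages))                 # absolute weights
print(selection_weights(storages, relative=True))  # relative weights
```

With these made-up numbers, absolute weights favour BIG (about two thirds of the weight) because it has more free space in total, while relative weights favour SMALL (about five sixths) because a larger fraction of it is free.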
- 15:15 → 15:20
- 15:20 → 15:40 Suspicious file recovery 20m
Speakers: Christoph Ames (Ludwig Maximilians University Munich (DE)), Cedric Serfon (Brookhaven National Laboratory (US))
- 15:40 → 15:45
- 15:45 → 15:55 Developers roundtable 10m
Rucio 1.27 "Batdonkey v. Superdonkey" priority followup
- In Progress
- Auditor overhaul #3437 [Dimitrios, Eric, Stefan] [Longer activity lasting beyond 1.27 release]
- Let's schedule meeting in November
- Logging review #4220 [Martin, Joel, All comp leads]
- Would be good to get into 1.27
- Quality of Service #3419 [Matt] [Beyond 1.27 release]
- Demonstrator for storage-issued QoS changes for BNL MAS
- DOMA QoS
- Optimize database interactions #4793 [Martin, Mario, Radu] [Beyond 1.27 release]
- Radu started to look into temp tables
- Proof of concept works, but behaves differently on Oracle :-)
- Global vs private temporary tables
- Global needs to be part of schema
- Meeting with CERN IT DB admins
- Decrease transaction rate (7000 transactions/s)
- Recently found issue with session handling between API and Core
- Will be addressed; this should largely reduce the number of empty rolled-back sessions
- Decrease LOGON rate (7-10 logons/s)
- Optimize session pools
- Temporary tables
- Rename Daemons to commonly understandable names #4795 [Martin, Joel]
- Add aliases now, remove them later (if at all)
- If you have suggestions for proper names, please add them to the issue
- Deadline for this is next week
- Prepare replacement of current policy import with policy packages #4798 [James, Martin] [Beyond 1.27 release]
- Best method to get policy packages into container?
- Probably not a good idea to pip-install them on startup
- Fermilab and CMS build separate containers that include them
- Enabling tests for different policy package #3878 [Mayank]
- New GitHub Actions workflow
- Tests are specified via build matrix
- Specify modifications of rucio.cfg
- Right now it assumes the policy package is in the container
- Todo
- Done
- Delayed
Developer roundtable
- Metadata
- PR needs merging; review is good to go
- 15:55 → 16:00 AOB 5m