Rucio Development Meeting

Europe/Zurich
4/S-056 (CERN)

Martin Barisits (CERN)
    • 15:00 - 15:10
      News 10m
      • Rucio Coding Camp 2019 planning at the next development meeting (Sep 26)
      • No development meeting Oct 3 (ATLAS S&C week)
    • 15:10 - 15:30
      News from the experiments 20m

      ATLAS

      • Problem yesterday following a network issue: a DDoS of the backend servers brought Rucio to its knees. Will implement throttling on the HAProxy side.

      CMS

      • Second round of the 1M-file test. Check that all monitoring is in the test. Progress on quotas: the final report from the student working on it is due next week; will then restart the discussion with CRIC.

      RAL

      • No news on multi-VO support; was preparing for the face-to-face meeting on Thursday.

      ESCAPE

      • No news
    • 15:30 - 15:50
      Hot topics 20m
      • Rucio demo container?
        • Does it have a future? Should we point people to the (more advanced) Rucio dev?
    • 15:50 - 16:20
      Developers roundtable 30m
      • Rucio 1.21 priority followup
        • Focus
          • Cleanup & Stability
          • Documentation
          • Deployment (Kubernetes!)
        • OpenID Connect #2612
          • Provisioning client used in a similar way to querying VOMS (--> probes)
          • Will need a release candidate to test with e.g. ESCAPE
        • rucio.cfg vs Rucio config table cleanup #2630
          • Will start a document and have the person responsible for each component comment on it
        • Documentation for configuration parameters #2631
        • Define history tables explicitly (no Versioned models) #2063
        • Multi-VO features #2635
        • Reaper 2.0 #2412
          • Needs improvement in the query to get list of unlocked replicas
            • SKIP LOCKED does not work because the query is used as a subquery
          • Source protection: Implemented
          • Some minor other things might need improvement too
        • Operators documentation / recipes #2636
        • Expand Kubernetes usage
          • All daemons look fine
          • Servers don't look too good
            • Response time is higher than with normal server
            • Lots of pod restarts
              • Eric sees restarts sometimes too 
              • Timeouts due to DB?
            • K8s gets 2.5% of the ATLAS load now
          • CMS uses all daemons (including reaper2) as well on K8s
          • CMS runs statsd to Prometheus instead of Graphite
        • Tracking what happened with a DID #2637
          • Hannes submitted a PR, needs review
        • XCache config table population add to probe #2638
          • Needs to be committed
        • BB8 needs better configuration and removal of hard-coded entries
        • Better way to deal with configuration/permissions (entry point, configuration.py, …) #533
          • Configuration comes from an external python package instead of Rucio core
          • Python package needs to have a module for schema, permission, policy
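
A minimal sketch of how such an external package could be loaded, assuming only what the notes state: one Python package exposing `schema`, `permission` and `policy` modules. The package name `example_vo_policy`, the `has_permission` hook and the in-memory stand-in package are hypothetical, added only so the example runs on its own; this is not the Rucio implementation.

```python
import importlib
import sys
import types

# Module names taken from the notes above.
REQUIRED_MODULES = ("schema", "permission", "policy")


def load_policy_package(package_name):
    """Import <package>.schema, <package>.permission and <package>.policy.

    Returns a dict mapping each module name to the imported module and
    raises ImportError if the package or one of the modules is missing.
    """
    return {
        name: importlib.import_module("{}.{}".format(package_name, name))
        for name in REQUIRED_MODULES
    }


def _register_fake_package(package_name):
    """Register an in-memory stand-in package (hypothetical, for the demo only)."""
    package = types.ModuleType(package_name)
    package.__path__ = []  # an empty __path__ marks the module as a package
    sys.modules[package_name] = package
    for name in REQUIRED_MODULES:
        module = types.ModuleType("{}.{}".format(package_name, name))
        sys.modules[module.__name__] = module
        setattr(package, name, module)
    # Hypothetical hook: a permission check that allows every action.
    package.permission.has_permission = lambda issuer, action, kwargs: True


_register_fake_package("example_vo_policy")
modules = load_policy_package("example_vo_policy")
print(sorted(modules))
```

With this layout the core code would only need the package name from its configuration; everything experiment-specific would live in the external package.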
        • Transparent handling of archives with rules #1091
        • Global Quotas #2315
          • Making progress; Updating the CLI
          • PR there, needs review
        • Possibility to inject rules delayed #2639
        • Reduce Oracle test crashes #2588
          • Some improvements, unclear if it helped
        • Python 3.5 for server
          • pystatsd does not support Python 3
            • Switching to statsd is possible
          • PostgreSQL issue with byte and text data
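
For context on the statsd item, a minimal sketch of the plain statsd counter format sent over UDP, using only the standard library and avoiding f-strings so it runs on Python 3.5. The metric name `rucio.example.requests` and both helper functions are hypothetical; the server code would use a client library rather than this. The example also shows the bytes/text distinction that Python 3 enforces: the metric line is built as text and must be encoded to bytes before it is sent.

```python
import socket


def format_counter(name, value=1):
    """Return a statsd counter line ("<name>:<value>|c") encoded as bytes."""
    return "{}:{}|c".format(name, value).encode("ascii")


def send_metric(payload, host="127.0.0.1", port=8125):
    """Send one statsd datagram over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


# Loopback demo: a local UDP socket plays the role of the statsd daemon.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
receiver.settimeout(2.0)
demo_port = receiver.getsockname()[1]

send_metric(format_counter("rucio.example.requests"), port=demo_port)
received, _ = receiver.recvfrom(1024)
receiver.close()
print(received)
```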
        • Python 3.6 for clients
          • Should be compatible, but something is missing in the tests
        • Changes for CTA transfer handling #2632
          • Cedric and Martin will work on this next week!
        • Source throttling #2611
          • Almost done; need to add src_rse_id
          • Now using the rse_transfer_limits table
          • Will need a schema change as well
          • One test failing because of SQLAlchemy with SQLite; otherwise development is done
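
The idea behind source throttling can be sketched as follows. This is an illustration under stated assumptions, not the code in #2611: the function name, the tuple layout of queued requests and the dict of limits are hypothetical; only `src_rse_id` comes from the notes. Each queued request is released only while its source RSE stays below its configured limit.

```python
from collections import Counter


def select_releasable(queued, active_per_source, limits):
    """Pick the queued requests that can be released now.

    queued            -- list of (request_id, src_rse_id), in priority order
    active_per_source -- dict: src_rse_id -> transfers currently in flight
    limits            -- dict: src_rse_id -> maximum concurrent transfers;
                         a source without an entry is treated as unlimited
    """
    in_flight = Counter(active_per_source)
    released = []
    for request_id, src_rse_id in queued:
        limit = limits.get(src_rse_id)
        if limit is None or in_flight[src_rse_id] < limit:
            released.append(request_id)
            in_flight[src_rse_id] += 1  # a released request occupies a slot
    return released


queued = [("r1", "A"), ("r2", "A"), ("r3", "B"), ("r4", "A")]
released = select_releasable(queued, {"A": 1}, {"A": 2})
print(released)
```

In this run source A already has one active transfer and a limit of two, so only one more request from A is released; B has no limit configured.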
        • Activity exclusion for submitter #2640
    • 16:20 - 16:30
      AOB 10m