Rucio Development Meeting

Europe/Zurich
Martin Barisits (CERN)
Description
    • 15:00 - 15:10
      News 10m
      • Reminder: Rucio community paper for vCHEP'21
        • So far interest from
          • LDMX
          • RAL/MultiVO
          • LIGO/VIRGO
          • Belle II
          • ATLAS
      • December release schedule
        • 1.23.11.post1 yesterday
          • Small hotfix for AWS transfers
        • 1.24.0rc1 any moment
        • 1.24.0 next week
      • 1.25 "Rat-Donkey" Release discussion on Dec 17
    • 15:10 - 15:20
      Community News & DevOps roundtable 10m
      • ATLAS
        • Network test between CERN - PIC (NOTED project)
          • 50-60% throughput increase due to dynamically established SDNs
      • CMS
        • Transition done
          • Turned off synchronisation between PhEDEx and Rucio
        • Transition from CASTOR to CTA
        • Would be useful to have the bad replicas PR in 1.23.11.post2
      • Fermilab
        • Testing new users on DUNE production instance
      • Belle II
        • Preparing migration for mid-January
        • Belle-Rucio talk at vCHEP
          • Will also contribute to community paper
        • DIRAC
          • REST Interface or Rucio Python Clients
            • Do not want to manage rucio.cfg
            • Instead pass arguments to the baseclient
      • LDMX
      • RAL/MultiVO
        • Interest from EGI to use MultiVO instance
        • conveyor-submitter setup needs to be checked
      • ESCAPE
        • New rule algorithm for Judge-Injector activated
          • No issues with memory
          • Injection of a 1-million-file rule worked
          • Issue with repairing this large rule with judge-repairer
            • Solved by increasing memory of repairer
            • Would be good to have new algorithm also included for the judge-repairer
        • Restart for judge-evaluator
          • Possibly giant rule?
        • Servers still restart occasionally, even though they do not reach the memory limit
          • Error reason is OOMKilled
    • 15:20 - 15:30
      Hot topics 10m
    • 15:30 - 15:55
      Developers roundtable 25m
      • 1.24 "Aqua Donkey" Priority followup
        • In Progress
          • Handling of Archives in the Reaper #1431 [Martin]
          • QoS #3419 [Mario, Martin]
            • New ESCAPE fellow will start to work on Rucio soon
            • Further developments delayed to 1.25
          • Distance based sorting of replicas #4020 [Nicolo]
            • Will be based on Ben's code
        • To Do
          • Reduce Clients tickets [Mario, Tomas]
          • rucio.cfg vs config table #2630 [Mario]
            • SQLAlchemy first :-)
        • Done
          • Reduce Core & Internal tickets [Mario, Gabriele, Martin]
          • Documentation Upgrade [Divya]
          • Remove webpy endpoints and dependency #4044 [Ben, Thomas]
            • Done as far as 1.24 is involved
          • Upgrade SQLAlchemy version #4055 [Mario, Martin]
          • Rewrite Conveyor Throttler #4056 [Ben]
          • Set geoip as default sorting algorithm #4017 [Ben]
      • End to End integration tests [Mayank]
        • PR for xrootd upload & download and protocol tests already exists
        • Building rucio-dev container
        • Now looking into TPC testing
          • Will be a separate PR
      • Logging discussion
        • In ATLAS everything is logged (including DEBUG messages)
          • Creates a lot of overhead (e.g. Hermes logs every message)
          • Switched to INFO level
        • Martin will create a ticket for component leads to go through their components and ensure consistency of log levels and log messages
      • Documentation
        • https://rucio.cern.ch/documentation/
        • https://github.com/NiklasRosenstein/pydoc-markdown
          • Martin will try to integrate
      • Tests
        • Parallel run of the tests
          • Ben already looking into this
          • Possibility to run some tests sequentially?
          • Adapting tests to parallel execution will be the responsibility of the component leads
        • In-memory database
          • Could also speed up tests a bit
          • Could activate right now
    • 15:55 - 16:00
      AOB 5m
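
Note on the in-memory database idea from the Tests item: the sketch below only illustrates the general technique, using Python's standard-library sqlite3 module. It is not Rucio's actual test setup. The table name, columns, and helper functions are hypothetical and chosen purely for illustration. The point is that each test can build its own throwaway database in RAM, which avoids disk I/O and keeps tests isolated from one another.

```python
import sqlite3


def make_test_db():
    """Return a fresh in-memory SQLite connection with a toy schema.

    The 'replicas' table and its columns are hypothetical, not Rucio's schema.
    """
    conn = sqlite3.connect(":memory:")  # database lives only in RAM
    conn.execute("CREATE TABLE replicas (scope TEXT, name TEXT, rse TEXT)")
    return conn


def count_replicas(conn, scope):
    """Count the rows belonging to one scope."""
    row = conn.execute(
        "SELECT COUNT(*) FROM replicas WHERE scope = ?", (scope,)
    ).fetchone()
    return row[0]


# Each call to make_test_db() yields an independent database,
# so data inserted by one test is invisible to another.
conn = make_test_db()
conn.executemany(
    "INSERT INTO replicas VALUES (?, ?, ?)",
    [
        ("mock", "file1", "RSE_A"),
        ("mock", "file2", "RSE_B"),
        ("other", "file3", "RSE_A"),
    ],
)
other_conn = make_test_db()
```

Because every ":memory:" connection is private, this pattern may also fit the parallel test runs discussed above, since workers would not share database state.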