CVMFS Sync Meeting

Europe/Zurich
31/S-023 (CERN)

Participants
  • Fabrizio Furano
  • Hugo Gonzalez Labrador
  • Maria Arsuaga Rios
  • Valentin Volkl
Zoom Meeting ID
63671903080
Host
Maria Arsuaga Rios
    • 11:00 11:10
      Operational status & service overview 10m
      • Current CVMFS service health (repositories, stratum-0/1, monitoring)
      • Any incidents since last meeting
      • Short-term operational risks or alerts

      A hypervisor issue occurred yesterday. The site caches were affected but survived, possibly with some performance degradation that nobody noticed.

      Alarms, fault tolerance and Telegram all apparently worked beautifully. The system came back to 100% without any intervention. I'll double-check anyway, as it's Friday and this seems almost too good to be true.

      Another issue popped up around 21:00 yesterday (19/03). It affected the acron job that runs the whitelist and hostcount checks, and it resolved itself around 22:00. Cause: unknown, possibly the same network/hypervisor issue as in the previous points.

      Apparently I am not able to prevent cvmfsdata20-4455801563 from swapping. This causes alarms, as those machines are not supposed to swap. I set vm.swappiness to 0, but it does not seem to help. What can I do?
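      A possible first diagnostic step, sketched under assumptions: as far as I know, vm.swappiness = 0 only lowers the kernel's preference for swapping and does not disable swap, and pages already swapped out stay there until they are touched again. The sketch below reads the standard Linux /proc files to show the current setting and which processes hold swap. The function names are made up for illustration and are not part of any CVMFS tooling.

```python
import re
from pathlib import Path


def parse_vmswap_kb(status_text):
    """Return the VmSwap value (kB) from /proc/<pid>/status text, or 0 if absent."""
    match = re.search(r"^VmSwap:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else 0


def top_swappers(proc_root="/proc", limit=10):
    """List (swap_kb, pid, name) for processes using swap, largest first."""
    results = []
    for status in Path(proc_root).glob("[0-9]*/status"):
        try:
            text = status.read_text()
        except OSError:
            continue  # process exited or access was denied
        swap_kb = parse_vmswap_kb(text)
        if swap_kb > 0:
            name = re.search(r"^Name:\s+(.*)$", text, re.MULTILINE)
            results.append(
                (swap_kb, int(status.parent.name), name.group(1) if name else "?")
            )
    return sorted(results, reverse=True)[:limit]


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the currently active vm.swappiness value."""
    return int(Path(path).read_text().strip())


def report():
    """Print the active swappiness and the biggest swap users (Linux only)."""
    print("vm.swappiness =", read_swappiness())
    for swap_kb, pid, name in top_swappers():
        print(f"{swap_kb:>10} kB  pid {pid:<8} {name}")
```

      Calling report() on the affected machine would show whether the setting is really active and whether the swap usage is old (a few idle processes) or still growing. If the alarm only reflects pages swapped out before the change, the swappiness setting alone will not clear it.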

       

    • 11:10 11:20
      Issues & interventions follow-up 10m
      • Review of open issues
      • Status of ongoing or planned interventions
      • Decisions needed / actions agreed
    • 11:20 11:30
      Onboarding & knowledge transfer 10m
      • Onboarding progress for Hugo and Maria

      • Open questions or unclear areas

      • Documentation gaps or runbooks to improve

      • Identify topics needing deeper walkthroughs (future sessions)

      Maria is making good progress with her personal exercise repository, but it does not work yet. The monitoring is throwing warnings:

      Traceback (most recent call last):
        File "/usr/local/bin/cvmfs-stats3.py", line 352, in <module>
          cs.get_repo_stats()
        File "/usr/local/bin/cvmfs-stats3.py", line 142, in get_repo_stats
          status, rh = run_command(cmd_root_hash, [0])
        File "/usr/local/bin/cvmfs-stats3.py", line 37, in run_command
          raise Exception("error %s running %s" % (status, cmd))
      Exception: error 1 running cvmfs_swissknife info -r http://cvmfs-marsuagacvmfs.s3.cern.ch/cvmfs/marsuagacvmfs.cern.ch -c
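      The traceback only reports exit status 1, not why cvmfs_swissknife failed. A minimal sketch of a wrapper that keeps stderr in the exception is shown below. The real run_command in cvmfs-stats3.py is not shown here, so the signature is only inferred from the traceback, and CommandError and the timeout parameter are assumptions added for illustration.

```python
import shlex
import subprocess


class CommandError(Exception):
    """Raised when a command exits with an unexpected status (hypothetical helper)."""

    def __init__(self, cmd, status, stderr):
        detail = stderr.strip() or "<no stderr>"
        super().__init__(f"error {status} running {cmd}: {detail}")
        self.cmd = cmd
        self.status = status
        self.stderr = stderr


def run_command(cmd, ok_statuses=(0,), timeout=60):
    """Run cmd (a string) and return (status, stdout).

    Raises CommandError, including the captured stderr, when the exit
    status is not listed in ok_statuses.
    """
    proc = subprocess.run(
        shlex.split(cmd),      # no shell involved, arguments are split safely
        capture_output=True,   # keep both stdout and stderr
        text=True,
        timeout=timeout,
    )
    if proc.returncode not in ok_statuses:
        raise CommandError(cmd, proc.returncode, proc.stderr)
    return proc.returncode, proc.stdout
```

      With something like this in place, the warning mail would carry the message printed by the failing command, which should make it quicker to tell a missing or unreachable repository apart from a problem in the monitoring script itself.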

       

      ... Is everybody correctly receiving the cvmfs-botmail@cern.ch mails?

       


      Let's encourage Hugo to continue the exercise and fix a couple of simple real tickets, to help him understand some corners of the system:

      RQF3555480

      RQF3639647

      RQF3619820

    • 11:30 11:40
      Collaborator input – SFT / Valentin 10m
      • Updates from SFT side

      • Cross-team dependencies or changes impacting CVMFS

      • Technical discussions or requests for support

      • Upcoming developments worth tracking

      Progress towards the harmonization of the release managers?

      Nothing to report. Current priorities are the 2.14 release and unpacked.cern.ch

      2.14 will include the fix for aborting a transaction when the disk is full (see the ticket opened by ALICE this week): cvmfs/cvmfs#4158

      It will also include a fix for the logrotate config: cvmfs/cvmfs#4163

    • 11:40 11:45
      Planning & priorities 5m
      • Short-term priorities until next sync

      • Upcoming milestones or expected changes

      • Topics to follow up

    • 11:45 11:50
      AOB 5m