Ceph/CVMFS/Filer Service Meeting

600/R-001 (CERN)




Zoom: Ceph Zoom

    • 1
      Ceph: Operations Reports
      • a) Teo (cta, erin, kelly, levinson)
        Speaker: Theofilos Mouratidis (CERN)
        • New JIRAs:
          • All the CTA clients need to be upgraded to 14.2.20 (CEPH-1143).
          • Upgrade ceph/cta nodes to CentOS 8 (CEPH-1142).
      • b) Enrico (barn, beesly, gabe, meredith, nethub, vault)
        Speaker: Enrico Bocchi (CERN)
        • All clusters except gabe upgraded to 14.2.20-2 (builds shown here: CEPH-1141)
          • Fix for progress bar bug.
      • c) Dan (dwight, flax, kopano, jim)
        Speaker: Dan van der Ster (CERN)
        • flax: readonly cephfs client key created and deployed for restic: CEPH-1144
        • flax: k8s user repeatedly trying to mount a share that doesn't exist: CEPH-1146
          • Shows up as high CPU usage and increased latency on the clients.
          • User has rebooted their node and load is back to normal.
        • All clusters:
          • In preparation for octopus upgrades I will write a tool to let us sample OSDs for zombie spanning blobs (CEPH-1145)
            • These are bluestore unreferenced blobs which can leak, leading eventually to "no blob id" aborts if we accumulate too many.
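A minimal sketch of the sampling step only, assuming the tool picks a few OSDs per host at random. The function name `sample_osds`, the host names and the OSD ids are hypothetical; the per-OSD check for zombie spanning blobs is not shown, since these notes do not describe it.

```python
import random


def sample_osds(osds_by_host, per_host, seed=None):
    """Pick up to `per_host` OSD ids from every host, at random.

    osds_by_host: dict mapping a host name to a list of OSD ids (hypothetical input).
    seed: optional seed so that a sampling run can be reproduced.
    """
    rng = random.Random(seed)
    sample = []
    for host in sorted(osds_by_host):
        osds = sorted(osds_by_host[host])
        # Never ask for more OSDs than the host actually has.
        count = min(per_host, len(osds))
        sample.extend(rng.sample(osds, count))
    return sorted(sample)


# Hypothetical layout: two hosts with three OSDs each.
layout = {"host-a": [0, 1, 2], "host-b": [3, 4, 5]}
picked = sample_osds(layout, per_host=1, seed=42)
```

Sampling per host rather than over the whole cluster keeps every host represented even when the sample is small.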
      • d) Arthur
        Speaker: Arthur Outhenin-Chalandre (CERN)
    • 2
      Ceph: Operations Tools (ceph-scripts, puppet, monitoring, etc...)
    • 3
      Ceph: R&D Projects Reports
      • a) Reva/CephFS
        Speaker: Theofilos Mouratidis (CERN)
      • b) Disaster Recovery
        Speaker: Arthur Outhenin-Chalandre (CERN)
        • Added more disks to my SSD pools
          • Raw pool performance at 4M is slightly better than HDD this time
          • Journaling performance is similar to the previous test
        • My snapshot fix is merged to master upstream
        • Fixed some issues with my OpenStack patches; the CI should pass this time, but there is no review from the maintainers yet
        • Starting to build some slides for a possible Ceph lightning talk
    • 4
      Ceph: Upstream News
    • 5
      Speakers: Enrico Bocchi (CERN) , Fabrizio Furano (CERN)

      Pretty calm


      Small surprise: the scrubbing that followed a disk replacement on the backup machine took 15 days and one minute. That is quite a significant delay.

      Created the repo for sndlhc (RQF1792936).

      Some discussion with Enrico about another repo request, which Enrico suspects is "not what they want" (RQF1795268).

      Nighttime snapshot hiccups are much less frequent than they used to be, yet I (Fab) can still see some, e.g. today at 05:30. These are all self-recovering hiccups, yet it would be nice not to have this noise.

      Discussion (mainly with Dave Dykstra) about miss rates on the squids serving frontier and cvmfs (sometimes getting as bad as 50-60%). He proposes a parameter change. No problem with this, yet the question is how much scaling headroom this subsystem has; this is relatively simple to predict but much harder (currently unknown) to quantify and justify.
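To make the miss-rate figures concrete, here is a small sketch of the arithmetic, assuming only that every miss is forwarded upstream once. The function names and the counter values are hypothetical illustrations, not measurements from the squids discussed above.

```python
def miss_rate(hits: int, misses: int) -> float:
    """Fraction of requests that the cache could not serve locally."""
    total = hits + misses
    if total == 0:
        return 0.0
    return misses / total


def backend_requests(client_requests: int, rate: float) -> float:
    """Requests forwarded upstream, assuming each miss is forwarded once."""
    return client_requests * rate


# Hypothetical counters chosen to land in the 50-60% range quoted above.
rate = miss_rate(hits=450, misses=550)
forwarded = backend_requests(1000, rate)
```

Under this simplification the upstream load grows linearly with the miss rate, so halving the miss rate halves the requests the backend sees for the same client traffic.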

    • 6