Ceph/CVMFS/Filer Service Meeting

600/R-001 (CERN)




Zoom: Ceph Zoom

    • 14:00 - 14:15
      CVMFS 15m
      Speakers: Enrico Bocchi (CERN), Fabrizio Furano (CERN)


      - Migrated sft-nightlies (CVM-1938), cms-ib planned for Wed (11 Nov)
      - Fixed catalog errors on lhcbdev (CVM-1930), now ready for migration
      - CMS is also ready for migration, to be scheduled

    • 14:15 - 14:30
      Ceph: Operations 15m
      • Incidents, Requests, Capacity Planning 5m
        Speaker: Dan van der Ster (CERN)
        • Bernd procuring 8 racks of Ceph servers for early 2021. This will give us space to add a third block storage AZ and allow more volume backups.
        • v14.2.13 is looking stable. I mirrored it to our repos and will begin the testing process.
      • Cluster Upgrades, Migrations 5m
        Speaker: Theofilos Mouratidis (CERN)
        • Prometheus
          • Reduced the metrics retention period from 15d to 10d to avoid filling the root filesystem
        • Metrictank
          • Deployed with ScyllaDB instead of Apache Cassandra
          • ScyllaDB is a Cassandra-compatible database written in C++ and is expected to be faster than Cassandra
            • For the single-node setup we would need a better-performing DB
          • Deployed on CC7 because Metrictank is not yet released for EL8
          • Need to create a Puppet module to open the required ports so that we can check the dashboards, etc.
        • Beesly bluestore migration progressing (CEPH-966) -- 2 disk servers to go!

        • New capacity:
          • Flax: 48 disks out of 192 (16 OSDs to 1 SSD). Need to re-create the other 3 hosts at 12:1
          • Gabe 4 disks out of 384 (12 OSDs to 1 SSD)
          • Still missing collectd-smart-tests package on C8
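The OSD-to-SSD ratios quoted for the new capacity can be checked with a little arithmetic. The sketch below is illustrative only: `ssds_needed` is a hypothetical helper, and the figure of 48 data disks per Flax host is an assumption inferred from "48 disks out of 192" and "other 3 hosts".

```python
def ssds_needed(osd_disks: int, osds_per_ssd: int) -> int:
    """Return how many SSDs are needed so that no SSD serves more than
    osds_per_ssd OSDs (ceiling division)."""
    if osd_disks < 0 or osds_per_ssd <= 0:
        raise ValueError("osd_disks must be >= 0 and osds_per_ssd must be > 0")
    return -(-osd_disks // osds_per_ssd)


# Assumption: one Flax host holds 48 data disks (192 disks over 4 hosts).
disks_per_host = 48
for ratio in (16, 12):
    print(f"{ratio}:1 -> {ssds_needed(disks_per_host, ratio)} SSDs per host")
```

Under that assumption, moving a host from 16:1 to 12:1 means one more SSD per host, which is consistent with the hosts having to be re-created rather than adjusted in place.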
      • Hardware Repairs 5m
        Speaker: Julien Collet (CERN)


        • Ceph KB reviewed and updated
        • Some beesly OSDs are pending (after the f2b conversion seems a good time to proceed)
      • Puppet and Tools 5m
    • 14:30 - 14:45
      Ceph: Projects, News, Other 15m
      • Kopano/Dovecot 5m
        Speaker: Dan van der Ster (CERN)
        • Currently debugging a file locking issue: Dovecot thinks an mbox file is locked, but we can't find the process that did the locking.
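One way to narrow down a lock with no visible owner is to probe the file from another client. The sketch below is a debugging aid under stated assumptions, not the procedure used in this investigation: it assumes the mbox is protected by POSIX (fcntl) locks and/or a conventional `<mbox>.lock` dotlock file, which the notes do not confirm. Both function names are hypothetical. The fcntl probe briefly takes an exclusive lock when the file is free, and closing its descriptor drops any POSIX locks the probing process itself holds on that file, so it should be run from a separate process and only on a test mailbox.

```python
import errno
import fcntl
import os


def probe_fcntl_lock(path: str) -> bool:
    """Return True if another process appears to hold a conflicting
    POSIX (fcntl) lock on path, False if the lock could be taken."""
    fd = os.open(path, os.O_RDWR)
    try:
        try:
            # Non-blocking attempt: fails at once if the lock is held elsewhere.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno in (errno.EACCES, errno.EAGAIN):
                return True
            raise
        # We got the lock, so nobody else held it; release it again.
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)


def dotlock_exists(path: str) -> bool:
    """Return True if a dotlock file (path + '.lock') is present.
    A stale dotlock has no owning process, so nothing will show up
    when looking for a lock holder."""
    return os.path.exists(path + ".lock")
```

If the probe reports no fcntl lock while a dotlock file exists, a leftover dotlock is a plausible explanation for a lock with no matching process.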
      • REVA/CephFS 5m
        Speaker: Theofilos Mouratidis (CERN)
    • 14:45 - 14:55
      S3 10m
      Speakers: Julien Collet (CERN), Roberto Valverde Cameselle (CERN)
      • CEPH-974: gitlabartifacts bucket index shard 0 issue. I am trying to reproduce it, with little success so far. (The plan is to reproduce it, then test a cleanup procedure.)


      • CEPH-993: Test on nethub is over.
        • Plan would be to:
          • Change rgw_max_concurrent_requests to 256 on nethub, wait and see
          • Propagate to gabe (test on CVMFS first), wait and see once validated
          • Roll out everywhere when validated
        • Monitoring HTTP 503 responses could help with this.
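To compare behaviour before and after each step of the rgw_max_concurrent_requests rollout, the share of 503 responses can be computed from access logs. This is a minimal sketch, not an existing tool: `fraction_503` is a hypothetical helper, and it assumes log lines in a common-log-style layout where the status code directly follows the quoted request. The real RGW log format may differ, and the regular expression would need adapting.

```python
import re
from typing import Iterable

# Assumption: the status code is the 3-digit field right after the quoted request.
STATUS_RE = re.compile(r'"\s(\d{3})\s')


def fraction_503(lines: Iterable[str]) -> float:
    """Return the fraction of parsable log lines whose status is 503.
    Lines that do not match the assumed format are ignored."""
    total = 0
    hits = 0
    for line in lines:
        match = STATUS_RE.search(line)
        if match is None:
            continue
        total += 1
        if match.group(1) == "503":
            hits += 1
    return hits / total if total else 0.0
```

Running this over equal time windows before and after the change on nethub gives a simple number for the "wait and see" step.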
      • CEPH-967: S3 quota checker script finished
        • Sends a daily email listing the few accounts over quota
        • MR is waiting for the cron job
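The core of such a quota check can be sketched as follows. This is not the CEPH-967 script: the `AccountUsage` record, the function names, and the convention that a non-positive quota means "no quota set" are all assumptions made for illustration. Fetching usage from the cluster and sending the email are left out.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class AccountUsage:
    uid: str
    used_bytes: int
    quota_bytes: int  # assumption: <= 0 means no quota is set


def over_quota(accounts: Iterable[AccountUsage]) -> List[AccountUsage]:
    """Return the accounts whose usage exceeds a configured quota."""
    return [a for a in accounts
            if a.quota_bytes > 0 and a.used_bytes > a.quota_bytes]


def build_report(accounts: Iterable[AccountUsage]) -> str:
    """Build a plain-text email body, worst offenders first.
    Returns an empty string when nobody is over quota."""
    offenders = sorted(over_quota(accounts),
                       key=lambda a: a.used_bytes / a.quota_bytes,
                       reverse=True)
    if not offenders:
        return ""
    lines = ["Accounts over S3 quota:"]
    for a in offenders:
        pct = 100 * a.used_bytes / a.quota_bytes
        lines.append(f"  {a.uid}: {a.used_bytes} / {a.quota_bytes} bytes ({pct:.0f}%)")
    return "\n".join(lines)
```

Returning an empty report when no account is over quota lets the cron wrapper skip sending an email on quiet days.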
    • 14:55 - 15:05
      Filer/CephFS 10m
      Speakers: Dan van der Ster (CERN), Theofilos Mouratidis (CERN)
    • 15:05 - 15:10
      AOB 5m