Ceph/CVMFS/Filer Service Meeting

600/R-001 (CERN)



    • 14:00 14:05
      CVMFS 5m
      Speaker: Enrico Bocchi (CERN)
      • Debugging // profiling atlas-nightlies.cern.ch (S3 + Gateway):
        • Long transaction: 937 packages, 17 GB
        • Many packages built // patched on the RM (ceph-fuse heavily spinning)
        • After 2h15: 648 of 937 pkgs processed; publishing not yet started, transaction still open
        • Lease time: 6h
      • Restarting lhcbdev-test on S3
      • Buggy cvmfs_server check with S3 storage (does not honor "429 Too Many Requests"): CVM-1736
      • Merge request for dedicated atlas squid proxies
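
For reference on the "429 Too Many Requests" item above, here is a minimal sketch of how a client can honor that status: wait for the number of seconds given in the Retry-After header when the server sends one, otherwise back off exponentially. This is not the cvmfs_server code or the fix tracked in CVM-1736; the function names, the HEAD request, and the base/cap values are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def retry_delay(headers, attempt, base=1.0, cap=60.0):
    """Seconds to wait before retrying; `attempt` is 0-based.

    Prefers a numeric Retry-After header, else exponential backoff.
    `base` and `cap` are illustrative defaults, not CVMFS settings.
    """
    value = headers.get("Retry-After") if headers else None
    if value is not None:
        try:
            return min(cap, max(0.0, float(value)))
        except ValueError:
            pass  # the HTTP-date form of Retry-After is not handled here
    return min(cap, base * 2 ** attempt)


def head_with_backoff(url, max_attempts=5, sleep=time.sleep,
                      opener=urllib.request.urlopen):
    """Send HEAD requests to `url`, retrying only on HTTP 429.

    Returns the final HTTP status code. `sleep` and `opener` are
    injectable so the logic can be exercised without a network.
    """
    for attempt in range(max_attempts):
        request = urllib.request.Request(url, method="HEAD")
        try:
            with opener(request) as response:
                return response.status
        except urllib.error.HTTPError as err:
            # Give up on any non-429 error, or when out of attempts.
            if err.code != 429 or attempt == max_attempts - 1:
                return err.code
            sleep(retry_delay(err.headers, attempt))
    return None
```

A checker that walks many objects would wrap each request this way so that throttling from the S3 endpoint slows the check down instead of failing it.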
    • 14:05 14:10
      Ceph Upstream News 5m

      Releases, Tickets, Testing, Board, ...

      Speaker: Dan van der Ster (CERN)
    • 14:10 14:15
      Ceph Backends & Block Storage 5m

      Cluster upgrades, capacity changes, rebalancing, ...
      News from OpenStack block storage.

      Speaker: Theofilos Mouratidis (National and Kapodistrian University of Athens (GR))


      • Setup devstack to test automatic manila snapshots
      • Helped Julien with lemon->collectd
      • Backfill optimization, almost done except:
        • Figure out how to fill each PG tree (currently only the primary)
        • Figure out a persist scheme (interval, number of updates)
    • 14:15 14:20
      Ceph Disk Management 5m

      OSD Replacements, Liaison with CF, Failure Predictions

      Speaker: Julien Collet (CERN)


      • IT-CF successfully carried out their first disk replacement autonomously after last week's presentation
      • They will be fully operational within 2 weeks
    • 14:20 14:25
      S3 5m

      Ops, Use-cases (backup, DB), ...

      Speakers: Julien Collet (CERN), Roberto Valverde Cameselle (Universidad de Oviedo (ES))


      • Nexus buckets are not deletable by end-users. INC1983576
        • CDA has configured the Nexus OpenShift app to enable object versioning, so even though users see an empty bucket, there are still thousands of hidden objects.
        • We can run radosgw-admin bucket rm --purge-objects, and I've asked Alex Lossent if Nexus has a better cleanup tool.


      • RGW VMs replaced by larger ones, now all in prod
      • Investigating an issue (apparently unable to retrieve bucket contents when a bucket holds more than 1000 objects)
      • CEPH-697: still in progress - deadline?
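
On the bucket-listing issue above: the cause is not established in these notes, but one thing worth ruling out is pagination, since S3 list responses are capped at 1000 keys per page and a client has to follow the continuation token to see the rest. The sketch below shows that loop against an in-memory stand-in; `list_page` and `make_fake_bucket` are hypothetical helpers, not part of RGW or any S3 SDK.

```python
def list_all_keys(list_page, page_size=1000):
    """Collect every key by following continuation tokens.

    `list_page(token, max_keys)` is a hypothetical callable returning
    (keys, next_token), with next_token None on the last page.
    """
    keys = []
    token = None
    while True:
        page_keys, token = list_page(token, page_size)
        keys.extend(page_keys)
        if token is None:
            return keys


def make_fake_bucket(n_objects):
    """In-memory stand-in for a bucket, used only to exercise the loop."""
    all_keys = [f"obj-{i:05d}" for i in range(n_objects)]

    def list_page(token, max_keys):
        start = 0 if token is None else int(token)
        end = min(start + max_keys, len(all_keys))
        # A token is handed back only while more keys remain.
        next_token = str(end) if end < len(all_keys) else None
        return all_keys[start:end], next_token

    return list_page


if __name__ == "__main__":
    bucket = make_fake_bucket(2500)
    first_page, _ = bucket(None, 1000)
    print(len(first_page), len(list_all_keys(bucket)))
```

A client that stops after the first call sees only the first 1000 keys, which would look like a bucket whose contents cannot be fully retrieved.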
    • 14:25 14:30
      CephFS/HPC/FILER/Manila 5m

      Filer Migration, CephFS/Manila, HPC status and plans.

      Speakers: Dan van der Ster (CERN), Pablo Llopis Sanmillan (CERN)


    • 14:30 14:35
      HyperConverged 5m
      Speakers: Jose Castro Leon (CERN), Julien Collet (CERN), Roberto Valverde Cameselle (Universidad de Oviedo (ES))
      • Several Kopano load tests on the kelly cluster over the last 2 weeks. Even at the highest load we only see ~5 MB/s, likely indicating the client cache on the Kopano server is effective.
      • Numbers estimate:
        • ~700 million mails in Exchange, roughly two thirds of which have an attachment (~467 million, rounded up to 500 million below).
        • Kopano shards into 200 attachment dirs, and we'll have 10 Kopano servers.
        • 500 million / 200 / 10 = 250,000 attachments per directory.
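
The estimate above can be reproduced with a few lines of arithmetic. This sketch assumes, as the division in the notes implies, that each of the 10 servers has its own 200 attachment directories and that attachments spread evenly; the function name is made up for illustration.

```python
def attachments_per_dir(total_attachments, dirs_per_server, servers):
    """Even-spread estimate of attachments per directory (integer division)."""
    return total_attachments // (dirs_per_server * servers)


MAILS = 700_000_000       # approximate mail count from the notes
DIRS_PER_SERVER = 200     # Kopano attachment directories (assumed per server)
SERVERS = 10              # planned Kopano servers

# Roughly two thirds of mails carry an attachment.
with_attachment = MAILS * 2 // 3

# Unrounded figure, then the rounded-up 500 million used in the notes.
unrounded = attachments_per_dir(with_attachment, DIRS_PER_SERVER, SERVERS)
rounded = attachments_per_dir(500_000_000, DIRS_PER_SERVER, SERVERS)

print(with_attachment, unrounded, rounded)
```

Rounding the attachment count up to 500 million makes the per-directory figure a slightly conservative upper estimate.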
    • 14:35 14:40
      Monitoring 5m


      • Updated prometheus + alertmanager + blackbox exporter
    • 14:40 14:45
      AOB 5m
