Ceph/CVMFS/Filer Service Meeting

600/R-001 (CERN)



    • 14:00 14:05
      CVMFS 5m
      Speaker: Enrico Bocchi (CERN)
      • cms-ib release manager upgraded to CC7
        • Issue with deleting old folders that contain hard links (checking with developers)
      • Measuring the time to set up the ATLAS environment on lxplus nodes
      • Signing whitelist with YubiKey for boss.cern.ch, today @3pm
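A minimal sketch of how such a setup-time measurement could be taken, assuming the environment setup can be launched as a single command. The helper name `time_command` and the placeholder command are hypothetical; the actual ATLAS setup invocation is not given in these notes.

```python
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Run `cmd` `repeats` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command exits non-zero
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


# Placeholder command: replace with the real environment setup invocation.
PLACEHOLDER_CMD = [sys.executable, "-c", "pass"]
```

Repeating the run matters on CVMFS clients because the first invocation typically pays for a cold cache, so the first and later durations are worth reporting separately.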
    • 14:05 14:10
      Ceph Upstream News 5m

      Releases, Tickets, Testing, Board, ...

      Speaker: Dan van der Ster (CERN)
    • 14:10 14:15
      Ceph Backends & Block Storage 5m

      Cluster upgrades, capacity changes, rebalancing, ...
      News from OpenStack block storage.

      Speaker: Theofilos Mouratidis (National and Kapodistrian University of Athens (GR))


      • SSD endurance resolved
        • Le plot
        • Plots are up; the data are also sorted by life percentage
        • Created a Grafana alert that sends a ticket to ceph-admins
          once an SSD has stayed below 30% for a week
      • Legacy BlueStore stats are now fixed
      • Old OSD maps stopped being trimmed; about 500 are kept
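The "below 30% for a week" alert condition can be sketched as follows. This is only an illustration of the sustained-threshold logic, not the actual Grafana rule: the function name, the sample format (timestamp, life percentage) and the assumption that a lower percentage is worse are all hypothetical.

```python
from datetime import datetime, timedelta


def sustained_below(samples, threshold=30.0, window=timedelta(days=7)):
    """Return True if the most recent unbroken run of samples below
    `threshold` spans at least `window`.

    `samples` is an iterable of (timestamp, life_percent) pairs.
    """
    ordered = sorted(samples, key=lambda s: s[0])
    # No data, or the latest sample is healthy: nothing to alert on.
    if not ordered or ordered[-1][1] >= threshold:
        return False
    latest = ordered[-1][0]
    run_start = latest
    # Walk backwards until a sample at or above the threshold breaks the run.
    for ts, value in reversed(ordered):
        if value >= threshold:
            break
        run_start = ts
    return latest - run_start >= window
```

Requiring an unbroken run avoids opening a ticket for an SSD whose reading dipped briefly and recovered.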
    • 14:15 14:20
      Ceph Disk Management 5m

      OSD Replacements, Liaison with CF, Failure Predictions

      Speaker: Julien Collet (CERN)


      • Some remaining manual disk replacements
      • Started integrating new Nautilus features into the existing repair-team scripts
      • Double inconsistent PG due to broken disks on beesly
    • 14:20 14:25
      S3 5m

      Ops, Use-cases (backup, DB), ...

      Speakers: Julien Collet (CERN), Roberto Valverde Cameselle (Universidad de Oviedo (ES))


      • Lots of vm_kill for radosgw machines.


      • Documentation added for the `aws-cli` tool
      • Accounting script almost ready using Jose's Services account
        • This account will also be re-used for the next rbd-top iteration


    • 14:25 14:30
      CephFS/HPC/FILER/Manila 5m

      Filer Migration, CephFS/Manila, HPC status and plans.

      Speakers: Dan van der Ster (CERN), Pablo Llopis Sanmillan (CERN)
    • 14:30 14:35
      HyperConverged 5m
      Speakers: Jose Castro Leon (CERN), Julien Collet (CERN), Roberto Valverde Cameselle (Universidad de Oviedo (ES))
    • 14:35 14:40
      Monitoring 5m


      • S3 bucket stats are broken again. I will probably block bucket-stats requests for the user `rvalverd`.
      • The investigation into duplicated Prometheus tickets is still ongoing. After increasing the debug level of Prometheus and Alertmanager, I found that Alertmanager is receiving duplicated alerts, so the problem is likely in the Prometheus server or the Ceph prometheus module.
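One way to confirm duplicated alerts from captured debug output is to group them by label set and start time. The sketch below assumes the alerts have already been parsed into dictionaries with `labels` and `startsAt` keys (the shape used in Alertmanager's JSON payloads); the helper names are hypothetical and the fingerprint is a simple stand-in, not Alertmanager's own fingerprinting.

```python
from collections import Counter


def label_fingerprint(labels):
    """Order-independent, hashable key for a label set (illustrative only)."""
    return tuple(sorted(labels.items()))


def find_duplicates(alerts):
    """Return {(fingerprint, startsAt): count} for alerts seen more than once."""
    counts = Counter(
        (label_fingerprint(alert["labels"]), alert["startsAt"])
        for alert in alerts
    )
    return {key: n for key, n in counts.items() if n > 1}
```

If the duplicates differ in some label, for example the reporting instance, they will not collide here, which itself hints at which side is emitting the extra copies.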
    • 14:40 14:45
      AOB 5m