Ceph/CVMFS/Filer Service Meeting

Europe/Zurich
600/R-001 (CERN)

    • 14:00 14:05
      CVMFS 5m
      Speaker: Enrico Bocchi (CERN)
    • 14:05 14:10
      Ceph Upstream News 5m

      Releases, Tickets, Testing, Board, ...

      Speaker: Dan van der Ster (CERN)

      Dan:

    • 14:10 14:15
      Ceph Backends & Block Storage 5m

      Cluster upgrades, capacity changes, rebalancing, ...
      News from OpenStack block storage.

      Speaker: Theofilos Mouratidis (National and Kapodistrian University of Athens (GR))

      Teo/Dan:

      • flax bluestore conversion is ongoing. Now using a faster technique (stop and zap the OSDs, without waiting for them to drain before recreating them as bluestore).
      • One OSD is showing very high usage even though it has few PGs.
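
      The faster conversion path noted for flax (stop, zap, recreate without draining first) can be sketched as a dry-run command plan. This is only an illustration: the minutes do not record the exact commands used, so the sequence is an assumption modelled on the upstream Ceph procedure for replacing a filestore OSD with bluestore, and `plan_bluestore_conversion`, the OSD id and the device path are hypothetical. Nothing is executed; the function only returns the commands.

```python
def plan_bluestore_conversion(osd_id: int, device: str) -> list[list[str]]:
    """Return (without running) the commands to recreate one OSD as bluestore.

    Assumed sequence, not taken from the minutes. A mounted filestore
    data directory may also need to be unmounted before the zap step.
    """
    osd = str(osd_id)
    return [
        # Stop the daemon; no prior "out" + drain, as described in the notes.
        ["systemctl", "stop", f"ceph-osd@{osd}"],
        # Wipe the old filestore contents from the device.
        ["ceph-volume", "lvm", "zap", device, "--destroy"],
        # Mark the OSD destroyed so its id can be reused.
        ["ceph", "osd", "destroy", osd, "--yes-i-really-mean-it"],
        # Recreate the OSD as bluestore under the same id.
        ["ceph-volume", "lvm", "create", "--bluestore",
         "--data", device, "--osd-id", osd],
    ]


if __name__ == "__main__":
    # Hypothetical OSD id and device, printed for review only.
    for cmd in plan_bluestore_conversion(12, "/dev/sdx"):
        print(" ".join(cmd))
```

      Because the OSD is not drained first, its data is rebuilt by backfill from the remaining replicas after recreation, so the number of OSDs converted at once should be chosen with the pool's redundancy in mind.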

      Dan: 

    • 14:15 14:20
      Ceph Disk Management 5m

      OSD Replacements, Liaison with CF, Failure Predictions

      Speaker: Julien Collet (CERN)

      Julien:

      • A couple of disk replacements in beesly (ops guide update?); the OSDs are still to be recreated.
      • A couple of disks failing in erin:
        • p05972678u44402/sdv: disk is failing but prophetstore doesn't seem to think it should be changed now...
      • Presentation to repair-service on new disk replacement procedures:
        • Goal is to offload disk replacement procedures to them
        • FDO to provide helper scripts that facilitate the procedure
        • Paul and Remy will be proof-testing the scripts

       

    • 14:20 14:25
      S3 5m

      Ops, Use-cases (backup, DB), ...

      Speakers: Julien Collet (CERN), Roberto Valverde Cameselle (Universidad de Oviedo (ES))

      Julien/Roberto:

      • Set up cosbench to measure actual S3 performance
      • In the process of setting up a couple of VMs that we will keep only for benchmarking purposes
        • e.g. Evaluation of performance variation post updates

      Dan:

      • reshard rm-stale-instances finished on cephgabe0. It removed ~10 million index objects -- still 13 million left over.
    • 14:25 14:30
      CephFS/Manila/FILER 5m

      Filer Migration, CephFS/Manila status and plans.

      Speaker: Dan van der Ster (CERN)

      Dan:

      • With ceph/kelly on mimic, I started testing CephFS snapshots more extensively. There are a few options for integrating with Manila:
        • Manila has a snapshot feature -- users can create a snapshot from the Manila API.
        • We could create "ZFS-like" auto snapshots on all Manila volumes (e.g. hourly, daily, weekly, ...)
        • To be tested: how would user-created snapshots interact with auto snapshots?
        • To be tested: at which level should we autosnap? CephFS-wide, or individually for each volume?
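
      One way to reason about the "ZFS-like" option is as a retention policy over timestamped snapshots. The sketch below is a hypothetical illustration, not a tested Manila integration: the function name, the keep counts and the bucketing rules are all assumptions. It only decides which snapshot timestamps to retain; on CephFS the snapshots themselves are created and removed with mkdir and rmdir inside a directory's `.snap` subdirectory.

```python
from datetime import datetime, timedelta

# Hypothetical policy: number of most recent buckets to keep per tier.
KEEP = {"hourly": 24, "daily": 7, "weekly": 4}

# Functions mapping a timestamp to the bucket it belongs to in each tier.
BUCKET = {
    "hourly": lambda ts: (ts.year, ts.month, ts.day, ts.hour),
    "daily": lambda ts: (ts.year, ts.month, ts.day),
    "weekly": lambda ts: tuple(ts.isocalendar())[:2],  # (ISO year, ISO week)
}


def snapshots_to_keep(timestamps, keep=KEEP):
    """Return the set of snapshot timestamps retained by the policy.

    For each tier, the newest snapshot in each bucket is a candidate and
    the keep[tier] most recent candidates are retained.
    """
    kept = set()
    for tier, count in keep.items():
        newest_per_bucket = {}
        for ts in sorted(timestamps):
            # Later timestamps overwrite earlier ones in the same bucket.
            newest_per_bucket[BUCKET[tier](ts)] = ts
        candidates = sorted(newest_per_bucket.values(), reverse=True)
        kept.update(candidates[:count])
    return kept


if __name__ == "__main__":
    # Thirty days of hourly snapshots, starting on an arbitrary date.
    start = datetime(2019, 1, 1)
    snaps = [start + timedelta(hours=h) for h in range(30 * 24)]
    keep = snapshots_to_keep(snaps)
    print(f"{len(snaps)} snapshots taken, {len(keep)} retained")
```

      Everything not in the returned set would be a candidate for removal. User-created Manila snapshots could be kept out of the input list so the policy never touches them, which is one possible answer to the interaction question above.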
    • 14:30 14:35
      HPC 5m

      Performance testing, HPC storage status and plans

      Speakers: Alberto Chiusole (Universita e INFN Trieste (IT)), Pablo Llopis Sanmillan (CERN)

      Benchmark results on CEPH /bescratch @cern: https://gistpreview.github.io/?a8fbb37b6d07f841297fcce9500ccdbe

    • 14:35 14:40
      HyperConverged 5m
      Speakers: Jose Castro Leon (CERN), Julien Collet (CERN), Roberto Valverde Cameselle (Universidad de Oviedo (ES))

      Jose:

      • Tunables seem to have no effect on performance. Shall I put the default flags everywhere in Cinder?
      • Fast-diff can be enabled later on any existing image; it may require an object-map rebuild afterwards
      • Preliminary tests with client caches didn't spot any performance gains (weird)
        • Julien will check different configurations of the cache more extensively
        • In case the client cache does not make a difference, we could try to increase the OSD memory limit
    • 14:40 14:45
      Monitoring 5m

      Julien:

      • Prophetstore monitoring now spans all the EC rows of erin.
      • Trial period extended

      Question: the prometheus alerts seem to flap on and off? Is there some config to fix this?

      A separate meeting on the KPI/SLI dashboard with L. Magnoni follows this meeting.

      Roberto:

      • Repeating Firing/Resolved alerts on cepherin due to the active mgr flapping. (Will enable/disable the prometheus module to see if it helps.)
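
      On the question of flapping alerts: Prometheus alerting rules accept a `for` duration, which keeps an alert pending until its condition has held continuously for that long. The sketch below only simulates that idea on a list of evaluation samples, to show why a short blip would no longer fire. It is not the cluster's configuration, and the function name, the sample series and the hold length are hypothetical.

```python
def firing_states(condition_samples, hold):
    """Simulate a 'for'-style hold on an alert condition.

    condition_samples: booleans, one per rule evaluation.
    hold: number of consecutive true evaluations required before firing.
    Returns one boolean per evaluation: whether the alert is firing.
    """
    states = []
    streak = 0
    for active in condition_samples:
        # Any false sample resets the pending period.
        streak = streak + 1 if active else 0
        states.append(streak >= hold)
    return states


if __name__ == "__main__":
    # Hypothetical series: two one-sample blips, then a sustained problem.
    samples = [True, False, True, False, True, True, True, True]
    print(firing_states(samples, hold=1))  # fires on every blip
    print(firing_states(samples, hold=3))  # fires only on the sustained run
```

      Whether this helps on cepherin depends on where the flapping originates: a hold suppresses short-lived conditions, but it does not address a scrape target that disappears each time the active mgr changes.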
    • 14:45 14:50
      AOB 5m