Ceph/CVMFS/Filer Service Meeting

600/R-001 (CERN)



    • 14:00 14:15
      --Visitor Talk: CephFS Benchmarking for ATLAS-- 15m


      Speaker: Adam Abed Abud (Universita and INFN (IT))
    • 14:15 14:20
      CVMFS 5m
      Speaker: Enrico Bocchi (CERN)


      • Dedicated sets of squid for specific repos (merge req. pending)
      • Newer versions of squid mirrored by Linuxsoft@CERN
        • Automatic upgrade fails (conflicting files)
        • Upgrade is not transparent to our (CERN) squid module


      • Initial publication tests on atlas-nightlies with S3 backend
      • New release manager for geant4 (belle, boss, glast)


      • cvmfs-server-2.6.0 (gateway-1.0) to be tagged after testing
      • Want to achieve 1000 PUTs per second on S3
    • 14:20 14:25
      Ceph Rota Report 5m


       - beesly: down OSDs recreated.

       - gabe: 5x scsi errors (INC1934010 + 4)


       - Fixing failed disk procedure:

               - drain the disk (if ok-to-stop, stop the OSD and unmount it)

               - when safe-to-destroy, dd the whole disk

               - smartctl -t long on the device

               - if the disk still fails, replace it; otherwise recreate the OSD.
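
       A minimal sketch of the procedure above, written as a Python helper that only builds the command strings for review (it runs nothing). Several details are assumptions and not part of the minutes: draining is shown as `ceph osd out`, "dd the whole disk" is read as overwriting the device with zeros, the OSD mount point follows the default /var/lib/ceph/osd/ceph-<id> layout, and the function names are hypothetical.

```python
from typing import List


def failed_disk_commands(osd_id: int, device: str) -> List[str]:
    """Build the shell commands for the failed-disk procedure, in order.

    Nothing is executed here; the strings are meant to be reviewed by an operator.
    """
    return [
        # Drain the disk (assumption: marking the OSD "out" is how it is drained).
        f"ceph osd out {osd_id}",
        # Only stop the OSD once the cluster reports it is safe to do so.
        f"ceph osd ok-to-stop {osd_id}",
        f"systemctl stop ceph-osd@{osd_id}",
        # Assumption: default OSD mount point.
        f"umount /var/lib/ceph/osd/ceph-{osd_id}",
        # Wait until this check succeeds before touching the data on the disk.
        f"ceph osd safe-to-destroy {osd_id}",
        # Assumption: "dd the whole disk" means overwriting it with zeros.
        f"dd if=/dev/zero of={device} bs=1M",
        # Long SMART self-test on the device.
        f"smartctl -t long {device}",
    ]


def final_action(disk_still_failed: bool) -> str:
    """Last step of the procedure: replace the disk or recreate the OSD."""
    return "replace the disk" if disk_still_failed else "recreate the OSD"


if __name__ == "__main__":
    # Hypothetical OSD id and device name, for illustration only.
    for cmd in failed_disk_commands(12, "/dev/sdx"):
        print(cmd)
    print(final_action(disk_still_failed=False))
```

       The ok-to-stop and safe-to-destroy checks are kept as separate steps so that an operator can stop if either one fails.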


    • 14:25 14:30
      Ceph Upstream News 5m

      Releases, Tickets, Testing, Board, ...

      Speaker: Dan van der Ster (CERN)


      • Added one node to the Ceph Jenkins (Ceph Development project, r4.xlarge flavor). It seems to be running stably; we will ask Bernd for more resources. (These are a type of backfill job: when we have resources we can contribute, and when not, we stop.)
    • 14:30 14:35
      Ceph Backends 5m

      Upgrades, capacity changes, rebalancing, ...

      Speaker: Theofilos Mouratidis (National and Kapodistrian University of Athens (GR))

      ceph/flax: first host 75% converted

      ceph/erin: inconsistent pgs repaired

      puppet5: migration should be seamless; no differences found when running `ai-catalog-diff -t devel`

    • 14:35 14:40
      S3 5m

      Ops, Use-cases (backup, DB), ...

      Speakers: Julien Collet (CERN), Roberto Valverde Cameselle (Universidad de Oviedo (ES))


         - gabe/S3 upgrade scheduled Tues. 12/03 (https://cern.service-now.com/service-portal/view-outage.do?n=OTG0048724)



      • Replaced old "self-managed" Elasticsearch cluster with a cluster managed by the ES team.
      • Also replaced the "ingestion" node with a Docker container running in the Monit team's Marathon cluster.
      • Documentation updated: https://cern.ch/cephdocs/ops/s3-logging.html
    • 14:40 14:45
      Block Storage 5m

      OpenStack Cinder, Beesly, Wigner Decommissioning, ...

      Speaker: Theofilos Mouratidis (National and Kapodistrian University of Athens (GR))


      • https://its.cern.ch/jira/browse/CEPH-675
      • Wigner RBD users will probably want to use Cinder Critical (cp1, cpio1). But we have very little free space there, so we should ask for more servers.
            NAME                ID     USED       %USED     MAX AVAIL     OBJECTS
            cinder-critical     75     219TiB     59.89     147TiB        57842178
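
      A quick consistency check of the pool figures above, as a sketch. It assumes that %USED is computed as USED / (USED + MAX AVAIL); that formula is an assumption, not something stated in the minutes. The variable names are illustrative.

```python
# Figures copied from the cinder-critical row above (rounded to whole TiB).
used_tib = 219.0
max_avail_tib = 147.0

# Assumption: %USED = USED / (USED + MAX AVAIL).
pct_used = 100.0 * used_tib / (used_tib + max_avail_tib)

print(f"computed %USED: {pct_used:.2f}")
```

      With the rounded TiB values this gives roughly 59.84, close to the reported 59.89; the small gap is consistent with rounding of the displayed sizes.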
    • 14:45 14:50
      CephFS/FILER 5m
      Speaker: Dan van der Ster (CERN)


      • Peter Jones reported a possible issue with twiki-nfs01 over the weekend. We didn't find anything in the filer logs.
        • We asked about the twiki CephFS migration. He will upgrade to CentOS 7 first, then we can migrate to CephFS.
      • HPC 1m
        Speaker: Alberto Chiusole (Universita e INFN Trieste (IT))
    • 14:50 14:55
      HyperConverged 5m
      Speakers: Julien Collet (CERN), Roberto Valverde Cameselle (Universidad de Oviedo (ES))
    • 14:55 15:00
      Monitoring 5m


       - Diskprophet set-up on a test VM

       - Installation in progress on a cepherin host (p05972678w64745)

    • 15:00 15:05
      AOB 5m