Ceph/CVMFS/Filer Service Meeting

600/R-001 (CERN)



    • 14:00 14:15
      CVMFS 15m
      Speaker: Enrico Bocchi (CERN)

      Enrico (Greetings from Copenhagen):

      • LHCb Stratum 0 volume affected by the network outage on Ceph
        • Repo was in a transaction at the time
        • Needed to abort the transaction (and reboot) once Ceph was back
      • New 'cms-ci' repo on S3. Delivered.
      • FASER experiment asked for 3 new repos (RQF1510252). Discussion with the user is ongoing.
      • 6 frontier-squids in prod, two VMs per AZ, 640GB x 6 VMs, 2 workers per VM
        • 3 old squids to be retired
          • ca05: I remember seeing it referenced explicitly in some client configs and in the docs
          • Will ask the network team who is still querying it before deleting the VM
        • 3 dedicated squids for atlas to be upgraded to frontier
          • won't use big machines here
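
The squid figures quoted above can be cross-checked with a few lines of arithmetic. This is only a sketch: it assumes that 640GB is the per-VM allocation and that the six VMs are spread evenly across availability zones. The function and constant names are hypothetical and not taken from any CERN tooling.

```python
# Figures quoted in the minutes; the names below are illustrative only.
SQUID_VMS = 6          # frontier-squids in production
VMS_PER_AZ = 2         # "two VMs per AZ"
GB_PER_VM = 640        # assumed to be the per-VM allocation
WORKERS_PER_VM = 2     # "2 workers per VM"


def squid_fleet_summary(vms, vms_per_az, gb_per_vm, workers_per_vm):
    """Derive fleet-wide totals from the per-VM figures."""
    if vms % vms_per_az != 0:
        raise ValueError("VMs are assumed to be spread evenly across AZs")
    return {
        "availability_zones": vms // vms_per_az,
        "total_gb": vms * gb_per_vm,
        "total_workers": vms * workers_per_vm,
    }


summary = squid_fleet_summary(SQUID_VMS, VMS_PER_AZ, GB_PER_VM, WORKERS_PER_VM)
print(summary)
```

Under these assumptions the fleet spans 3 availability zones, with 3840 GB and 12 workers in total.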


    • 14:15 14:30
      Ceph: Operations 15m
      • Notable Incidents or Requests 5m

        Dan and Julien

        • Thursday's Ceph outage (OTG0054393):
          • A switch rebooted around 3:50 and a line card turned out to be faulty (OTG0054392), taking the block storage/CephFS clusters down
          • Daemons were restarted; the issue was resolved around 9:00, once the faulty line card was fixed
          • Router issue is "ECMP" with a flaky router. See https://xkcd.com/2259/
      • Repair Service Liaison 5m
        Speaker: Julien Collet (CERN)
      • Backend Cluster Maintenance 5m
        Speaker: Theofilos Mouratidis (CERN)


        • cephcta small cluster upgraded to v14.2.6. No issues.
    • 14:30 14:45
      Ceph: Projects, News, Other 15m
      • Backup 5m
        Speaker: Roberto Valverde Cameselle (CERN)


        • Last week's backup issue seems solved; the cause is unclear.
        • Starting to add big users to the backup. 550T of S3 storage used now, 17.6k users enabled, 2.8k pending. [grafana]
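
The enrolment figures above give a rough coverage ratio. A minimal sketch, assuming "k" means thousands and that enabled and pending users are disjoint groups (the variable names are hypothetical):

```python
# User counts quoted in the minutes, rounded as reported.
enabled_users = 17_600   # "17.6K users enabled"
pending_users = 2_800    # "2.8k pending"

# Assumption: enabled and pending users do not overlap.
total_users = enabled_users + pending_users
enabled_fraction = enabled_users / total_users

print(f"{total_users} users in scope, {enabled_fraction:.1%} enabled")
```

Because the inputs are rounded, the resulting percentage is only approximate.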
      • HPC 5m
        Speaker: Dan van der Ster (CERN)
      • Kopano 5m
        Speaker: Dan van der Ster (CERN)
      • Upstream News 5m
        Speaker: Dan van der Ster (CERN)
    • 14:45 14:55
      S3 10m
      Speakers: Julien Collet (CERN), Roberto Valverde Cameselle (CERN)
    • 14:55 15:05
      Filer/CephFS 10m
      Speakers: Dan van der Ster (CERN), Theofilos Mouratidis (CERN)
    • 15:05 15:10
      AOB 5m


      • Ceph SUG Call (https://pad.ceph.com/p/Ceph_Science_User_Group_20200122), in brief:
        • Globally, Nautilus feedback is really good (at least for those running 14.2.6)
        • Many users plan to upgrade straight from Luminous to Nautilus (à la cephgabe); our friends from SKA plan to do so soon on their RGW cluster.
        • (recording to be released soon)