Showing problems since Saturday -- 2/6 publishers still work
Might be a lease database problem on the gateway
Fixed at around 14:30 -- Root cause is still under investigation
Ceph Operations Reports (20m)
Teo (cta, kelly) (5m)
Production CTA "object store" successfully migrated from cephkelly to cephcta.
This involved a data migration -- the original plan was to use rados export | rados import, but there was no way to limit that to a few namespaces, so Teo wrote a simple custom export/import tool for CTA objects.
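The namespace-limited copy can be approximated with plain rados commands (a sketch only; pool and namespace names below are placeholders, and the real CTA tool also handles details this loop does not):

```shell
# Sketch: copy every object in one namespace between pools.
# SRC_POOL / DST_POOL / NS are placeholders, not the real CTA names.
SRC_POOL=cta-objects-old
DST_POOL=cta-objects-new
NS=cta

rados -p "$SRC_POOL" --namespace "$NS" ls | while read -r obj; do
    # "-" means stdout/stdin in rados get/put, so each object is streamed
    rados -p "$SRC_POOL" --namespace "$NS" get "$obj" - |
        rados -p "$DST_POOL" --namespace "$NS" put "$obj" -
done
```

Repeating the loop per namespace gives the "few namespaces" selectivity that a whole-pool rados export/import lacks.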
There is still some dev/ci activity on cephkelly -- it will be migrated to dwight.
Enrico (barn, beesly, gabe, meredith, nethub, ryan, vault) (5m)
Would be nice to confirm impact on AFS -- haven't heard back yet
New capacity in HA08 installed, public IPs, filled with data
Same trick of splitting the rack into two CRUSH racks (otherwise the bucket weight would be too high)
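The rack split amounts to creating two logical CRUSH racks for one physical rack and dividing the hosts between them (a sketch; bucket and host names here are hypothetical, not the real HA08 ones):

```shell
# Create two logical racks standing in for the one physical rack
ceph osd crush add-bucket HA08-a rack
ceph osd crush add-bucket HA08-b rack
ceph osd crush move HA08-a root=default
ceph osd crush move HA08-b root=default

# Place half the hosts in each logical rack (hostnames hypothetical)
ceph osd crush move cephdata-ha08-01 rack=HA08-a
ceph osd crush move cephdata-ha08-02 rack=HA08-b
```

Each logical rack then carries roughly half the weight, keeping the failure-domain buckets comparable to the existing racks.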
Cluster is now ~60% full (rgw.buckets.data)
(Re-)starting draining of HA07 and HA06
Should fit into new HW -- tbc
Draining will continue slowly over Christmas, aiming to replace the HW at the very beginning of 2022
SELinux issue fixed by reinstalling pcp-selinux (and rebooting)
Dan (dwight, flax, kopano, jim, upstream) (5m)
Dan van der Ster
dwight/flax/kopano/jim : no upgrades planned until new year.
flax: 4 new physical MDSs added. LinuxSoft was reporting slowness after the switch failure 2 weeks ago. Rather than failing over rank 0 to a new MDS, I activated a 4th rank and pinned the LinuxSoft shares to it.
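Activating an extra rank and pinning a directory tree to it looks roughly like this (a sketch; the mount path is illustrative, not the actual LinuxSoft path):

```shell
# Raise the number of active MDS ranks on the flax filesystem to 4
# (ranks are numbered 0-3, so this activates rank 3)
ceph fs set flax max_mds 4

# Pin the LinuxSoft directory tree to the new rank via an xattr on the
# mounted CephFS path (path is a placeholder)
setfattr -n ceph.dir.pin -v 3 /cephfs/linuxsoft
```

Pinning moves that subtree's metadata load onto the new rank without disturbing rank 0, which is why it was preferred over a failover.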
New dashboard to view a particular tenant's activity
More extensive tenant information
Top n or avg or both?
It's a bit slow
Making sense of the data
What would be useful to display
Verify data accuracy
EOS CephFS Test (5m)
Roberto Valverde Cameselle
Still working on rbd-mirror
Discussion with the network team
Need to have some servers (rgw+route server or reflectors) that they can plug into their lab
8 servers or VMs in total to fully test that
Will prepare the needed configuration in the meantime; once the servers are ready, we will ping them back to test a pilot
PreCC proof-of-concept in 2022 -- documents shared. Tim estimated 5d to deploy the clusters and 15d to set up multi-region (not including any R&D work, not including maglev, and not any subsequent testing -- just the "setup" time).
Total is 100 person-days to setup the Indico DR/BC PoC.