Speaker
Description
At Safespring, we are currently building a new version of our Ceph service. In this session, I will give an overview of the architecture of the load balancer for our S3 service, which we are building using open source tools. In our design, we use the internet as a design pattern for the datacenter cluster network to ensure scalability and predictable performance. That means we use BGP everywhere, even for the last hop to each server node. In the talk, I will outline how we use BGP with per-flow ECMP routing to achieve redundancy and high throughput for the frontend services, BFD for failure detection, Traefik as the load balancer, and Prometheus and Grafana for monitoring.
This is a work in progress, but I will give an overview of our design decisions and our experiences so far.