EOS is a powerful and flexible storage system, but setting up a new instance from scratch requires a solid understanding of its configuration and operational best practices. This talk will provide a step-by-step guide to deploying EOS, covering key components and essential configurations.
We will walk through the setup process, including storage provisioning, replication, erasure coding,...
Ensuring the availability of EOS instances is crucial for large-scale storage operations. To improve monitoring and incident response, we have developed a new distributed probe that detects instance malfunctions and alerts operators in real time.
This talk will introduce the architecture and functionality of the probe, which runs across multiple nodes to provide redundancy and...
When an EOS MGM is stuck or unresponsive, some simple diagnostic information can go a long way. We look at a new eos-diagnostic-tool that dumps stack traces and similar diagnostic information, so that useful bug reports can be submitted. We also invite discussion on how to improve the tooling in the future.
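
To make the idea of dumping stack traces from a hung daemon concrete, here is a minimal sketch in Python. It is not the eos-diagnostic-tool and does not reflect its interface; it only shows the generic technique of attaching gdb in batch mode to a running process and printing a backtrace for every thread. The function names `build_stack_dump_command` and `dump_stacks` are hypothetical, and the sketch assumes that gdb is installed, that the operator already knows the PID of the stuck process, and that the caller has permission to attach to it.

```python
import shutil
import subprocess
from typing import List, Optional


def build_stack_dump_command(pid: int) -> List[str]:
    """Return a gdb command line that prints a backtrace for every thread.

    Hypothetical helper for illustration; not part of any EOS tooling.
    """
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    # -p attaches to the running process, -batch exits after the commands,
    # and "thread apply all bt" prints one backtrace per thread.
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_stacks(pid: int, timeout_s: float = 60.0) -> Optional[str]:
    """Attach gdb to the process and return its output.

    Returns None when gdb is not installed. A subprocess.TimeoutExpired
    exception propagates if gdb does not finish within timeout_s seconds.
    """
    if shutil.which("gdb") is None:
        return None
    result = subprocess.run(
        build_stack_dump_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=False,  # a failed attach still yields output worth keeping
    )
    return result.stdout
```

Saving the returned text to a file and attaching it to the bug report gives developers the per-thread state at the moment of the hang. Note that attaching a debugger pauses the target process while the backtraces are collected.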