Speaker
Stefano Dal Pra
(INFN)
Description
On November 9, 2017, a major flood hit the computing rooms, causing a prolonged outage of all services.
In this talk we will go through the issues we faced in restoring the services as quickly and efficiently as possible. We will analyze the incident in detail, along with all the steps taken to recover the computing rooms, electrical power, network, storage and farming.
We will also discuss the hidden dependencies among services that we discovered while recovering the systems, and explain how we resolved them.
Desired length
15