Speaker
Mr Martin Bly (STFC/RAL)
Description
In November 2012 the RAL Tier1 suffered two serious power failure incidents within a period of two weeks. The first was a general whole-site failure, exacerbated by the loss of the UPS power supply. The second was a power surge, caused during work on the UPS supply feed, which damaged significant amounts of equipment. In both cases the Tier1 facility and other Scientific Computing facilities were off-line for extended periods. This presentation runs through the incidents and the lessons learned while restoring the services.
Primary author
Mr Martin Bly (STFC/RAL)