Dr Alessandra Doria (INFN Napoli)
The large storage and computing capacity available in modern grid and data centre infrastructures enables the next generation of grid-based computing, in which a large number of clusters are interconnected through high-speed networks. Each cluster is composed of anywhere from several to hundreds of computers and devices, each with its own specific role in the grid. In such a distributed environment, it is critical to keep the data centre functioning. It is therefore essential to have a management and fault recovery system that preserves the integrity of the systems both in the presence of serious faults, such as power outages or temperature peaks, and during maintenance operations. In this context, for the ATLAS INFN Napoli Tier2 and for the SCoPE project of the University “Federico II” of Napoli, we developed Powerfarm, a customizable thread-based software system that monitors several parameters, such as the status of power supplies and room and CPU temperatures, and promptly responds to out-of-range values with the appropriate actions. Powerfarm enforces hardware and software dependencies between devices and is able to switch them on and off in the order induced by those dependencies. Powerfarm uses parametric plugins to manage virtually any kind of device and represents the whole structure by means of XML configuration files. From this perspective, Powerfarm may become an indispensable tool for power and emergency management of modern grid and data centre infrastructures.
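The dependency-ordered switching described above can be illustrated with a topological sort. The following is only a minimal sketch and not Powerfarm's actual code: the device names, the `DEPENDS_ON` map, the 35 °C limit and all function names are hypothetical, and the real system reads its structure from XML configuration files rather than from a Python dictionary.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each device lists the devices
# that must already be running before it can be switched on.
DEPENDS_ON = {
    "network-switch": set(),
    "storage-server": {"network-switch"},
    "worker-node": {"network-switch", "storage-server"},
}


def power_on_order(depends_on):
    """Return the devices so that each one follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def power_off_order(depends_on):
    """Reverse the power-on order: dependents go down before what they rely on."""
    return list(reversed(power_on_order(depends_on)))


def shutdown_plan(room_temp_c, max_temp_c, depends_on):
    """Return the devices to switch off, in order, if the reading is out of range.

    An empty list means the reading is within range and nothing is done.
    """
    if room_temp_c > max_temp_c:
        return power_off_order(depends_on)
    return []


if __name__ == "__main__":
    print("power on :", power_on_order(DEPENDS_ON))
    print("emergency:", shutdown_plan(41.0, 35.0, DEPENDS_ON))
```

In this sketch an out-of-range temperature yields a shutdown sequence in which worker nodes are stopped before the storage and network devices they depend on, while a normal reading yields no action.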
Presentation type (oral | poster): poster