We address the integration of monitoring applications by building both a portal integration (PI) and an enterprise information integration (EII).
The work followed this path: (i) identification of the functionalities that need to be monitored, along with their characteristics; (ii) identification of commercial or self-made software products already available, if any; (iii) identification of the available open-source software; and (iv) development of dedicated plug-ins and Java web applications.
The work was motivated by the need for a dedicated troubleshooting and monitoring system for the newly built Data Center (opened in February 2009). We needed a robust, flexible and easily adaptable system, without the complexity of the rarely used functionalities often present in commercial systems. The job soon evolved into the construction of a true portal system for the Data Center, which will soon become the single point of entry for all kinds of access: novice users, expert users, the management team, and the referee group, which plays a role in future funding.
The system we built is an integrated one that gives most users a single way of accessing the grid Data Center. For the management team in particular, it allows a thorough integration of all monitoring subsystems and can therefore easily accommodate new hardware and software tools, following the evolution of the Data Center. For first-level monitoring, our approach guarantees a simplified view of the infrastructure, so that operator shifts do not require highly qualified personnel. We developed our own modules for much of the equipment, but we also integrated existing applications such as GRIDICE, GANGLIA and POWER FARM. Such an integration strategy is, in our opinion, very useful in a situation like that of EGEE, in which no single tool is enough to manage every aspect of the Data Center exhaustively.
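The integration idea described above can be illustrated with a minimal Java sketch. It is not the portal's actual code: the names `MonitoringPortal`, `MonitorPlugin`, `Status` and the plug-in labels are hypothetical and were chosen only for illustration. The sketch assumes that each monitoring subsystem, whether self-made or a wrapper around an existing tool, is exposed through one common interface, and that the first-level view simply reports the worst status among all registered plug-ins.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MonitoringPortal {

    /** Severity levels, declared from best to worst so that enum order can be compared. */
    public enum Status { OK, WARNING, CRITICAL }

    /** Hypothetical contract that every monitoring module (own or wrapped) would implement. */
    public interface MonitorPlugin {
        String name();
        Status poll();
    }

    private final List<MonitorPlugin> plugins = new ArrayList<>();

    /** Adding new hardware or software support means registering one more plug-in. */
    public void register(MonitorPlugin plugin) {
        plugins.add(plugin);
    }

    /** Detailed view: one status per subsystem, in registration order. */
    public Map<String, Status> details() {
        Map<String, Status> result = new LinkedHashMap<>();
        for (MonitorPlugin plugin : plugins) {
            result.put(plugin.name(), plugin.poll());
        }
        return result;
    }

    /** Simplified first-level view: the worst status reported by any plug-in. */
    public Status overall() {
        Status worst = Status.OK;
        for (Status status : details().values()) {
            if (status.compareTo(worst) > 0) {
                worst = status;
            }
        }
        return worst;
    }

    /** Illustrative stand-in for a real plug-in: always reports the given status. */
    public static MonitorPlugin fixed(final String name, final Status status) {
        return new MonitorPlugin() {
            @Override public String name() { return name; }
            @Override public Status poll() { return status; }
        };
    }

    public static void main(String[] args) {
        MonitoringPortal portal = new MonitoringPortal();
        portal.register(fixed("ganglia-adapter", Status.OK));      // hypothetical wrapper
        portal.register(fixed("cooling-module", Status.WARNING));  // hypothetical own module
        System.out.println("Details: " + portal.details());
        System.out.println("Overall: " + portal.overall());
    }
}
```

With this kind of design, an operator on shift would only need to look at the single overall status, while the management team could still drill down into the per-subsystem details.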
Conclusions and Future Work
More modules and functionalities are being implemented, and existing ones that were initially integrated off-the-shelf are being optimized, e.g. by using Struts as a development framework for Java web applications. As an example, we are actively working on a tool for automatic emergency shutdown and restart, and for programmed shutdown and restart during maintenance downtimes, on a per-rack basis.
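To make the per-rack idea concrete, the following is a rough Java sketch of how a shutdown and restart plan might be represented. The tool itself is still under development, so everything here is an assumption: the class name `RackPowerPlan`, the rack labels, and in particular the policy of restarting racks in the reverse of the order in which they were shut down.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class RackPowerPlan {

    // Racks listed in the order in which they should be powered off.
    private final List<String> shutdownOrder;

    public RackPowerPlan(List<String> shutdownOrder) {
        // Defensive copy so later changes to the caller's list do not alter the plan.
        this.shutdownOrder = new ArrayList<>(shutdownOrder);
    }

    /** Order in which racks are shut down (emergency or programmed). */
    public List<String> shutdownSequence() {
        return Collections.unmodifiableList(shutdownOrder);
    }

    /** Assumed policy: racks come back up in the reverse of the shutdown order. */
    public List<String> restartSequence() {
        List<String> reversed = new ArrayList<>(shutdownOrder);
        Collections.reverse(reversed);
        return reversed;
    }

    public static void main(String[] args) {
        // Hypothetical rack labels: compute racks go down first, storage last.
        RackPowerPlan plan = new RackPowerPlan(
                Arrays.asList("rack-compute-01", "rack-compute-02", "rack-storage-01"));
        System.out.println("Shutdown: " + plan.shutdownSequence());
        System.out.println("Restart:  " + plan.restartSequence());
    }
}
```

A real tool would also have to issue the actual power commands and verify that each rack reached the expected state before moving on to the next one; the sketch covers only the ordering.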
|URL for further information||www.scope.unina.it|
|Keywords||monitoring - grid - data center|