Danilo Dongiovanni (INFN-CNAF, IGI)
In production Grid infrastructures that deploy the EMI (European Middleware Initiative) middleware release, the Workload Management System (WMS) is the service responsible for distributing user tasks to remote computing resources. Monitoring the reliability of this service, the job lifecycle, and the workflow patterns generated by different user communities is an important and challenging activity. Initially designed to monitor and manage a distributed cluster of gLite WMS/LB (Logging and Bookkeeping) services, WMSMonitor has proved to be a useful and flexible tool for a variety of user categories. After asynchronously extracting information from all monitored instances, WMSMonitor re-aggregates it by different keys (WMS instance, Virtual Organization, user, etc.), giving service administrators, developers, advanced Grid users, and performance testers insight into both the status of the services and their usage. Positive feedback on WMSMonitor from various production Grid sites prompted us to improve the tool's flexibility and scalability by adopting a new architecture. The tool has also been made compliant with recent evolutions of the monitored services. We therefore present the new version of WMSMonitor, which can monitor EMI WMS/LB services and offers an improved user interface with better reporting capabilities. Among the main novelties are the collection of Job Submission Service (JSS) error type statistics and the adoption of the ActiveMQ messaging system, which now allows multiple data consumers to exploit the collected information. Finally, the implemented architecture and the use of a messaging layer commonly adopted in EMI Grid applications make WMSMonitor a flexible tool that can be easily extended to monitor other Grid services.
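As an illustration of the re-aggregation by key described in the abstract, the following minimal Python sketch counts job statuses grouped by an arbitrary key. The record fields (`wms`, `vo`, `user`, `status`), the sample data, and the function `aggregate_by` are hypothetical and are not taken from WMSMonitor's actual data model or code.

```python
from collections import Counter, defaultdict

# Hypothetical job records, as they might look once collected
# from several WMS instances (field names are assumptions).
JOBS = [
    {"wms": "wms01", "vo": "atlas", "user": "alice", "status": "Done"},
    {"wms": "wms01", "vo": "cms", "user": "bob", "status": "Aborted"},
    {"wms": "wms02", "vo": "atlas", "user": "alice", "status": "Done"},
    {"wms": "wms02", "vo": "atlas", "user": "carol", "status": "Running"},
]


def aggregate_by(records, key):
    """Count job statuses per value of `key` (e.g. "wms", "vo" or "user")."""
    summary = defaultdict(Counter)
    for record in records:
        summary[record[key]][record["status"]] += 1
    # Convert to plain dicts so the result is easy to print or serialize.
    return {group: dict(counts) for group, counts in summary.items()}


if __name__ == "__main__":
    # The same records viewed per WMS instance, per VO, and per user.
    for key in ("wms", "vo", "user"):
        print(key, aggregate_by(JOBS, key))
```

The same set of records yields a per-instance, per-VO, or per-user view simply by changing the grouping key, which is the idea behind serving different user categories from one collected dataset.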