Dr Andrew Stephen McGough (Imperial College London)
The Grid, as an environment for large-scale job execution, is now moving beyond the prototyping phase to real deployments on national and international scales, providing real computational cycles to application scientists. As the Grid moves into production, characterizing how users exploit the resources and how the resources cope with production load is essential to understanding how the Grid is working and how it should be modified to better meet the needs of its growing user communities. Characteristics such as user submission patterns, average job execution times for different communities (organized into Virtual Organizations, or VOs), the number of active members within a VO, and how the Grid infrastructure copes with this load at both the resource and middleware levels are vital for judging which critical bottlenecks must be overcome to ensure continued success. Understanding these characteristics requires a full analysis of the Grid. Through related work with the EGEE Real Time Monitor (RTM) we have been able to collect trace logs for over 52 million job executions since September 2005. The RTM provides a near-real-time graphical view of the status of the EGEE Grid, which requires privileged access to the databases within EGEE. By recording this information and post-processing it, we are able to determine the life cycle of each job submitted through EGEE. In this paper we analyze these trace logs to determine these characteristics.