System Analysis Working Group

Julia Andreeva (CERN)
Discuss the mandate and organization of work
Attendees: Benjamin, Diana, Iosif, Roberto, Ian, Gianluca, Julia, Dietrich and Stefano (via phone)

Julia presented an overview of the goals of the group and suggestions for the organization of work.

Roberto pointed out that although experiments such as LHCb and Alice have centrally organized workload management systems and good application monitoring coupled with their submission tools, like Dirac, they are still interested in the VO view of the Grid infrastructure, in particular in Grid-related failures.

Discussion:

Julia asked what the first priority item to review should be. Diana and Roberto suggested, with the support of the others, starting with an overview of how experiments currently deal with application monitoring, which problems they encounter, and how they communicate these problems to the site admins. It was agreed that at the next meeting we will have presentations from LHCb (Roberto) and ATLAS (Benjamin), and at the following one from Alice and CMS.

There was a discussion of how the expertise often encapsulated inside a VO (Diana) and human experience in solving problems (Iosif) can be used to improve support of the infrastructure. Diana pointed out that people working on Grid support do not have much knowledge of VO internals. The question was raised whether we can define how information about problems would be propagated back to the sites, to the people who have to take action. Ian suggested that this be done the same way as is planned for publishing grid services status information, i.e. in the local fabric monitoring systems.

Julia mentioned that a monitoring tool should not only indicate a problem but also provide a possible troubleshooting recipe, and that collecting and finding this information is not an easy task. Diana pointed to the troubleshooting links provided by Goc.

The next meeting is planned for next week. The time for regular meetings still needs to be defined; Julia will ask people about their availability.
