TMB Meeting

Europe/Zurich
600/R-002 (CERN)

Steven Newhouse
Description
Dial-in numbers: +41227676000 (English, Main)
Access codes: 0173115 (Leader), 0183088 (Participant)
Leader site: https://audioconf.cern.ch/call/0173115
Participant site: https://audioconf.cern.ch/call/0183088
Task List: https://savannah.cern.ch/task/?group=tcg
TMB - 2009-08-05

Present: Steven Newhouse, Antonio Retico, Oliver Keeble, Andrea Sciaba', Francesco Giacomini, Frank Harris, Massimo Lamanna
Phone: Vangelis Floros, Andrei Tsaregorodtsev, Guenter Grein, Dennis Van Dok

Minutes of last meeting
=======================
No comments. Approved.

Task Review
===========
Task #9916 (Problem with auto-publishing of space reservation into Info System)
Oliver: a solution is available; it will be included in an upcoming DPM patch.

Task #9915 (More details needed on firewall configuration issue)
The last comment mentions a document with a list of open ports. Vangelis will pass this reference to Gergely and see if it satisfies his needs.

Task #8953 (Clarify the restriction mechanism for multiple service instances to individual VOs)
A clarification is needed from someone in the AuthZ Service group. Francesco will contact Christoph and/or Chad.

Task #8326 (Error codes for the command line interfaces)
Task #7932 (Lack of APIs for various middleware services/components)
Task #6711 (Discuss standards for error messages)
Task #6712 (Handling of bugs regarding error messages)
Francesco will provide a statement on all these tasks, based on the work planned for the second year of EGEE-III.

Task #7938 (Portal access to the infrastructure)
Task #6901 (Storage semantics issues for EGEE (beyond HEP))
Vangelis has posted a comment with use cases.

Task #6652 (Check if the WNRWG will make recommendations about how to characterise a subcluster)
An info provider is still needed.
Steven: can Laurence do it?
Oliver: not necessarily, info providers are maintained by several people; but in general there is no documentation on what is mandatory and should be enforced.

Task #6649 (Local scratch space and shared storage in Glue)
Oliver will ask Laurence to have a look at it.

Task #5952 (Python bindings of the LCG UI)
Oliver to check the status.

MPI - the next phase
====================
Oliver: a patch (https://savannah.cern.ch/patch/?3092) has been opened to which all the outstanding MPI-related bugs will be attached. The bugs mainly concern configuration issues. The fixes are estimated to take ~2 months, plus certification; a partner for the certification has already been identified. The MPI libraries and utilities will be rebuilt and made available.
Dennis: it is also important to be able to pass additional parameters down to the batch system and to specify additional attributes in the JDL.
Frank mentions the documents circulated by Vangelis: the comments from NA4 partners on the proposal by the MPI WG, and the experience of using MPI on the current EGEE infrastructure.
Vangelis: what are the priorities?
Steven: first the fixes mentioned by Oliver, then prototyping the proposed changes to the JDL.
Francesco: what about publishing the interconnect type of a site? Who does that, and when?
Dennis will check if this specification already exists.
Francesco, Oliver, Frank, Vangelis and other interested people from NA4 should be included in the MPI mailing list.
Steven: what about the SAM tests?
Oliver: they need to be revived; we then enable them once the MPI fixes are available.
Antonio: after the changes are available we should start a pilot service involving the interested VOs; we could then tune both the installation and the SAM tests.
Steven: there is a two-hour session on MPI at EGEE'09, on Tuesday morning; it will cover updates on the MPI WG, on the patch, on which sites to involve in the pilot, and on how things are evolving from the applications point of view, e.g. for JDL changes. Relevant people from JRA1 and SA3 should be there.
Oliver: it would be useful to have a presentation on how to enable MPI on a site, but this may well go into another SA1 session, where sites are better represented.
Steven: if communities find problems they should raise GGUS tickets, e.g. for wrongly published tags.

Bug Classification
==================
Integrating GGUS and Savannah
=============================
Francesco presents the key points of the proposal on "Problem Management and Change Management in gLite", i.e. how to handle "bugs".

There is agreement to apply immediately the parts concerning the classification based on Severity and Priority, with their consequences in terms of release management.

Further discussion is needed on the following points:
- Should submission to Savannah be restricted to gLite people only? This would prevent people without a GGUS account from interacting with gLite maintainers and is not considered a good move for the moment. On the other hand, if problems found by users in production are not all registered in GGUS, it becomes very difficult to compute meaningful user-oriented metrics, which are strongly requested by the project reviewers. For the moment everybody is still allowed to submit directly to Savannah, but a submitter who is a user is requested to also submit a GGUS ticket, to be linked with the Savannah bug. Francesco and Antonio to provide an estimate of the bugs submitted by users directly into Savannah.
- For the high-priority fixes, consider their impact on the staged rollout process.
- There is no user representative in the EMT, which is the body that should decide on the priority of changes. For the moment this does not seem to be a big problem, because bug submitters are usually aware of the impact of a bug on the affected users. Moreover, SA1 is present at the meeting and to some extent can provide an infrastructure (i.e. user) perspective.
- What is the relation with GSVG, which manages security vulnerabilities? The proposal will be extended to cover that as well.
- The priority of a patch should be related to the priority of the attached bugs.
- The proposal should cover rollback of unsuccessful changes.
- The proposal does not currently cover the interaction between GGUS and Savannah.
There is agreement that the way the state of a GGUS ticket changes in response to state changes of the corresponding Savannah bug needs to be reviewed. In particular, a GGUS ticket cannot be closed until the Savannah bug has been closed.

AOB
===
Francesco: what support is to be provided for services/components on SLC4? Three options:
1. SLC4 is not supported any more;
2. SLC4 is in pure maintenance mode, only critical fixes are applied;
3. SLC4 is fully supported.
Each option has certain practical consequences. Francesco and Oliver to prepare a proposal.

    • 10:30 11:00
      Approval of Minutes and Task Review 30m
    • 11:00 11:30
      MPI - the next phase 30m
      Building on the technical work of the MPI-WG - what are the user and site issues that might still face us?
    • 11:30 12:00
      Bug Classification 30m
      For this TMB I would like to put on the agenda an item concerning the way bugs are managed, from classification (critical, major, ...) to the release of the corresponding fixes.
      Speaker: Francesco
    • 12:00 12:20
      Integrating GGUS and Savannah 20m
      A recommendation from the EGEE review was better integration between JRA1, SA3 and SA1 for issue tracking, so that better user-oriented metrics could be generated.
      Speaker: Oliver, Francesco, Maite
    • 12:20 12:35
      AOB 15m