TCG Meeting

Europe/Zurich
600/R-002 (CERN)

Erwin Laure
Description
A phone conference has been set up:
To dial in to the conference call:
Call: +41227676000 (Main)
Enter access code: 0170047.

To join the web conference, click here: https://audioconf.cern.ch/call/0170047.
TCG Meeting Wed 27 June 2007

Attending: Markus Schulz (SA3/Chair), Ian Bird (SA1), John White (JRA1/security), David Smith (secretary), Claudio Grandi (JRA1), Jeff Templon (site rep), Daniele Cesini (site rep), Stefano Belforte (CMS), Cal Loomis (NA4), Andrea Sciaba (CMS)

Agenda: http://indico.cern.ch/conferenceDisplay.py?confId=16644

[These minutes only reflect the first hour of the meeting; the latter part was not minuted]

++ Summary of new actions or decisions

Testing of BLAH parameter passing was considered to have failed: the plugin support for passing the parameter to the batch system does not yet exist in all cases.

The job priorities working group is currently planning to have the deny tag honored by the WMS on a time scale of 2 months.

EMT should take on a standing agenda item to review whether components' dependencies meet the gLite restructuring criteria yet.

Decision: All further work on WMS should go into SL4/VDT1.6 (not SL3). This may impact the time scale of the availability of a WMS that honors the job priorities WG's deny tags.

++ Description of meeting

** Activity on previous actions

Relating to 5062 'Check whether INFN is still responsible for LSF plugin and info provider': Chair/SA3 said that he had resent the execution plan (concerning LSF support), but did not get any response from INFN. The contact would have been Elisabetta Molinari, but she is busy working on the WMS. So it will need to be discussed again who will be doing the LSF work. It will not be done at CERN.

Task 5061 'Check APEL status on ETICS': Security rep said that it is now building on ETICS without problems; the previous trouble with the Java compiler has been solved.

Task 5060 'Explain glexec to sites': Security rep explained that Alessandra has posted an email. Oscar and Gerben will add text as well, and eventually it will be included in the docs. See text of task 5060 for details.

Task 4956 'Check timescale for common authorization interface': Security rep said that it was promised by Globus for July. Some people at Fermilab could check with Globus to see if this can indeed happen. Oscar has started his work already but needs the alpha version of the library.

Task 4954 'Define batch systems information providers process and responsibilities': Still to do. Chair/SA3 would do it as part of the milestone document that SA3 has to produce as part of the updated release documentation.

Task 4953 'Inform about progress on ROOT problem when accessing DPM data': Sophie will add information; the problem was related to multiple versions of the RFIO libraries. An approach has been written down by Jean-Philippe and Flavia, and there is now an agreement to handle the problem using that approach. Paths have to be set correctly; not exactly a nice solution, but it should be workable. It will still be quite some time before there is a common RFIO library.

Task 4937 'Check the status of VOMS without globus': Close.

Task 4217 'Extension of the short deadline jobs working group': NA4 said it is ongoing.

Task 3895 'Testing of a BLAH that can pass parameters to the batch system': Alessandra has sent an update. She reports that she needs to write some scripts. SA1 was surprised that she needed to write something in order to test the functionality. There was then some extended discussion:

SA1 stated that the supported batch systems are LSF, Torque, Condor and Sun Grid Engine.

JRA1 had a couple of points. First, Alessandra is part of the group providing the BLAH parameter passing; she is not only a tester. Second, it was intended that BLAH would provide the mechanism to pass the parameters, but whether that is actually possible depends on whether the plugin supports it.

Site rep reiterated that it was important to get parameters passed; in particular, the max wall-clock time requested by the job should be passed to the batch system.

SA1 suggested that formally it should be recorded that the tests failed, as the functionality does not exist at this time.

SA3 made the point that we should be sure not to state that BLAH is the only interface interacting with the batch system; the information providers also interact with the batch system and do not use BLAH. JRA1 noted that he had unintentionally misstated this during one presentation.

SA3 requested a detailed description of the gLite CE as currently implemented, and suggested a slot in the next TCG.

Site rep asked if somebody (perhaps Laurence) could evaluate the existing BLAH plugins, to see how far they are from doing what is needed (but not to actually do any development).

ACTION: To SA3 - check the state of the existing BLAH plugins to see if further development is required. [Subsequently task 5217 was created, which covers this]

** News from the "Job Priorities Workgroup"

Andrea (CMS) was at the meeting. He reported that the very short term solution was clear and was not discussed further. The short term (the next 2 months) was discussed: to have the deny tags working. Dietrich has sent around the minutes of the meeting. The WMS is expected to be ready to interpret the deny tags, not in the checkpoint release but in the one after that. There is also something to fix in the generic info provider, related to the order in which the access control rules are published in the information system. The experiments were asked to clarify their use cases. There was no discussion about the medium term; that should be discussed in the next meeting on Friday.

** gLite restructuring, organizing the "dependency challenge"

Chair/SA3: We have reached the stage where developers should look at their components. Chair/SA3 has sent around the first draft of acceptable dependencies. That is only a draft (prepared by Joachim).

SA3 is able to look at the existing dependencies and see what a component depends on, or can go the other way and check who is using a dependency.

Chair/SA3: The main problem is how to organise the teams. It could be people from JRA1 looking at other people's components. The teams would go through the list of dependencies, investigate whether they fulfill the criteria, and discuss with developers what could be done to remove dependencies.

JRA1: Will discuss outside the TCG exactly how to organize and start that.

Chair/SA3: Would like JRA1 to distribute at least a subset of the criteria to the developers. This can be followed up in the EMT, and should become one of the fixed agenda points during the EMT.

** WMS and glite-ce

Chair/SA3: The WMS checkpoint patch 1167 is being rolled out at CERN now, to replace the production RBs. In parallel we have started work on 1203. We will give something to Imperial College for a first test, but the real checking will only be done on SL4, VDT 1.6 builds in ETICS. Experience shows that it takes at least 2 months to get a checkpoint release through the whole machinery. So it means we have to do the cut now and start looking at VDT 1.6 on SL4. Nobody has tried that so far. The build system reports that it builds, but it has never been installed. Would suggest that the developers move to that platform. Moving to something that still uses the legacy platform is dangerous, given that it will not be in production for at least 2 months. So now is the time for the developers to move to the new platform.

There was some discussion on whether 1203 should be released for SL3 or SL4: SA1 and SA3 said SL4. JRA1 was requested to consider that 1203 will need to be certified for SL4 only.

Decision: All further work on WMS should go into SL4/VDT1.6 (not SL3).

CMS: We should let ATLAS know that 1203 (with the deny patch for job priorities) will therefore probably not be ready within 2 months.

JRA1: I'll ask the developers to retest 1203 on SL4. Will retire 1203, but will deliver something for SL4.

[... Remainder of today's discussion not recorded ...]