DTEAM meeting Wednesday 23rd March 2005
=======================================

Chair: Jeremy Coles
Minutes: Jeremy Coles
Present: Fraser Spiers, Steve Traylen, Stephen Burke, Owen Maroney (+ colleague), Pete Gronbech, Jeremy Coles

Joined by Paul Kyberd for first discussion.

Brunel issues
=============

Brunel currently have a problem: their boundary routers are set to block port 7777 (used by the DTEAM RLS). Although Brunel network security may allow specific machines to use this port, the system administrators would like a discussion of the issue within EGEE. PK asked if there was a mechanism to raise these issues. JC replied that they could be raised, but discussion at this meeting suggested that many well-known ports can be exploited, not just 7777 (perhaps used because it is easy to type and remember). The DTEAM thought that nothing was likely to change in EGEE on the basis of one site having an issue with it. Nevertheless it was agreed that sites need valid and considered responses to take back to site security officers.

Action: PK to send JC a report/request on the problems with port 7777 and reasons it should be changed. JC will then ask for a response from the EGEE security group led by Ian Neilson.

Review of status
================

ScotGrid: Edinburgh & Durham have upgraded to SL3. Glasgow pool accounts are being done now.
NorthGrid: Not represented.
SouthGrid: All but Oxford at SL3. Oxford will move with 2.4.0. Birmingham and Bristol installing Ganglia 3.
London Tier-2: UCL-CCC upgraded to SL3. Imperial – preparing. RHUL – waiting for 2.4.0. Brunel SL upgrade happening now but have the port issue mentioned above. QMUL installing WN (Fedora) with 2.2.0; SNs (SL3) with 2.3.0 (SL).

ST was interested in any experiences with Ganglia 3. He suggested that he may provide a central collection point with Ganglia 3 and migrate the Tier-1 soon after, based on experiences.

SB mentioned that he heard at an ATLAS meeting that 2.4.0 may be delayed by 1 week.
Action: JC to follow up on 1st April release by requesting PMB to state GridPP position clearly - we have sites scheduling manpower for this release and if it is missed we face hostility!

Testzone
========

Fraser gave an update on the testzone components:

RGMA – something will be in place by the end of this week.
BDII/RB – set up at Imperial.
Replica catalogues – nearly installed. SWE and Prague recently installed and published to rollout and may be able to advise Graeme.
VOMS – giving an error but working.

It was asked who should be doing the administration for the GridPP VO. JC said that Manchester is funded to 0.5 FTE for this work.

Action: JC to review VOMS work with Andrew McNab and Alessandra Forti.

OM asked how we get hold of a beta release of 2.4.0 to run on testbed machines. As nobody knew, JC agreed to find out.

Action: JC to find out how to gain access to 2.4.0 beta release.

Security
========

The recently published security challenges document was reviewed. The general feeling was that it needed more content. Some of the suggested tasks may be very difficult as not all required information is audited or kept. PG asked about documents for sysadmins. *Where are* all the different log files that are needed in respect of these proposed tests? SB suggested the Gatekeeper logs are the most essential, but having some well-defined "use cases" was agreed to be a useful suggestion to feed back to Ian Neilson. NB: the security policy only tells you about files to keep, not how to use them in the event of an incident.

ST reminded everyone that there are security experts in the community who would be available to help trace events at sites if required.

It was noted that the RB problem has been fixed, but not all reinstallations are using the fix or have been tested for correct configuration.

SB mentioned several new areas of concern regarding security - WLM issues, account recycling ... and asked where these should be sent.
ST mentioned a mail from Linda Cornwall that suggested she was compiling a database of vulnerabilities - the question is who will prioritise and address these issues.

Action: JC to seek clarification with DPK and Romain Wartel at a meeting tomorrow where the security area of GridPP (roles and responsibilities) will be discussed. Ideally the security officer would take control of new concerns but this person is not yet recruited.

Tier-2 coordinators roles and responsibilities
==============================================

The PMB have agreed the list proposed by the team. On the issue of whether Tier-2 coordinators should be responsible for any specific sites, the answer was no. For the time being, if a coordinator wants specific site responsibilities they must make sure that there is suitably skilled cover in the event that they themselves are unavailable.

Action: JC to talk offline with Fraser about position in Glasgow.

Current work in progress
========================

SRM
---

There was a storage management meeting today so JC asked for an update on Imperial's status. OM said that no progress had been made on dCache but that things would be happening after 2.4.0 is installed. He asked what pressure there was to upgrade storage elements and what is to happen to classic SEs. There is an ongoing discussion (but little action) within EGEE on the issue of SE migration. Clearly the owners of files should migrate their files off the current SEs. JC suggested VOs could move data to RAL, which now has a stable SRM.

Action: Tier-2 coordinators to review amounts of VO data held at their sites.

Action: JC to use Tier-2 information to propose a strategy for contacting all VOs that would be affected. Find out if there will be a deadline for shutting off classic SEs. Without a strategy, sites will end up needing to run two storage elements, which is not desirable.

Priorities
----------

ScotGrid – Edinburgh now working on dCache.
SouthGrid – Babar work at Bristol and Birmingham.
London – SGE (David McBride – WLM support post). 2.4.0 upgrade. dCache and Service Challenge.
NorthGrid – Expected = dCache deployment and SC3 preparation.

Meetings
========

Storage management workshop - ST, SB, AF probably attending. Graeme Stewart & Steve Thorn also expected to attend.
EGEE-3 - all DTEAM.
HEPIX – has specific topics on batch systems and grid systems. Not clear who will be going.
HEPSYSMAN – 27th & 28th April (proposed talk on security). Should HEPSYSMAN and TB-SUPPORT lists be combined? PG working on agenda.
LCG workshop - more information required.

Service Challenge update
========================

ST reported that RAL were added yesterday. An aggregate transfer rate of 500 MB/s has been seen but needs to be sustained for 2 weeks.

AOB
===

SB asked about the pre-production service. JC said that RAL do not intend to be part of this at the moment; ST added that it was too similar to the JRA1 testbed at RAL. OM mentioned that Imperial have someone installing a gLite testbed. It was thought this testbed should be a candidate pre-production site, not part of JRA1 testing.

JC asked if there were any objections to DTEAM meetings on Tuesdays at 15:00. SB preferred 15:30 due to the JRA1 meeting at 14:00. Agreed to move to 15:30-16:30. UK meetings to be held on Wednesday mornings at 11:00. JC will now set up meetings in the agenda system for the next 12 months - avoiding clashes as far as possible.

Actions
=======

050323-1: PK to send JC a report/request on the problems with port 7777 and reasons it should be changed. JC will then ask for a response from the EGEE security group led by Ian Neilson.
050323-2: JC to follow up on 1st April release by requesting PMB to state GridPP position clearly - we have sites scheduling manpower for this release and if it is missed we face hostility!
050323-3: JC to review VOMS work with Andrew McNab and Alessandra Forti.
050323-4: JC to find out how to gain access to 2.4.0 beta release.
050323-5: JC to seek clarification with DPK and Romain Wartel at a meeting tomorrow where the security area of GridPP (roles and responsibilities) will be discussed. Ideally the security officer would take control of new concerns but this person is not yet recruited.
050323-6: JC to talk offline with Fraser about position in Glasgow.
050323-7: Tier-2 coordinators to review amounts of VO data held at their sites.
050323-8: JC to use Tier-2 information to propose a strategy for contacting all VOs that would be affected. Find out if there will be a deadline for shutting off classic SEs.