Date: 2008-08-04
Attending: Alessandra, Elena, Rob, John, Matt, Owen, Peter, Stuart.

Agenda:
-------

Regular topics:
===============

- Site Updates
  * Lancaster: Disk servers went down and SAM tests failed over the weekend; no data loss. Electrical tests, 100TB and 126 job slots. Atlas (Peter) moved the data from dcache to DPM, but there wasn't much to salvage.
  * Liverpool: Moving to DPM. Had problems with SAM testing when firewall ports were not open. They tried to remove the service from GOCDB, but SAM caches the entries for 3 days. The way to remove a service 'correctly' is to put it in downtime and then remove it. This way SAM doesn't test it, first because of the downtime and afterwards because it has been removed from the cache. There was a ticket open for failing FTS transfers, but it was solved this morning.
  * Manchester: Also moving to DPM, which is considered much easier to install. Setting up simple space tokens, with directories with the correct permissions, took only 3 hours; nothing comparable to the pain of dcache. At the moment it is only an unpublished testbed, tested without BDII. Also resolved the problems Manchester had with Atlas production within the Panda framework. Manchester clusters are now represented in Panda as different sites; this has allowed the SL4 cluster to regain 95% efficiency and the SL3 cluster not to be used for production.
  * Sheffield: Had problems with the University DNS and with the disk on the Mon box.

- Atlas Updates:
  * The main thing to be aware of is the newly requested space tokens. Brian has sent out a ticket.
  * Production will start again next week.
  * Castor has had problems with the backplane; the solution was to throttle the FTS channels.

- Lhcb Updates: we don't have a representative for Lhcb. Rob will ask (David?) if he can join.

- Dteam Updates
  * DNS/Bind security bug (cache poisoning). Has everybody received the email from Mingchao and replied as requested?
    Lancaster, Liverpool: yes.
    Manchester: no; they have given a second email address to Mingchao (to avoid changing the address in GOCDB, because of spam received from security mailing lists).
    Sheffield: don't know, need to check.
  * The old CA certificate has expired and been removed from the VOMS servers. Users might receive an email from CERN or other VOMS locations like DESY saying they no longer belong to the VOs. This concerns only their old certificate, not the new one. If they have already registered the new one in the VOMS server, they don't have to do anything else. If they haven't, they will have to register the new certificate again. Hopefully the CA will send an email out.
  * GridPP is looking at sharing information about the storage: http://www.gridpp.ac.uk/wiki/Guidance_and_recent_purchases
  * Lancaster, Manchester, Sheffield should comment on: http://www.gridpp.ac.uk/wiki/SAM_availability:_October_2007_-_May_2008
  * Nagios: setting up of regional instances. We already discussed this last time, but it periodically comes up at the dteam meetings.

Month topics:
=============

Agenda: Proposal to have regular topics as listed in these minutes and then add a topic of interest each month. Topics of interest to be suggested by everyone.

AOB
===

Apologies for the audio; it was terrible and made the start very difficult.

Next Meeting:
=============

01/09/2008

Actions:
========

080804-1, Alessandra: send the gridpp links for the storage and for the availability. In the minutes.
080804-2, Alessandra: repeat by email what she said users should do about the certificates. In the minutes.
080804-3, Rob: Ask if anyone in Lhcb is willing to give us operational updates.