ATLAS report for the week 4-10 May 2011
=======================================

RAL:
----
Disk server gdss206, part of the ATLASDATADISK space token: two faulty hard drives on the machine were replaced. The machine was put back in production on Saturday (7th May) afternoon.

UKI-LT2-QMUL:
-------------
In DT (power supply problem) from Friday to Saturday morning, then back in production.
SRM errors on Monday (https://gus.fzk.de/ws/ticket_info.php?ticket=70384). The site was blacklisted and set offline. The problem was solved and QM was back in all ATLAS activities last night.
The site was set to brokeroff in analysis by mistake. I set it back online. The HC team has been asked to investigate the problem.

UKI-LT2-RHUL:
-------------
In DT from Friday to Monday morning (network problem). Back in all ATLAS activities.
A new queue for the new WNs has been set up and is under test.

UKI-LT2-UCL:
------------
Was in DT. Still offline in production and analysis. Test jobs run for more than 13 h (instead of 30 min) and fail.

UKI-SCOTGRID-GLASGOW:
---------------------
Lost files after a power cut: https://gus.fzk.de/ws/ticket_info.php?ticket=70363. The files have been declared lost.
Checksum error: https://gus.fzk.de/ws/ticket_info.php?ticket=70358. Sam thinks it is a network problem.

UKI-SOUTHGRID-BHAM-HEP:
-----------------------
Problem with the SQL server: https://gus.fzk.de/ws/ticket_info.php?ticket=70359. The site was set offline and blacklisted.
Was in DT. Back from DT last night. Transfers are still failing. UKI-SOUTHGRID-BHAM-HEP was excluded from DDM again this morning.
------------------------------------------------------

From: atlas-ssb-notifications-noreply@cern.ch
Subject: [ATLAS SSB Notification] Cloud UK: Daily Résumé (Tue May 10, 2011)
Date: 10 May 2011 08:30:34 GMT+01:00
To: ATLAS UK Cloud Support
Cc: atlas-adc-ssb-devs@listbox.cern.ch

Cloud UK info:

UKI-LT2-UCL-HEP
    brokeroff in analysis activity since Apr 26 09:30
    Space token UKI-LT2-UCL-HEP_HOTDISK under 20% (Free:0.196 Total:1.074) since May 4 09:30

UKI-NORTHGRID-MAN-HEP
    ggus 69336 State:on hold Date:2011-04-04 Info:rfcp failures at MAN-HEP

UKI-SCOTGRID-GLASGOW
    ggus 70358 State:in progress Date:2011-05-07 Info:checksum error in RAL-LCG2_DATADISK

UKI-SOUTHGRID-BHAM-HEP
    ggus 70359 State:in progress Date:2011-05-08 Info:repeated transfer failure from UKI-SOUTHGRID-BHAM-HEP_PRODDISK due to SRM error
    offline in analysis activity since May 8 09:30

UKI-SOUTHGRID-OX-HEP
    Space token UKI-SOUTHGRID-OX-HEP_DATADISK under 20% (Free:8.676 Total:58.0) since Apr 29 18:24

UKI-SOUTHGRID-RALPP
    Space token UKI-SOUTHGRID-RALPP_HOTDISK under 20% (Free:0.15 Total:0.999) since Apr 29 18:24

UK cloud
    savannah 120817 Date:2011-05-08 11:55 Info:"UKI-SOUTHGRID-BHAM-HEP set offline"
    savannah 120564 Date:2011-04-26 00:45 Info:"UKI-LT2-UCL-HEP is set offline. The site in downtime"