UKI Monthly Operations Meeting (TB-SUPPORT)
Europe/London
EVO - GridPP Deployment team meeting
Description
- This is the monthly UKI meeting
- The intention is to run the meeting in EVO: http://evo.caltech.edu/evoGate/. Join the meeting in the GridPP Community area.
- The phone bridge number is +41 22 76 71400. The phone bridge ID is 482161 with code: 4880.
TB-Support 16th October Minutes
Derek Ross
Jeremy Coles
Elena Korolkova
Winnie Lacesso
Mike Kenyon
Rob Harper
Matthew Doidge
Stephen Childs
Brian Davies
Graeme Stewart
Chris Brew
Yves Coppens
Duncan Rand
Gianfranco Sciacca
Peter Love
Stephen Burke
Pete Gronbech
Sam Skipsey
Ewan MacMahon
Stephen Jones
Rob Fay
Simon George
Alessandra Forti
IPPP1 Durham
Stephen Dallison
Santanu Das
Site Availability
=================
Problems at RAL-Tier 1 - Downtime
EFDA-JET - unknown
UK Tests
--------
- Brunel - the file Steve uses for the tests is on a filesystem that is down
- QMUL - same problem
Atlas Tests
-----------
Brunel - as above
Imperial (both) -
RHUL -
BHAM - DNS problems, also problematic switch, GPFS software area on one cluster disappearing, replacing a CE.
EFDA-JET - down at the moment
Oxford - intermittent issues with maui crashing
RAL T1 - Banned normal Atlas users. Steve tried with /atlas/uk/Role=production but had problems with T2 sites, so switched back
Accounting
----------
UCD - should be unregistered
IC-LESC - tomcat failure, being reinstalled
QMUL - Monbox taken offline
WLCG Draft Availability report for September - contact T2 coordinator about discrepancies
Experiment Problems/Issues
==========================
CMS
---
CRAFT, a lot of MC production, activity mainly at T1. No problem maintaining links in UK; low-rate monitoring of links starting. Brunel issue - Atlas space tokens took all space - now fixed
LHCB
----
Gridmap file changes
Atlas
-----
Detector taking cosmic rays, high quality cosmic rays taken till end of month.
RAW sizes are bigger, data being taken 24hrs rather than 14hrs, so data in T0 much higher - buffers stretched. Issues exporting to T1, data sometimes has to be recalled from tape.
MC production bursty - 2-3 days of activity. Problems installing latest Atlas releases at ECDF.
2008 MC AOD being distributed over T2s in next few weeks, need 5TB in MCDISK
Will then have an analysis challenge to get users to use different tools.
Want GLEXEC and SCAS.
Oxford may replicate data, if not centrally distributed.
ROC/WLCG stuff
==============
WLCG Update
-----------
- GDB on T2 issues, experiments (apart from Alice) happy with T2s, some middleware will be brought forward - FTS on SL4, GLEXEC ready but SCAS needs more development. Missing procedures for killing jobs, poor traceability of jobs to catch errors. Expectation for data taking is now April/May
Alice has asked sites to volunteer to deploy CREAM CE, request was made at Weekly Monday Operations meeting
Will Alice accept pool accounts for vobox ssh access?
Gathering Site Information
==========================
Want to gather info about memory limits on batch systems - Atlas are interested in this information, jobs have occasional memory spikes, so need to know batch system policies about killing jobs.
Want to know about problems with site WAN in the last year - capping, outages, anything leading to WAN becoming a bottleneck, plus forward look for 1-2 years, i.e. increasing WAN connectivity.
Middleware questions from EGEE'08
=================================
iii Acceptable versions
Would like checkpointed releases which would solve these problems
AOB
===
Please share experiences of recent purchases on Wiki
HEPIX in Taipei next week
HEPSYSMAN - suggestions if/when/where/topics for next HEPSYSMAN meeting
Consensus that it would be useful to have one
Chat Window
===========
[10:30:02] Winnie Lacesso joined
[10:30:04] Mike Kenyon joined
[10:30:09] Matthew Doidge joined
[10:30:09] Rob Harper joined
[10:30:21] Stephen Childs joined
[10:30:28] Brian Davies joined
[10:31:39] Graeme Stewart joined
[10:31:49] Chris Brew joined
[10:31:50] Yves Coppens joined
[10:33:53] Duncan Rand joined
[10:34:12] Stephen Childs left
[10:34:53] Stephen Childs joined
[10:35:08] Gianfranco Sciacca joined
[10:35:10] Peter Love joined
[10:37:03] Stephen Burke joined
[10:37:32] Pete Gronbech joined
[10:38:56] Sam Skipsey joined
[10:39:45] Ewan Mac Mahon joined
[10:43:52] Winnie Lacesso Same intermittent SAM maradonna errors at LeSC .. probably also still SL3?
[10:46:51] Stephen Jones joined
[10:47:18] Rob Fay joined
[10:48:23] Ewan Mac Mahon For the record the Oxford maui had died again; it's been kicked.
[10:53:48] Simon George joined
[10:56:10] Alessandra Forti joined
[10:58:34] Alessandra Forti I can hear someone talking on top of graeme
[10:58:44] Sam Skipsey We actually think we know why there's an issue at ECDF now.
[10:58:48] Sam Skipsey But it's on going.
[10:58:55] Elena Korolkova the same for me
[10:59:53] Chris Brew left
[11:01:44] IPPP1 Durham joined
[11:02:11] Chris Brew joined
[11:10:48] Stephen Burke I'm not sure if sites can see that page, the access is protected ...
[11:11:04] Alessandra Forti no they can't
[11:11:20] Derek Ross I certainly can't
[11:12:00] Stephen Burke Actually I don't think there's a tier-1 address on it anyway - maybe that should be added ...
[11:17:31] Winnie Lacesso Did the sound shut off? Got not sound.
[11:18:22] Ewan Mac Mahon There's still sound - try turning off with F8 and back on again.
[11:19:04] Winnie Lacesso Brilliant Ewan! Tx.
[11:22:23] Stephen Dallison joined
[11:22:34] Santanu Das joined
[11:23:53] Duncan Rand back in a minute
[11:37:12] Winnie Lacesso I'm listening & Manchester or Durham would be good.
[11:37:26] Rob Harper Still here! I'll come if I can.
[11:37:29] IPPP1 Durham Durham is listening... and would be interested to attend
[11:37:31] Elena Korolkova yes, I'm listening
[11:38:36] Rob Fay Yes
[11:38:38] Stephen Dallison yes to sys man
[11:38:40] IPPP1 Durham left
[11:38:40] Santanu Das Jeremy, will be here for a min?
[11:38:42] Ewan Mac Mahon HepSysMan: yes,
[11:38:43] Alessandra Forti left
[11:38:45] Graeme Stewart left
[11:38:45] Sam Skipsey In principle, yes, in practise, might not make it.
[11:38:48] Yves Coppens yes, could hold it in Bham subject to room availability
[11:38:48] Matthew Doidge yes
[11:38:50] Stephen Jones left
[11:38:56] Derek Ross yes (someone from RAL-LCG2 would be there)
[11:38:56] Stephen Burke left
[11:38:57] Elena Korolkova yes
[11:38:57] Sam Skipsey left
[11:39:01] Duncan Rand left
[11:39:02] Simon George maybe my mic doesn't work ... in principle interetsed in HEPSYSMAN, depends on the date and location as I have a busy schedule
[11:39:08] Yves Coppens left
[11:39:16] Chris Brew yes
[11:39:17] Stephen Dallison left
[11:39:22] Chris Brew left
[11:39:28] Rob Fay left
[11:39:33] Elena Korolkova left
[11:39:38] Mike Kenyon HEPSYSMAN. Attend: yes; hold: probably not
[11:39:38] Winnie Lacesso left
[11:39:38] Rob Harper left
[11:39:52] Peter Love left
[11:40:01] Brian Davies might be able to attend hepsysman
[11:40:06] Simon George left
[11:40:26] Gianfranco Sciacca left
[11:40:55] Mike Kenyon left
[11:41:07] Ewan Mac Mahon left
[11:41:09] Santanu Das hi Pete