UKI Monthly Operations Meeting (TB-SUPPORT)
Thursday 31 July 2008, 10:30
10:30
Site stability
10:30 - 10:45
- Regular look at current monitoring results. This morning's picture:
-- SAM (http://pprc.qmul.ac.uk/~lloyd/gridpp/ukgrid.html)
--- RHUL
--- UCL - known
--- Cambridge
--- RAL-PPD
-- Storage availability
--- http://www.gridpp.ac.uk/wiki/GridPP_storage_availability_monitoring
-- ATLAS tests
--- IC second cluster
--- RHUL
--- UCL
--- Manchester second cluster
--- Cambridge
--- RAL (recent)
-- LHCb tests
--- IC
--- QMUL
--- UCL
--- Durham
--- Glasgow
--- Culham
--- RAL-PPD
-- Transfer tests (status)
--- Current tests are being moved to a new scenario. Present results are therefore less useful: http://pprc.qmul.ac.uk/~lloyd/gridpp/nettest.html
-- UK-wide tests
--- http://pprc.qmul.ac.uk/~lloyd/gridpp/uktest.html
--- Increased ATLAS work has changed the ordering, but the overall pattern is similar.
--- There is a high failure rate at Durham, Cambridge and Brunel.
-- Accounting
--- http://www3.egee.cesga.es/gridsite/accounting/CESGA/egee_view.php
--- Charts were missing from the production portal last week
--- EFDA-JET not publishing since June
--- IC-LeSC several weeks behind
--- Lancaster?
--- Oxford?
-- If time permits, a quick look at the main reasons for availability/reliability problems from: http://www.gridpp.ac.uk/wiki/SAM_availability:_October_2007_-_May_2008.
--- No entries for: UCL-HEP; Lancaster; Manchester; Sheffield; Birmingham; Cambridge; EFDA-JET; RALPP; Tier-1 and csTCDie
-- B Britton noted this morning that the UK picture does not look very healthy in GridMap: http://gridmap.cern.ch/gm/. It looks like many sites are in maintenance. Is there any attempt within Tier-2s to schedule downtimes with a view towards Tier-2 availability?
10:45
Experiment progress and plans
10:45 - 10:55
- Review of what has been happening and what happens next.
- LHCb
-- Currently reviewing software installations across sites
-- SAM tests now on SL pages. Off with move to DIRAC3
- ATLAS
- CMS
- Other VOs
-- superNEMO have started picking up activity across the sites (for some reason not in APEL)
-- CDF wish to be re-enabled at supporting sites
-- UKQCD are working with RHUL (Duncan et al.) in the first instance, as they have jobs with high memory requirements
-- The gridpp VO should be enabled at all sites by now!
10:55
Update on CA matters
10:55 - 11:15
- Additional information surrounding the UK CA certificate changes
- Opportunity to raise any problems or concerns
-- Communications are one area to be looked at
11:15
Hardware purchases & middleware upgrades
11:15 - 11:25
HARDWARE:
- All? sites now have their GridPP hardware grants
- The PMB wants to encourage sharing of procurement information
- The following page has been set up: http://www.gridpp.ac.uk/wiki/Guidance_and_recent_purchases. Please use it!
- One area of concern is the use of a new benchmark
- SPECall_cpp2006 is made up of 3 apps from specint and 4 from specfp (7 apps in total). It can be run in 6h, but there are no published values. The proposal is to use the cpp benchmark; a script will be made available.
- For Pete G's summary see: http://www.gridpp.ac.uk/wiki/GDB-July_2008
- For the GDB talk see: http://tinyurl.com/5z349j
MIDDLEWARE:
- Anything to discuss?
- Release news can be found here: https://twiki.cern.ch/twiki/bin/view/LCG/OpsMeetingGliteReleases
- PPS release information: https://twiki.cern.ch/twiki/bin/view/LCG/OpsMeetingPps (includes CREAM and glexec modules)
11:25
AOB
11:25 - 11:30
- Recent port scans (discussed on storage list). Check your logs.
- Please make progress with Nagios!