Operations team & Sites
EVO - GridPP Operations team meeting
Operations Minutes 27/7/2015
============================
Present:
Andrew Washbrook
Andrew Lahiff
Brian Lahiff
Catalin Condurache
Chris Brew
Dan Traynor
Terry QMUL
Ewan Mac Mahon
Federico Melaccio
Gang Qin
Gareth Roy
Gordon Stewart
Govind Songara
Ian Loader
Jeremy Coles
John Hill
Liam Skinner
Matt Doidge
Oliver Smith
Winnie Lacesso
Peter Gronbech
Raja Nandakumar
Robert Fay
Sam Skipsey
Tom Whyntie
1) Experiments
==============
LHCb - Running mostly fine on the Grid in the UK.
- Only one problem at present: LHCb pilots don't appear to be running at Durham.
- Bristol issue from last week resolved.
ATLAS - Brian notes that spacetokens at Durham, Brunel, Sussex and Imperial will be retired
- Removing Datadisk, expanding Scratchdisk to 4T, and placing the rest in Proddisk
CMS - No report
Other VOs
=========
DiRAC - Brian notes DiRAC are still trying to work on long-term proxies
- Jeremy notes we've seen this problem before and asks why it is different this time.
- Brian states documentation doesn't appear to be present for the problem being tackled.
- A limit of 2000 files per FTS transfer ID has also been discovered.
LIGO - Catalin notes:
- Paul Hopkins has added the needed components to CVMFS; there was a small hitch, but the latest release of their software repo has been published.
- Unfortunately Paul hasn't had any time to carry out any job submissions.
- Sam has been moving Paul from the old DIRAC instance to the new DIRAC instance created by Daniela.
- Sam raised the issue of small-VO reps not being on the new gridpp-support mailing list.
- Jeremy agrees that all reps should be on that list; Ewan raises concerns about the security/privacy settings; Jeremy will look into it.
LOFAR - Meeting this afternoon, report next week
LSST - No report
LZ - No report
UKQCD - No contact made in the last few weeks. Email sent from Jeremy
UCLAN - Currently running on Northgrid, job submission via Ganga.
PRaVDA - No report
GHOST - GridPP VO; doesn't have a CVMFS area. Tom suggests they set up their own repo "ghost.egi.eu".
- Discussion about difference between *.egi.eu domains and *.gridpp.ac.uk domains in relation to uCERNVM
- In response to a question from Jeremy, Catalin comments that a DN is required to maintain a CVMFS repository, but it can be configured to use a Role.
DIRAC (WMS) - Problems with Queues at Manchester.
Meetings and Updates
====================
With reference to: http://www.gridpp.ac.uk/wiki/Operations_Bulletin_Latest
* Various requests (e.g. via EGI) to express interest in "Compute Resources for PanCancer Analysis of Whole Genomes" call.
- Discussion about Fed Cloud resources, Catalin mentions that resources at RAL will be part of the Fed Cloud.
- Jeremy shares an update from Andrew that Containers are being included in VAC which may also allow resources to be used.
* MM raised a question about remote loading of Grid files for SNO+ (email 20th July).
- No response, Brian had begun looking into it at RAL
* From Winnie: yaim on CREAM-CE does not update the /var/lib/bdii/gip/ldif/ files. Should it?
- Winnie was able to fix the issue by altering site-info.def and re-running.
* ATLAS ADC meeting containing discussion of the future of cloud support is at 4.30 on the 21st of July, and sites are welcome to attend.
- Ewan comments that people are happy with how cloud support functions in the UK and states emphatically "we don't want changes".
WLCG Ops Co-Ordination
=======================
- Sam explains the proposals for access monitoring as part of the HTTP task force.
Tier-1
======
- Outage scheduled for the 4th August for router investigations.
Storage & Data Management
=========================
- No Report
Accounting
==========
- Oxford was publishing 0 cores on one of its CREAM-CEs; the node has been de-registered and removed from the GOCDB.
Documentation
=============
- No Report
Interoperation
==============
- No Report
Monitoring
==========
- No Report
On-Duty
=======
- No Report
Rollout
=======
- No Report
Security
========
- Security Team meeting last week.
Services
========
- GridPP35 will focus on networking and IPv6; please update the IPv6 table.
- The RAL outage on the 4th will mean the BDII & FTS are unavailable.
Tickets
=======
Brunel
------
- 115113 - Solved
- 114006 - Problems with accounting
Lancaster
---------
- 114845 - Solved
QMUL
----
- 114573 - IPv6 Dual stack issues with failing LHCb jobs, waiting for reply.
Sheffield
---------
- 114649 - Still waiting for a reply from SNO+
Liverpool
---------
- 114248 - Still waiting for a reply from SNO+
Durham
------
- 114381 - Publishing oddities, likely due to SLURM
Tier-1
------
- 113836 - Mismatches in GLUE 1 and GLUE 2 (no operational issues caused)
UCL
---
- 114746 - DPM problems, likely due to SELinux blocking ports.
ECDF
----
- 115003 - Issues with test SEs; all the problems seem to be associated with router issues at ECDF.
Tools
=====
- No Report
Actions
=======
1) Test Castor with WN-tarball gfal2 tools
- Brian suggested the tests may not have been using the same version as found in the tarball, ongoing.
2) LSST Talks
- No update
3) LIGO strategy, tests, timeline.
- Catalin will look at the wiki page that has been created.
4) AUP Documents
- Ewan suggests this shouldn't be in the AUP but should rather be a more formal (or informal) requirement for a new VO.
Chat Logs
=========
Matt Doidge: (21/07/2015 11:21)
*Curses Vidyo*
VAC at Lancaster should be working now, although only with 2 boxes atm
Ewan Mac Mahon: (11:26 AM)
They probably don't, but they should.
But do the cernvm images have gridpp.ac.uk ?
Catalin Condurache - RAL: (11:26 AM)
ewan@ no
Paige Winslowe Lacesso: (11:34 AM)
Jeremy, it's been sorted - posted to tb-support
Thanks anyway for paying attention!!
Ewan Mac Mahon: (11:36 AM)
If anyone's still using the cern cvmfs puppet module to configure their cvmfs they might like to consider using Oxford's cvmfs-simple module instead.
It does things in the new style manner.
Matt Doidge: (11:38 AM)
From an email on this "Indeed if all cloud support teams were like the UK we would not be having this discussion…"
Ewan Mac Mahon: (11:40 AM)
Indeed; the risk though it that the system moves to a more centralised approach without country cloud specific teams.
That would support the countries who have rubbish/non-existent local teams, but really suck for us.
So everyone that cares about running ATLAS work should probably go along and make that point.
I think that's 3:30 UK time this afternoon (?)
Matt Doidge: (11:42 AM)
http://indico.cern.ch/event/433297/
The meeting starts at 15.40 Cern time, the cloud discussion is scheduled for an hour later.
Ewan Mac Mahon: (11:44 AM)
Yup, so tune in about 3:30 UK time and you might catch the end of the main meeting, and be ready for the cloud support discussion.
Federico Melaccio: (11:54 AM)
I can open the page in editing mode
Jeremy Coles: (11:54 AM)
https://gridpp.ac.uk/wiki/IPv6_site_status
Dan and Terry: (11:54 AM)
i can open the editing window but could not save
Federico Melaccio: (11:55 AM)
I see, same here
Ewan Mac Mahon: (11:55 AM)
Seems to work for me.
There should be a test message visible at the top of the page now?
Federico Melaccio: (11:56 AM)
Yes I can see your test message
Ewan Mac Mahon: (11:56 AM)
Try taking it back out?
Probably not everyone at once though.....
I used Chrome, FWIW.
Dan and Terry: (11:57 AM)
ill try a different browser
Federico Melaccio: (11:57 AM)
I used Firefox and it did not work either
Jeremy Coles: (11:58 AM)
http://www.gridpp.ac.uk/gridpp35/registration.html
Ewan Mac Mahon: (12:00 PM)
You're really quiet, Matt.
But as far as I could tell, accurate.
Dan and Terry: (12:01 PM)
very very quiet matt
Federico Melaccio: (12:01 PM)
and a lot of background noise
Ewan Mac Mahon: (12:01 PM)
It sounds like he's wandered off and left his mic behind.
It's not as good as it could be, but it's definitely better.,
Might be worth a bit of out-of-meeting tinkering with it at some point/get a new headset.
Dan and Terry: (12:05 PM)
we were trafsering localgroupdisk to rhul
Chris Brew: (12:07 PM)
we're chugging along quietly on it
Federico Melaccio: (12:11 PM)
it would be helpful to have a draft agenda indeed
Jeremy Coles: (12:13 PM)
https://www.gridpp.ac.uk/wiki/GridPP_VO_Incubator
Federico Melaccio: (12:16 PM)
thank you
bye