ScotGrid Technical Meeting

Europe/London
Other Institutes

Gareth Roy (University of Glasgow)
ScotGrid Technical Minutes 9 April 2014
Present:
Andy Washbrook
David Crooks
Ewan Steele
Gang Qin
Gareth Roy (chair)
Oliver Smith
Sam Skipsey (minutes)
Tickets

- Durham
Ewan:
EMI2-3 upgrade plodding along.
Almost have working ARC-CE.
 
Virtualisation has caused problems this week. Yesterday, a power
migration caused problems, as the redundant power supply turned out to
be less redundant than expected. A network issue caused the ESX
hypervisor to lose contact with the network storage hosting the VM
volumes. (This caused horrible problems for all of the services.)
At midnight last night, the same thing happened again.
 
Lesson learned: sharing your VM hosting network with the cluster network is bad.
 
Almost at the point of launching new services.
 
-
ECDF
-
Andy: mostly working on EMI2-3 upgrade.
This morning we had to back out of a change. It seems that if you
create a new pool account home directory from scratch, the CREAM CE
tries to create new files in it...
Once we've fixed this, we can bring up the new EMI3 CE.
Targeting end of next week for completion.
 
Had problems (last week) with APEL server. This was a VMware issue -
resolved by rebooting the server until it decided to work.
 
OpenSSL issue.
[] Renewal is sufficient.
 
 
-
Glasgow
-
Dave:
MaxCPUTime discussion with LHCb (now really a GLUE2 standards question).
 
EMI2-3 (ongoing). Going well at the moment - now have updated another
batch of worker nodes (rate limiting step is drain time).
Once complete, the next step is the CEs - we believe that, having
tested on Svr014, the update itself works fine.
Final step would be APEL - we have a new puppetised APEL box in
development at present. Currently talking to Stuart Pillinger about
how to migrate most effectively (esp. as we'd rather not migrate our
huge existing database).
[Can also test beforehand with our test CE which is already EMI3]
 
Planning to move the UI from EMI2 to EMI3, but this possibly involves
deeper structural changes, so it will not be rapid.
 
BNL Transfer ticket: updated, waiting for reply from the submitter.
 
Gang is going to look at a test of ARC on our testbed as part of our
future look.
 
-
Individual updates:
-
 
Gang: have built new SL6 test machines. Puppet module set up to
install nordugrid packages.
 
Sam: CEPH / RHEL7 on disk034 (test cluster disk node). Need RHEL7 beta
for the newer kernel.
 
Gareth: looking at setting up a jabber control room.
 
Dave: building a new web server and, as part of that, taking the
chance to modernise our approach. One of these modernisations is our
new ScotGrid website (currently on a test server), at the moment in
draft form (link given in chat log). Generally the old content with
new presentation - comments welcome (the form is nonfunctional, for
display purposes only). Site built with Bootstrap.
 
The other part is the documentation. Current plan is to move from a
wiki-based system (given that we are not a classic wiki use case) to a
Sphinx-driven system with a static site builder, managed via a git repo.
 
Andy asked if the monitoring will be incorporated into the site.
(This is not decided yet - but the same templates are certainly going
to be used for new monitoring stuff.) Re: "Scotgrid view" of
monitoring.
 
Oliver: spent the week "breaking" ARC - it now works with ARGUS.
(Gareth noted that it would be useful for Oliver and Gang to talk).
 
-
AOB
-
 
Quarterly report. Mailed around end of week.
 
 
Pheno user has used up space on WMS svr022. They need to clean up
their space (by retrieving their job output).
 
Ewan asked what the plan for decommissioning WMSen is. (Pheno want to
move to DIRAC?)
The plan was to set up the Imperial/London DIRAC instance for people
to use to migrate (before we can possibly think about turning off
WMS).
Ewan should talk to Janusz Martyniak to use the GridPP DIRAC, or try
setting one up locally for Phenogrid.
 
The other potential migration approach might be to directly submit to
ARC at Durham and Glasgow (as the principal sites supporting pheno
jobs).
 
Dave noted that, in terms of actual decommissioning, it's only CERN
currently talking about turning off WMS services.
 
From a Glasgow perspective, if Pheno moved to DIRAC, it would
significantly reduce the number of users for our WMS services.
 
Ewan asked if Glasgow would be prepared to host a DIRAC instance.
The problem is that we have no experience in this. Ewan offered to
develop puppetising tools to help with this (Glasgow is happy to host
a DIRAC with hardware).
 
 
-
Chat Log
-
Andrew John Washbrook: (09/04/2014 11:01)
hello
Ewan: (11:01 AM)
hi andy
Andrew John Washbrook: (11:03 AM)
here: cloudy with a bit of rain. Top temperature 8C
Gang Qin: (11:04 AM)
The macbook's camera always jumped out when entering a vidyo room,
anyone knows how to turn it off?
David Crooks: (11:04 AM)
It depends on the meeting
(Or should do - there's a meeting option)
Ewan: (11:05 AM)
ganglia reliably informs me that its 12.67C on the roof here
David Crooks: (11:05 AM)
Sam's here!
Andrew John Washbrook: (11:05 AM)
only found yesterday how to turn off the annoying tone when people
enter and leave
Gareth Roy: (11:06 AM)
Ugh
From Advisory
Sites need to patch vulnerable systems, with priority given to servers
exposing SSL services, not forgetting to restart the services
afterwards.
 
Sites will then need new certificates for the previously vulnerable
hosts.
 
The vulnerability also affects client software that uses OpenSSL,
which means that clients that connect to a malicious server could
suffer from information leak.
Samuel Cadellin Skipsey: (11:17 AM)
To siummarise: renewal is a rekeying process, not a resigning.
Ewan: (11:17 AM)
yeah were in the same boar
boat*
Samuel Cadellin Skipsey: (11:17 AM)
(So renewal is sufficient.)
Ewan: (11:25 AM)
2 min audio issues
Andrew John Washbrook: (11:26 AM)
what kernel in RHEL7?
ah
David Crooks: (11:28 AM)
Andrew John Washbrook: (11:28 AM)
nice!
you are missing the under construction gif
Gareth Roy: (11:29 AM)
Marquee!!!!
Ewan: (11:29 AM)
is this a bootstrap site per-chance
so we have a confluence site that isnt yet availiable outside durham
if you want to link to it
can anyone say iframe
welcome to grid middleware
Andrew John Washbrook: (11:33 AM)
"no common sense since 2003!"
Ewan: (11:34 AM)
yeah thats fine we can show a nice bump later
with arc we will publish new figures later anyway
    • 11:00 11:15
      Tickets 15m
      • Durham 5m
        https://ggus.eu/?mode=ticket_info&ticket_id=102199
      • Edinburgh 5m
        https://ggus.eu/?mode=ticket_info&ticket_id=102201 https://ggus.eu/?mode=ticket_info&ticket_id=95303
      • Glasgow 5m
        https://ggus.eu/?mode=ticket_info&ticket_id=102914 https://ggus.eu/?mode=ticket_info&ticket_id=102202 https://ggus.eu/?mode=ticket_info&ticket_id=101565
    • 11:15 11:25
      Research Updates 10m
      • Codebase - CloudSoft 5m
    • 11:25 11:35
      AOB 10m
      • Quarterly Report 5m
      • Pheno & WMS 5m