ScotGrid Technical Meeting

Europe/London
Other Institutes


Gareth Roy (University of Glasgow)
ScotGrid Technical Meeting 30 April 2014

 

-

 

Andy Washbrook

David Crooks

Ewan Steele

Gang Qin

Gareth Roy (chair)

Oliver Smith

Sam Skipsey (minutes)

 

-

 

- Durham

 

EMI3 ticket.

 

Ewan: just the EMI2-3 upgrade. Discovered good things in Ops - Chris

Brew pointed out that APEL service is not needed if you have an

ARC-CE.

The SE needs a physical upgrade, but not a package-upgrade (just the

EMI-release updated).

CEs “progressing”.

 

Ganglia + Nagios installed, logging stats. Can sort out access when finished.

 

Gareth noted that at one point “unified monitoring” was mentioned.

 

Oliver: not progressed with WMS issue, as it seems to have gone wrong

elsewhere in the ARCs. Getting close to production ready (the GridFTP

seems to be flaky without logging).

Gareth noted that the issue may be the ARGUS-auth bug that caused it

to time out and fail some connections (the workaround was to remove the

caching in ARGUS).

 

- ECDF

 

EMI3 ticket.

 

Andy: Mostly focussing on the EMI ticket at the moment. Deployed new

CE, new ARGUS server, using them as a pivot to upgrade existing CE.

Ready to go live in the near future (pending new hostcert for ARGUS

server).

Anticipate no problems.

Was caught out by the APEL upgrade (the EMI3 release has a completely

different architecture from EMI2).

 

Some issues with migration of pool accounts/integration with ECDF

systems team implementation in NIS.

 

Monitoring is “next step”. Nagios/Ganglia, “advanced monitoring stuff

from Glasgow”.

 

Some PANDA auto exclusions but only blips.

 

- Glasgow

 

Dave C:

 

LHCb publishing ticket is ongoing (pending dependent tickets on how

to sensibly publish values in the Glue2 schema).

EMI3 ticket about to be closed, as last CE was moved to EMI3 as of

last night. Last step is to reintegrate the APEL publishing and turn

it on (tested from CE -> APEL, but not APEL->RAL). Once enabled, just

need to gap publish for April to sort out messy records.

 

Monitoring: have some ideas for integration, possibly via iframes as a

lightweight solution.

Need to remain flexible.

 

 

Gang: Set up an ARC-CE, published in SiteBDII. Connected and tested

with testq v prod PBS (need to test with real grid jobs). Also have a

test Condor pool, which the ARC has been tested against.

 

Sam: (DPM Collaboration/Dev meeting).

 

 

Cloudsoft - is anything happening? (No news from them to anyone.)

 

ScotGrid NeoWebsite.

Dave C has been reworking the ScotGrid website. There is a

beta-release (demoed in the meeting).

Request for comments on content (sites comment on their own

descriptions). Also contribute any local user communities you want

included.

Comments requested in the next few days, ahead of presenting the beta to Dave B.

 

(Next step is the Documentation, which is accessible internally. This

is being generated using Sphinx, managed via Git.

Andy asked about avoiding duplication of effort, as he’s also working

on a similar, ECDF local thing.

Gareth noted that our model was to use Git to manage distributed

effort on the Documentation (rather than a Wiki, which is very heavy

for this kind of thing).

(ECDF+Durham using Confluence presently, so migration tool needed?)

)

 

GridPP-Monitoring blog:

Dave C has started the above blog, as a news/update site about what’s

happening in monitoring. If anyone is interested in contributing…

 

Decommissioning CREAM-CE

Ewan: who do I talk to to get ATLAS to migrate from the old CEs to the new CEs?

Gareth: email ATLAS_UK_CLOUD support (but in general, raising the

question in Ops would also be useful for other sites).

 

 

-

David Crooks: (30/04/2014 11:04)

No, Sam's just getting sorted out

Andrew John Washbrook: (11:06 AM)

opz team rulez

Gareth Roy: (11:06 AM)

eord up

word up

:(

http://argus-authz.github.io/doc/EMI-Argus-SysAdminGuide-1.1.0.pdf

Known issues section

not sure if it's still a problem

Andrew John Washbrook: (11:16 AM)

I'll pass that on to Wahid!

ooh logo

Gareth Roy: (11:20 AM)

Dave did a lot of work on it... cool eh?1

Andrew John Washbrook: (11:20 AM)

indeed

Ewan: (11:22 AM)

is this accessible by us?

yeah we're using confluence and documenting durham there

 

Samuel Cadellin Skipsey: (11:34 AM)

atlas-support-cloud-uk@cern.ch

Probably.

David Crooks: (11:35 AM)

Cloud support: atlas-support-cloud-uk@cern.ch
