28-R-15 (CERN conferencing service (joining details below))
email@example.com Weekly OSG, EGEE, WLCG infrastructure coordination meeting.
We discuss the weekly running of the production grid infrastructure based on weekly reports from the attendees. The reported issues are discussed, assigned to the relevant teams, followed up and escalated when needed. The meeting is also the forum for the sites to get a summary of the weekly WLCG activities and plans.
OSG operations team
EGEE operations team
EGEE ROC managers
WLCG coordination representatives
WLCG Tier-1 representatives
other site representatives (optional)
To dial in to the conference:
a. Dial +41227676000
b. Enter access code 0140768
France: Just a small comment concerning the CERN VOMS SD (scheduled downtime) on Monday. It might be better to schedule such a downtime on a day other than Monday: anyone who wants a valid VOMS proxy during the SD period has to renew it on Sunday. I heard that some people do not work on Sundays! ;)
UKI (UKI-SOUTHGRID-BRIS-HEP): cerb-mds is in OUTAGE (until August 2008) according to GOC-DB: "Test StoRM server - should not be used for production yet". We don't want SAM tests running on it, but they are. I've emailed firstname.lastname@example.org twice to request "no tests please" or to find out the procedure for excluding a test machine from SAM tests. No answer. Can anyone advise how to contact sam-support and get a response? Thanks.
<big> Short deadline Jobs: status update and batch system configuration </big>
A Short deadline job is:
- A job with a deadline constraint, which provides some guarantees about its behavior, and which cannot proceed through prior explicit reservation. Because such jobs have a short execution time and are unexpected and urgent, they cannot be handled only on a best-effort basis in a full production regime.
- A plain EGEE job in the following sense: it is submitted, scheduled and returned to the user through the standard mechanisms governing the usage of the resources. In particular, it can be inspected with the usual tools (WMS trace) and is fully accounted for.
For preliminary information:
- according to bug #31278, the WMS has been OK since February.
- two sites have SDJ configuration files: LAL (confirmed) and CEA.
(section 5.2, the rest is not relevant)
A full example file will be available shortly (the LAL one, used for more than one year).
<big> WLCG issues coming from ROC reports </big>
Italy: FTS configuration change at INFN-T1:
The transfer agents for the LHC VOs have been reconfigured so that no transfer retries are performed.
<big>WLCG Service Interventions (with dates / times where known) </big>
RAL: we are not able to submit our pilots because our rank expression prevents it. The number of locally waiting jobs from other VOs is high enough to make the RAL CEs extremely unattractive.
We know that once we move to a consistent use of VOView (through the gLite WMS) we will be able to steer our jobs there anyway, because the rank is then computed with VO-specific information. The problem is that the site admins there report (at least on Friday) many free job slots and, paradoxically, an equivalent number of jobs waiting in the local LRMS.
VOMS issue: after the intervention on the LCG production Oracle service we had problems getting VOMS proxies for another 2 hours. The VOMS server did not recover automatically.
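To illustrate the ranking effect described in the RAL report, here is a minimal Python sketch. It assumes a hypothetical rank that is simply the negated number of waiting jobs; the actual rank expression is not given in this report. The names `CEView`, `rank` and `best_ce`, the CE names, and all job counts are made up for illustration. The sketch shows how a CE-wide waiting count can make a site look unattractive while a VO-specific count (as a VOView would publish) does not.

```python
from dataclasses import dataclass


@dataclass
class CEView:
    """Waiting-job count published for a CE, either CE-wide or for one VO."""
    name: str
    waiting_jobs: int


def rank(view: CEView) -> int:
    # Hypothetical rank: fewer waiting jobs gives a higher (less negative) rank.
    return -view.waiting_jobs


def best_ce(views: list[CEView]) -> str:
    # The broker picks the CE with the highest rank.
    return max(views, key=rank).name


# Made-up numbers: CE-wide counts include jobs waiting from other VOs.
global_views = [CEView("ral-ce", 500), CEView("other-ce", 20)]
# Made-up numbers: VO-specific counts only include our own waiting jobs.
vo_views = [CEView("ral-ce", 0), CEView("other-ce", 20)]

print(best_ce(global_views))
print(best_ce(vo_views))
```

With the CE-wide counts the hypothetical broker avoids the RAL CE; with the VO-specific counts it selects it, which is the behaviour we expect once VOView information is used consistently.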