28-R-15 (CERN conferencing service (joining details below))
firstname.lastname@example.org
Weekly OSG, EGEE, WLCG infrastructure coordination meeting.
We discuss the weekly running of the production grid infrastructure based on weekly reports from the attendees. The reported issues are discussed, assigned to the relevant teams, followed up and escalated when needed. The meeting is also the forum where sites get a summary of the weekly WLCG activities and plans.
OSG operations team
EGEE operations team
EGEE ROC managers
WLCG coordination representatives
WLCG Tier-1 representatives
other site representatives (optional)
To dial in to the conference:
a. Dial +41227676000
b. Enter access code 0148141
OR click HERE (Please specify your name & affiliation in the web-interface)
France ROC: IN2P3-CC has been down since Sunday 3rd May 19:00 due to an air cooling failure. Most of the grid services were restarted this morning (May 4th).
An unscheduled downtime remains active until tomorrow afternoon for CEs and SEs.
SEE ROC: Are there any developments, plans or ideas from the development team towards a high-availability mechanism for the LFC service?
From the developers: LFC can be deployed in an HA setup, as it does not hold internal state apart from the database. One can deploy multiple front-ends pointing to the same database back-end, in a load-balanced or fail-over configuration.
Of course in this case the database is a single point of failure, which one can mitigate by deploying on Oracle RAC or by having a multi-tier LFC setup. In the latter case there is one master LFC service, which updates read-only replicas via Oracle Streams database-level replication.
In theory one can also think about MySQL-based replication (tested for VOMS, but not for LFC) and also about a multi-master Postgres DB, depending on the actual requirements coming from the sites.
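As an illustration of the fail-over configuration mentioned by the developers, the following is a minimal client-side sketch, not part of LFC itself. It walks an ordered list of front-ends that share one database back-end and returns the first one that accepts a TCP connection. The host names are hypothetical, and port 5010 is assumed to be the LFC daemon port at the site; the `probe` parameter exists only so that the selection logic can be exercised without a network.

```python
import socket
from typing import Callable, Iterable, Optional


def tcp_alive(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_frontend(
    hosts: Iterable[str],
    port: int,
    probe: Callable[[str, int], bool] = tcp_alive,
) -> Optional[str]:
    """Return the first front-end in priority order that answers the probe.

    Returns None when no front-end is reachable.
    """
    for host in hosts:
        if probe(host, port):
            return host
    return None


if __name__ == "__main__":
    # Hypothetical front-ends pointing at the same database back-end.
    frontends = ["lfc1.example.org", "lfc2.example.org"]
    # 5010 is an assumption about the local LFC port; adjust as needed.
    print(pick_frontend(frontends, 5010))
```

A TCP check only shows that a front-end is listening. It says nothing about the shared database, which remains the single point of failure discussed above.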
SEE ROC: What is the current status of the top-BDIIs? Tests we made within the HellasGrid infrastructure showed us that many of the problems in the current version of the top-BDII are solved in the top-BDIIs.
ggus #44104. This ticket is waiting on the OSG GOC to roll out changes to their production BDII so that it publishes entries by OSG resource group, not by OSG resource name. This will remove the issue before it gets to the BDII. The next action deadline in OIM is in Feb 2010. Should we close the ticket as unsolved to free the escalation reports?
ggus #46988. Site concerned is AGLT2. The ticket has been urgent since early March. Nothing has happened since my comment of 2009-03-30.
ggus #47786. Site concerned is Nebraska. Urgent. Submitted 2009-04-08! Some OSG reminders remain unanswered by the site (?)