Goal: reduce the number of people who have root access on the boxes, and separate roles from eos-admins.
The proposal is to have 3 e-groups:
The idea is to prevent the following spaghetti-like git history, which is very hard to follow, difficult to "merge", error-prone and time-consuming on top of that.
Having a clean history is important in order to understand what happened and what changed in a repo. Fixing merge problems and conflicts is very time-consuming (it took me ~2 hours to fix the first spaghetti plate).
=> Mimicking the AI workflow
Stop pushing to master (except for emergency changes)
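The "no direct pushes to master" rule could be backed by a client-side git pre-push hook. A minimal sketch in Python follows, assuming the standard pre-push stdin format (one "<local ref> <local sha> <remote ref> <remote sha>" line per pushed ref). The function name blocked_refs and the ALLOW_MASTER_PUSH override for emergency changes are hypothetical, not something agreed in the meeting.

```python
import os

PROTECTED_REF = "refs/heads/master"
OVERRIDE_VAR = "ALLOW_MASTER_PUSH"  # hypothetical emergency override


def blocked_refs(lines, env=None):
    """Return the protected remote refs that a push would update.

    `lines` is an iterable of pre-push stdin lines:
    "<local ref> <local sha> <remote ref> <remote sha>".
    """
    env = os.environ if env is None else env
    # Emergency changes: the override variable disables the check.
    if env.get(OVERRIDE_VAR) == "1":
        return []
    blocked = []
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        remote_ref = parts[2]
        if remote_ref == PROTECTED_REF:
            blocked.append(remote_ref)
    return blocked
```

A hook script (.git/hooks/pre-push) would call blocked_refs(sys.stdin) and exit non-zero if the result is non-empty. Such a hook is only a local safety net; server-side branch protection, where available, would be the stronger control.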
Retirement: Several EOSPPS (and some "gateway") machines need to go by mid-November. Need to get some (12?) non-diskserver 10Gb physical machines as gateways (FTP, SMB), with dedicated uplink; under discussion with Bernd.
"Marek" incident: incrementally copied logfile, needs update, compact = schedule in 2 weeks? (good, get several bugfixes).
v4.1.30 is in QA, production on Monday: CRM-2426
If OK, this version should go to desktops (tickets or "magic" KOJI tag?)
Upgraded yesterday to 4.1.30 (almost) in time.
Uncovered some issues, fixed in master:
EOSLHCB is already running 4.1.30, EOSPUBLIC will likely be updated to the next 4.1.31 once released.
Further planning: 4.1.31 is getting serious "testing", timeframe for updating EOSATLAS (affected by imbalance between Meyrin/Wigner) and EOSCMS is January. EOSUSER still unclear, needs more testing (EOSBACKUP will go first, then get the new namespace; EOSUAT being set up to test for EOSUSER).
One "big memory" headnode has gone missing - have 2 (with spinning disk) that can get recycled, but would like bigmem+SSD for EOSALICE.
EOSUSER is still using the per-machine Kerberos principal (until next restart), SWAN dropped them (now back)
- incorporate local cache cleaner
- enable global byte-range locks
- enable global fsync coordination (open delayed until any client has finished its fsync)
- add max file-size support returning EFBIG
- add rudimentary quota honoring on the client side
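For illustration, the max file-size item could behave roughly as below from the caller's point of view. This is a sketch only, not the actual client code: the function check_write and the 1 MiB limit are made up for the example. The only grounded part is that a write growing the file past the limit fails with EFBIG.

```python
import errno
import os

MAX_FILE_SIZE = 1 << 20  # hypothetical limit (1 MiB), for the example only


def check_write(offset, length, max_size=MAX_FILE_SIZE):
    """Return the end offset of the write, or raise OSError(EFBIG).

    Mirrors the POSIX convention: a write that would extend the file
    beyond the maximum allowed size is rejected with EFBIG.
    """
    end = offset + length
    if end > max_size:
        raise OSError(errno.EFBIG, os.strerror(errno.EFBIG))
    return end
```

An application would see this as a normal OSError and can test e.errno == errno.EFBIG to tell it apart from, for example, a quota error (EDQUOT).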
Progressing as planned.
Server-side is assumed to be ready on Monday (Luca would like pre-/post tags ..)
Do we set up some "experimental" area on some instance (on EOSUAT?)
"eosfusebind": need to keep compatibility, even if no longer needed.
Samba update this morning (tons of RPMs from 7.4 distrosync) left it in an inconsistent state, OK after restart (also affected SWAN).
Webcast uses this, and had a "VIP" HR webcast shortly afterwards = ticket. SMB service status needs to be sorted out (Massimo?), also need to add (more) monitoring.