● EOS production instances (LHC, PUBLIC, USER)
EOSPUBLIC
- Crashed this morning due to a deadlock in FuseServer::Client::BroadcastConfig (EOS-2954)
- Failed over to the slave that was luckily in sync
- Almost all FSTs crashed at the same time (possibly EOS-1934)
- The crash affects 4.4.0..4.4.3 and is fixed in 4.4.4. Herve will update the FSTs.
● EOS clients, FUSE(X)
(jan):
- eos-fuse(x)-4.4.0 in "production" since today. Nothing (newer) in "qa"
- Notable bugs:
- EOS-2894: "ssh -X" (actually "xauth") hangs FUSEX. Required for the "$HOME" use case. Have a test.
- AMS is waiting for the green light to repeat their MGM-crashing test (it was on hold until EOSPUBLIC was on 4.4.X; unfortunately EOSPUBLIC crashed this morning, unrelated?). OK to go ahead?
- Luca: please test against EOSPPS, if possible for AMS
- Next instance(s) to go FUSEX, at least in "qa"?
- EOSPROJECT status?
- Have HW but no time. Jan to set up, Luca to provide name + hostgroup.
● Development issues
(Georgios)
- Just a few hours ago: eoshome-i03 was DoS'ed by a user running "recycle ls". It looks like the command took too long to complete, timed out, and was automatically retried, again and again. EOS-2955
- This is due to the core xrootd-client timeout + command resend (same as for dumpmd, find). Would need to be rewritten "using new mechanism".
- "eos console" could turn off retries, but that would also turn off read recovery, for all commands.
- Hugo will set env variable for web interface to turn off retries (to be provided by Elvin)
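The timeout + resend effect described for EOS-2955 can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the xrootd-client logic: it assumes the command always takes the same time on the server, that the client resends exactly at each timeout, and that the server keeps working on the earlier, abandoned requests. The function name and parameters are hypothetical.

```python
def peak_concurrent_jobs(duration_s, timeout_s, attempts):
    """Peak number of overlapping server-side executions of one command.

    Assumptions (illustrative only): the command takes duration_s on the
    server, the client resends it every timeout_s seconds up to `attempts`
    times in total, and the server never cancels an abandoned request.
    """
    if duration_s <= timeout_s:
        # The first attempt finishes before the timeout: no resend happens.
        return 1
    # Start time of every attempt the client sends.
    starts = [i * timeout_s for i in range(attempts)]
    peak = 0
    for t in starts:
        # Count attempts that have started and are still running at time t.
        running = sum(1 for s in starts if s <= t < s + duration_s)
        peak = max(peak, running)
    return peak


if __name__ == "__main__":
    # A 300 s command with a 60 s client timeout and 10 attempts in total.
    print(peak_concurrent_jobs(300, 60, 10))
```

In this model a single slow command multiplies into several concurrent executions; in practice the extra load would likely slow each execution further, which the sketch does not capture.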
- Fix for a namespace bug where a mkdir was able to "shadow overwrite" a broken symlink (this left both a file and a directory entry with the same name in the namespace, which caused "find" to crash the MGM).
- In case of a clash, the directory would need to be renamed (which makes the file/link entry visible again). Georgios will write a tool to check existing namespace(s).
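The kind of check such a tool would perform can be sketched as follows. This is a hypothetical illustration, not the planned tool: it assumes the namespace has already been dumped into (parent path, name, kind) tuples, whereas a real checker would read the EOS namespace directly. The function name, tuple layout, and kind labels are assumptions.

```python
from collections import defaultdict


def find_shadowed_names(entries):
    """Return (parent, name) pairs that exist both as a directory and as a file/link.

    `entries` is an iterable of (parent_path, name, kind) tuples, where kind
    is one of "file", "link" or "dir" (a hypothetical dump format).
    """
    kinds = defaultdict(set)
    for parent, name, kind in entries:
        kinds[(parent, name)].add(kind)
    # A clash is a name that has a directory entry and a file or link entry
    # inside the same parent container.
    return sorted(
        key for key, seen in kinds.items()
        if "dir" in seen and seen & {"file", "link"}
    )


if __name__ == "__main__":
    sample = [
        ("/eos/a", "x", "link"),  # broken symlink
        ("/eos/a", "x", "dir"),   # directory created on top of it
        ("/eos/a", "y", "file"),
        ("/eos/b", "x", "dir"),   # same name, different parent: no clash
    ]
    for parent, name in find_shadowed_names(sample):
        print(f"clash: {parent}/{name}")
```

Each reported pair would then be a candidate for the directory rename described above.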
- QuarkDB 0.3.4 has been released:
- Fix for a bug which does not affect EOS.
- Updated rocksdb to latest release.
- (is linked statically, no new RPMs)
- Full release notes here
● AOB
Reminder:
- Andy Hanushevsky is at CERN this week: discuss high-priority XRootD bugs/features with him.