AFS meeting 2014-12-01
present: Dan, Jan
Issues:
- sysctl settings were broken in Puppet; in particular the "big UDP buffer" tuning got lost. This was most likely the reason for the BOINC-type instability (lost UDP packets can mess up the RX protocol?). Fixed, deployed, restarted all fileservers.
- "arc" command got blocked, probably due to slow response (suspect a client firewall issue as well, timing out). The slow response came from "choose_disk" still trying to contact afs279 (although marked as "- obsolete"); several tickets resulted. Workaround: "X" out the volsets on dead servers.
- AFS operator procedure needs reviewing: the "console" account will disappear. Agreed to tell them to just (soft) reboot in case of failing pings.
- Ben Kaduk reviewed several patches on gerrit; 'not going into 1.8' is the predominant answer (but no reasons given?)
- correct_acls still triggers the OOM killer; trying to debug via multiple graphs, but graphite has some issues.
- "/p/etc/on" seemed to no longer work from cronjobs on afsmisc (but does work for monitoring, from afsadm2). Most affected jobs are probably useless (ABS cleanup needs a fix to run everywhere; sysid backup no longer seems useful).
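
Regarding the lost sysctl tuning: a check along the following lines could flag a fileserver whose UDP buffer limits have silently dropped back to defaults. This is only a sketch. The notes do not name the actual sysctl keys or values, so net.core.rmem_max, net.core.wmem_max and the 16 MiB minimum are assumptions, and the function names are made up for illustration.

```python
from pathlib import Path

# ASSUMPTION: keys and minimum values are placeholders; the meeting notes
# do not say which sysctl settings the "big UDP buffer" tuning covers.
EXPECTED_MIN = {
    "net.core.rmem_max": 16 * 1024 * 1024,
    "net.core.wmem_max": 16 * 1024 * 1024,
}


def sysctl_path(key, root="/proc/sys"):
    """Map a dotted sysctl key to its file under /proc/sys."""
    return Path(root) / key.replace(".", "/")


def read_sysctl(key, root="/proc/sys"):
    """Return the first integer value of a sysctl key, or None if unreadable."""
    try:
        return int(sysctl_path(key, root).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def check_buffers(expected=EXPECTED_MIN, root="/proc/sys"):
    """Return one message per key that is unreadable or below its minimum."""
    problems = []
    for key, minimum in expected.items():
        value = read_sysctl(key, root)
        if value is None:
            problems.append(f"{key}: unreadable")
        elif value < minimum:
            problems.append(f"{key}: {value} < {minimum}")
    return problems


if __name__ == "__main__":
    for line in check_buffers():
        print("WARN", line)
```

Run from monitoring, this would print one WARN line per setting that is missing or too small, and nothing when the tuning is in place.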