AFS operation meeting 2014-10-13
present: Massimo, Dan, Jan
Issues:
- BOINC work volume - busy, eventually locked up server (deadlock, need kill -9??). Moved to separate server (afs289), still throttling and one more lockup.
- in touch with users (EricM, Nils, ...), will meet Tue (shifted to Wed) to understand workflow and potential changes. Note: LSF people reported fairly significant numbers of LSF jobs (30k; unclear whether running or queued, or whether these touch AFS directly)
- lockup needs looking at (check whether fixed in 1.6.10; have manual coredump from fileserver/volserver)
- puppetification: afsmisc: AIM will move to VMs, coordinating with Paolo
- monitor/LEMON - will adjust the thresholds for /vice partitions (per volsets) to be compatible with the new LEMON disk full alarms. Discussion - one go or slowly (apparently the scripts might take long to identify to-be-moved volumes if the difference is too big all of a sudden? should still only move 250 vols in one go at max)
- SLS - added "push" cronjob (on afsmisc :-( ) to comply with request from monitoring team.
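
Sketch for the monitor/LEMON item (the "one go or slowly" discussion): a minimal Python illustration of lowering a partition threshold in small steps while capping each move run at 250 volumes. Only the 250-volume cap comes from the discussion; the function names, the step size and the percentages are hypothetical and not taken from the actual scripts.

```python
from typing import Iterator, List, Sequence

# Cap from the discussion: move at most 250 volumes in one go.
MAX_VOLS_PER_RUN = 250


def threshold_steps(current: float, target: float, step: float) -> List[float]:
    """Return intermediate thresholds from current towards target.

    The last element is always the target itself; an empty list means
    nothing needs to change. All names and values here are hypothetical.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    direction = -1 if target < current else 1
    steps: List[float] = []
    value = current
    # Walk towards the target in fixed increments.
    while abs(target - value) > step:
        value += direction * step
        steps.append(value)
    # Finish exactly on the target.
    if value != target:
        steps.append(target)
    return steps


def batches(volumes: Sequence[str], limit: int = MAX_VOLS_PER_RUN) -> Iterator[Sequence[str]]:
    """Split the to-be-moved volumes into runs of at most `limit` volumes."""
    for start in range(0, len(volumes), limit):
        yield volumes[start:start + limit]
```

With small threshold steps, each pass should only have a modest set of volumes to identify, and the batching keeps any single run within the agreed cap.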