Site has been running well
- Full of ATLAS jobs (MCORE, SCORE, Analy and Opportunistic)
- Good efficiency
Illinois down for preventive maintenance (PM) on the campus cluster
Updated glibc pushed to all nodes
New Disk at UChicago
- Ceph based
- Migrating LOCALGROUPDISK
- Lincoln is using gfal-copy and srm to copy from dCache to Ceph, but it is slow
- So far 193TB of 368TB has been migrated
- Need kernel 4.4 to fix controller problems
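For reference, the LOCALGROUPDISK migration numbers above work out as follows. This is only a quick arithmetic sketch; the function name is made up for illustration and is not part of any migration tooling.

```python
def migration_progress(migrated_tb, total_tb):
    """Return (remaining TB, percent complete) for a migration."""
    remaining = total_tb - migrated_tb
    percent = 100.0 * migrated_tb / total_tb
    return remaining, percent

# Figures from the report: 193TB migrated out of 368TB
remaining_tb, percent_done = migration_progress(193, 368)
print(f"{percent_done:.1f}% migrated, {remaining_tb} TB remaining")
```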
OSG 3.3.9
- All head nodes have been using 3.3.x stack for a long time without problems
- CE (HTCondorCE)
- Squid
- CVMFS servers/clients
- GUMS
- Condor 8.4.3
- Still using 3.2.35 on worker nodes
- Testing new LSM
- Used GFAL2 via xrootd, then srm, then FAX to try to stage in the file
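The fallback order the new LSM tries (xrootd, then srm, then FAX) can be sketched roughly as below. This is a minimal illustration, not the actual LSM code: `stage_in`, `fake_copy`, the `example.invalid` URLs and the destination path are all hypothetical, and the real transfer call (GFAL2) is replaced by a stand-in function so the example runs anywhere.

```python
def stage_in(sources, destination, copy):
    """Try each (protocol, url) pair in order.

    Returns the name of the first protocol whose copy succeeded,
    or None if every attempt failed.
    """
    for protocol, url in sources:
        try:
            copy(url, destination)
            return protocol
        except Exception as exc:
            # Log the failure and fall through to the next protocol
            print(f"{protocol} failed: {exc}")
    return None

def fake_copy(url, destination):
    # Stand-in for a real transfer; pretend only the FAX source is reachable
    if "fax" not in url:
        raise IOError("transfer failed")

# Hypothetical source URLs, listed in the order they should be tried
sources = [
    ("xrootd", "root://local.example.invalid//atlas/file.root"),
    ("srm", "srm://local.example.invalid/atlas/file.root"),
    ("fax", "root://fax.example.invalid//atlas/file.root"),
]
used = stage_in(sources, "/tmp/file.root", fake_copy)
print(used)
```

Passing the copy function in as an argument keeps the ordering logic separate from the transfer tool, so the same loop could wrap whatever client is actually used.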
minRSS and maxRSS now set
- MCORE needs 24GB for reprocessing jobs
- Changed HTCondorCE to request RSS of 24GB (previously 16GB)
- Many nodes have only 2GB/core, so this can leave cores idle for lack of free memory
- Might create MWT2_MCORE_HIMEM to handle jobs needing more than 2GB/core
- Redirect only to nodes with 3GB or more per core
- MWT2 has almost 4000 cores that fit this criterion
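The idle-core effect can be shown with a small calculation. This is a sketch under stated assumptions: an MCORE job is taken to use 8 cores (not stated above, but it makes 24GB equal to 3GB/core), the 24-core node sizes are hypothetical, and only MCORE jobs are considered, so cores that single-core jobs might backfill are ignored.

```python
def idle_cores(node_cores, node_mem_gb, job_cores=8, job_mem_gb=24):
    """Cores left unused when a node is filled with MCORE jobs only.

    job_cores=8 is an assumption; job_mem_gb=24 is the RSS request
    from the report.
    """
    jobs_by_cores = node_cores // job_cores
    jobs_by_mem = node_mem_gb // job_mem_gb
    jobs = min(jobs_by_cores, jobs_by_mem)
    return node_cores - jobs * job_cores

# Hypothetical 24-core nodes with 2GB/core and 3GB/core
low_mem_idle = idle_cores(24, 48)
high_mem_idle = idle_cores(24, 72)
print(low_mem_idle, high_mem_idle)
```

With 2GB/core the node runs out of memory after two jobs and a third of its cores sit idle, while at 3GB/core cores and memory run out together.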