Zoom information
Meeting ID: 996 1094 4232
Meeting password: 125
Invite link: https://uchicago.zoom.us/j/99610944232?pwd=ZG1BMG1FcUtvR2c2UnRRU3l3bkRhQT09
Updates on US Tier-2 centers
The UM site had IPv6 issues after Merit's hardware maintenance, so we took the UM condor cluster offline to avoid failing more jobs. Merit resolved the issue the next day.
We found a subdirectory ownership issue on the condor-ce that had been causing 20% of SAM test jobs to fail (the site had only 75% Reliability and Availability in May). The ownership issue was introduced in late April while we were trying to fix the ownership of the gratia directories.
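An ownership problem like the one above can be spotted early by scanning a directory tree for entries whose owner differs from the expected one. The following is a minimal sketch, not the procedure used at the site: the function name `find_wrong_owner` is hypothetical, and the directory to scan and the expected owner are assumptions that the caller supplies.

```python
import os
from pathlib import Path


def find_wrong_owner(root, expected_uid):
    """Return the paths under root (root included) not owned by expected_uid."""
    root = Path(root)
    candidates = [root]
    # Collect every directory and file below root.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            candidates.append(Path(dirpath) / name)
    # lstat inspects the entry itself and does not follow symlinks.
    mismatched = [str(p) for p in candidates if os.lstat(p).st_uid != expected_uid]
    return sorted(mismatched)
```

Running such a check right after any bulk ownership change would list the affected entries before test jobs start to fail.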
yum update of all packages on the AF login and head nodes, including the latest mainline kernel from ELRepo.
In the Calico network configuration, the change to the IP_AUTODETECTION_METHOD parameter (the likely suspect) was being accepted, but the Calico pod on the master node did not show the update propagating correctly (something appeared to be overriding the change).
Lincoln suggested that it might be the Calico operator running in the background, and indeed, making the update at the operator level flipped that pod to healthy. All K8s components are now healthy, though I still have submitted jobs waiting in the ContainerCreating state. I think I know the reason and am working on a fix.
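When Calico is managed by the Tigera operator, the operator reconciles the node configuration from its Installation resource, so a manual change to IP_AUTODETECTION_METHOD can be reverted. The sketch below builds a merge patch for the operator-level setting `spec.calicoNetwork.nodeAddressAutodetectionV4`. It assumes the Tigera operator is the one in use; the helper name `build_autodetect_patch` and the interface value `eth0` are purely illustrative, since the notes do not state which autodetection method was chosen.

```python
import json


def build_autodetect_patch(interface_regex):
    """Build a JSON merge patch setting IPv4 autodetection by interface name."""
    patch = {
        "spec": {
            "calicoNetwork": {
                # Operator-level counterpart of IP_AUTODETECTION_METHOD=interface=<regex>
                "nodeAddressAutodetectionV4": {"interface": interface_regex}
            }
        }
    }
    return json.dumps(patch)


if __name__ == "__main__":
    # "eth0" is a placeholder, not the value used on the cluster.
    print(build_autodetect_patch("eth0"))
```

The resulting JSON could be passed to `kubectl patch installation default --type merge -p '<json>'`, assuming the Installation resource carries the conventional name `default`.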