In the Calico network configuration, the change to the IP_AUTODETECTION_METHOD parameter (the likely suspect) was being accepted, but the Calico pod on the master node showed that the update was not actually taking effect (it looked like something was overriding the change).
Lincoln suggested that it might be the Calico operator running in the background, and indeed, making the update at the operator level flipped that pod to healthy. Right now all K8s components are healthy, though I still have submitted jobs stuck in the ContainerCreating state. I think I know the reason and am working on a fix.
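
For reference, here is a rough sketch of what "making the update at the operator level" can look like. When the Calico (Tigera) operator manages the install, it reconciles the calico-node DaemonSet from the Installation resource, which is most likely why a direct edit of IP_AUTODETECTION_METHOD kept getting reverted. The sketch only builds and prints a `kubectl patch` command; it does not run it. The resource name `default` and the interface pattern `eth.*` are assumptions for illustration, not the values used on this cluster.

```python
import json
import shlex


def build_autodetect_patch(method, value):
    """Build a merge patch for the operator's Installation resource.

    `method` is one of the autodetection keys the operator accepts under
    spec.calicoNetwork.nodeAddressAutodetectionV4 (for example "interface",
    "canReach", "firstFound"). The value passed here is a placeholder.
    """
    return {
        "spec": {
            "calicoNetwork": {
                "nodeAddressAutodetectionV4": {method: value}
            }
        }
    }


def kubectl_patch_command(patch, name="default"):
    """Return the kubectl argv that would apply the patch.

    Assumes the Installation resource is called "default", which is the
    usual name but should be checked on the cluster first.
    """
    return [
        "kubectl", "patch", "installations.operator.tigera.io", name,
        "--type=merge", "-p", json.dumps(patch),
    ]


if __name__ == "__main__":
    # "eth.*" is a hypothetical interface regex; substitute the real one.
    patch = build_autodetect_patch("interface", "eth.*")
    print(shlex.join(kubectl_patch_command(patch)))
```

After a patch like this is applied, the operator should roll the calico-node pods with the new setting, so the change persists instead of being overwritten.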