OU:
SWT2_CPB:
Network
Met with campus networking to discuss plans for network upgrade.
Ongoing internal discussions and planning for internal network improvements.
EL9 Migration
Condor-CE, Slurm, and the worker nodes have been running smoothly since their major EL9 upgrades.
Have been consistently running roughly 18K job slots.
Production jobs have been experiencing very low error rates.
Discussing and planning next steps.
Transfer Issue
Discovered transfer requests incorrectly using SWT2_DATADISK as both the source and destination, causing errors.
Ivan connected us with ACT experts for support. Waiting for further details.
Harvester Issue - Drain
The site was drained on Wednesday (2/5) due to an issue with one of the harvesters. Our site remained drained roughly twelve hours longer than other sites.
We started receiving jobs again on Friday (2/7).
No changes were made before or during the issue; it resolved on its own.
Waiting for expert analysis to determine the cause.
GGUS Tickets
Continuing to work with campus networking to address this. We previously held a meeting, opened a ticket in their system so they can better track our request, and have been following up regularly. Awaiting further assistance from their team.
168756
Waiting on more information from ESnet and a contact at the state network provider (LEARN). They have concluded the issue is likely in the DE cloud routing.
Storage
Continuing work on storage deployment.