WBS 2.3.1.2 Tier-1 Infrastructure - Jason
WBS 2.3.1.3 Tier-1 Compute - Tom
- Testing Condor v24 LTS configuration on gridgk03
- Some issues with jobs being evicted after 2 hours. Condor developers have been contacted and are providing support
- All WNs upgraded to Condor 24.0 LTS and Alma Linux 9.5; operation of the workers has been smooth
WBS 2.3.1.4 Tier-1 Storage - Carlos
- Database hardware issue affecting the PinManager, Bulk, TransferManager and SpaceManager services
- Degradation of service mainly affecting WRITEs (02/01/25 5PM EST)
- Service recovered 02/02/25
- Synchronizing the internal accounting (SpaceManager) tables after restoring the service
- Enabling jumbo frames on all doors and storage servers for ongoing Capabilities testing
- Bulk service restarted on 02/09/25
- 130k staging requests stuck in QUEUE state
- After the service was restarted, the requests were submitted to HPSS and the entire workflow is working as expected. A follow-up ticket was opened with the dCache developers: https://github.com/dCache/dcache/issues/7746
WBS 2.3.1.4 Tier-1 Operations & Monitoring - Ivan
- Occupancy: 92%, A/R: 100%
- Occupancy is lower than expected due to:
- 2/5/25: Site was emptied for several hours due to Harvester DB lock timeouts.
- 2/1/25: The problem mentioned in the storage section above