- Added a reserved queue to the HTCondor batch system. It currently consists of one node and lets an analysis team run whole-node jobs for the MadGraph application.
- Working on stabilizing the Ceph file system. It has stability issues that warrant further investigation; very high memory usage on the MDS is observed during incidents.
- Volume mounts are monitored, with alerting, on both the HTCondor workers and the interactive login nodes.
- Will work on the rook-ceph upgrade. The last attempt ran into trouble because of some K8s deprecations. It appears that at least the newer Ceph version (v17) has an alertable metric (slow MDS ops) for a condition we usually observe during incidents.
- Will also update the OS.
- The AnalysisBase image has been updated to the latest release, 24.2.37, and all the libraries have been updated. It now uses a dev version of uproot with many fixes for reading PHYSLITE data files.
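
For reference, the volume mount check mentioned above could look roughly like the sketch below. This is only an illustration, not the monitoring actually deployed: the function names and the example mount points are hypothetical, and it assumes a Linux host where `/proc/mounts` lists the active mounts.

```python
from typing import Iterable, List, Set


def mounted_points(proc_mounts_text: str) -> Set[str]:
    """Return the mount points listed in /proc/mounts-style text.

    Each line has the form: <device> <mount point> <fstype> <options> ...
    Note: /proc/mounts escapes spaces in paths as \\040; this sketch
    does not decode them.
    """
    points = set()
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            points.add(fields[1])
    return points


def missing_mounts(required: Iterable[str], proc_mounts_text: str) -> List[str]:
    """Return the required mount points that are not currently mounted."""
    present = mounted_points(proc_mounts_text)
    return sorted(p for p in required if p not in present)


def check_host(required: Iterable[str], path: str = "/proc/mounts") -> List[str]:
    """Read the mount table from `path` and report missing mounts."""
    with open(path) as f:
        return missing_mounts(required, f.read())
```

A worker or login node could call `check_host(["/data", "/home"])` periodically (the paths here are placeholders) and raise an alert whenever the returned list is non-empty.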