Resulted in failed transfers/job uploads/downloads, and, most annoyingly, in file loss (GGUS 683184)
Due to race conditions between https and root protocols
There should be a way to mitigate this, e.g.
Poison the DNS entry for webdav.echo.stfc.ac.uk on WNs (for LHCb jobs at least), so that it is redirected to the local gateway as well
Introduce some locking/protection (e.g. do not execute a delete if it arrived more than N minutes ago)?
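The "do not execute a delete if it arrived more than N minutes ago" idea could look roughly like the sketch below. This is only an illustration: the function name, the 10-minute value for N and the assumption that each delete request carries its arrival timestamp are all hypothetical and do not describe existing gateway or XRootD behaviour.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical value for "N minutes"; the real threshold would need tuning.
MAX_DELETE_AGE = timedelta(minutes=10)


def should_execute_delete(received_at, now=None, max_age=MAX_DELETE_AGE):
    """Return True only if the delete request is recent enough to act on.

    received_at: timezone-aware time at which the delete request arrived.
    now: current time (injectable for testing); defaults to the UTC clock.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # A delete that has been queued for longer than max_age is dropped,
    # since the file may have been rewritten via the other protocol meanwhile.
    return now - received_at <= max_age
```

A dropped delete would leave a stale file behind rather than remove a freshly written one, which is the safer failure mode for the file-loss case above.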
4
ALICE Operations Report
Speaker:
Alexander Rogovskiy (Rutherford Appleton Laboratory)
5
LSST Operations Report
Speakers:
Mathew Sims, Timothy Noble (Science and Technology Facilities Council STFC (GB))
LSST jobs from the latest pipeline are failing.
We think this is currently due to the job pulling in the instructions (QG), writing any changes out to Echo, then reading locally and complaining that its changes are not there. Contacting the Middleware team about this.
Moving data to the 'correct' location on Echo: started on Friday, 220,000 files moved so far. Originals and copies are checksummed, and the original is deleted if the checksums match.
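The verify-then-delete step of the data move can be sketched as follows. This is a minimal local-filesystem stand-in, not the tooling actually used: the real move operates on Echo objects, and the function names and the choice of SHA-256 here are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_checksum(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def delete_original_if_copy_matches(original, copy):
    """Delete the original only when the copy has an identical checksum.

    Returns True if the original was deleted, False if it was kept.
    """
    if file_checksum(original) != file_checksum(copy):
        return False  # mismatch: keep the original for investigation
    Path(original).unlink()
    return True
```

Keeping the original on any mismatch means a failed or partial copy never results in data loss.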
14:00
Tier-1 Projects
6
Antares Upgrade
New EOS nodes
Tape Robotics downtime
Speakers:
George Patargias, Thomas Byrne
7
XRootD Development
Speakers:
Alexander Rogovskiy (Rutherford Appleton Laboratory), Jyothish Thomas (STFC)
8
Utilizing GPUs
Speakers:
Jyoti Prakash Biswal (Rutherford Appleton Laboratory), Thomas Birkett
14:45
AOB
9
Summary of Operational Status and Issues
Speakers:
Brian Davies (Lancaster University (GB)), Darren Moore