RAL Tier1 Experiments Liaison Meeting
Europe/London
Access Grid (RAL R89)
-
13:30
Experiment Operational Issues
-
1
ATLAS Operations Report
Speakers: Brij Kishor Jashal (Rutherford Appleton Laboratory), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
-
2
CMS Operations Report
Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))
Apologies from Katy - attending the OTF.
Good SAM tests this week. There was a short period of failures on Antares on Tuesday evening, caused by the buffer filling up. There was a big spike in both write successes and write failures; CMS still achieved very high throughput to the buffer, and files were quickly archived.
SAM test warnings fixed for the 'squid' tests - thanks Alex for tracking this!
Still seeing the 'basic' token test in warning when it lands on the 2018/2019 WN tranches (currently around 50% of the time). This is due to missing IPv6 on those nodes, but CMS seems happy to keep the test in warning.
Good performance for CMS jobs - best CPU efficiency among T1s for this week again!
-
3
LHCb Operations Report
Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
News:
- LHCb is going to run some test MC productions on ARM
- at CERN and Glasgow
Operational issues:
- Jobs failed to get files from Echo via local gateways (GGUS 683524).
- Connectivity issue between the 2018/2019 generation nodes and a new storage node; now fixed
- Last Friday a slow OSD also contributed to this issue
- It was removed from the cluster, which fixed the issue
- Failed uploads to Echo last Friday (GGUS 683588).
- One of the gateways became problematic due to stuck connections. It was fixed by restarting the gateway. Ticket closed.
-
4
ALICE Operations Report
Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
-
5
LSST Operations Report
Speakers: Mathew Sims, Timothy Noble (Science and Technology Facilities Council STFC (GB))
- First Look event on Monday 23rd June
- https://www.bbc.co.uk/news/articles/cj3rmjjgx6xo
- https://www.youtube.com/live/dF1g-Ru8mjM
- https://rubinobservatory.org/news/first-imagery-rubin
- https://rubinobservatory.org/explore/citizen-science
- Data should now be coming out of embargo, so there will be more data to move to the UK going forward
- RC2 runs have not been succeeding because jobs produce no outputs after 2 hours
- This may boil down to an issue similar to (if not exactly the same as) the Quantum Graph issue we saw before
- Jobs are not pulling data onto the WN, just remote-reading small portions over and over (peak was over 250,000 times for one job)
- Coordinating with Jyothish on the XRootD monitoring that died, in order to investigate file-by-file access patterns
-
14:00
Tier-1 Projects
-
6
Antares Upgrade
New EOS nodes
Repack Progress
Speakers: George Patargias, Thomas Byrne
-
7
-
8
Utilizing GPUs
Speakers: Jyoti Prakash Biswal (Rutherford Appleton Laboratory), Thomas Birkett
-
14:45
AOB
-
9
Summary of Operational Status and Issues
Speakers: Brian Davies (Lancaster University (GB)), Darren Moore
-
10
Any Other Business
Speakers: Brian Davies (Lancaster University (GB)), Darren Moore