RAL Tier1 Experiments Liaison Meeting
Access Grid
RAL R89
-
-
13:00
→
13:01
Major Incidents Changes 1m
-
13:01
→
13:02
Summary of Operational Status and Issues 1m Speakers: Brian Davies (Lancaster University (GB)), Darren Moore (Science and Technology Facilities Council STFC (GB))
-
13:02
→
13:03
GGUS /RT Tickets 1m
https://tinyurl.com/T1-GGUS-Open
https://tinyurl.com/T1-GGUS-Closed
-
13:04
→
13:05
Site Availability 1m
https://lcgwww.gridpp.rl.ac.uk/utils/availchart/
https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T1_UK_RAL
http://hammercloud.cern.ch/hc/app/atlas/siteoverview/?site=RAL-LCG2&startTime=2020-01-29&endTime=2020-02-06&templateType=isGolden
-
13:05
→
13:06
Experiment Operational Issues 1m
-
13:15
→
13:16
VO Liaison CMS 1m Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))
DNS issues are more serious than ever. From 7am Tuesday until now there have been some gridftp failures and a mixture of webdav failures and blank tests.
New SAM tests have started to appear, and this is a particularly bad time for that to happen. The new tests are not yet contributing to the site status. Several of them are failing despite efforts in recent weeks to prepare for them, mostly because of extra requirements relating to Echo as an object store. The changes requested for these have been rolled out on all of the gateways as of today. https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=480520&results=33c566c2c67a948ee5b93ccb14d96eaa
There are also new SAM tests for xrootd/AAA endpoints, which are all green except one for token support. NB: all 'token' type tests are currently expected to fail; I see these are failing at every Tier 1.
Jobs: still running ok. Some failure spikes, but mainly in line with other T1s. Efficiency has been a little up and down over the last few days, but we are doing ok.
Looking to change the FTS config (all instances) from default to a couple of hundred min/max active transfers. Will copy the ATLAS numbers.
-
13:16
→
13:17
VO-Liaison ATLAS 1m Speakers: James William Walder (Science and Technology Facilities Council STFC (GB)), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
-
13:20
→
13:21
VO Liaison LHCb 1m Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
- Network issues significantly affected LHCb transfers last Wednesday and yesterday.
- Last Wednesday LHCb updated their xrootd client. It is now built against OpenSSL 3 and is incompatible with xrootd server v5.3.3, which is used on RAL's WNs.
- changes were reverted last Friday
- Urgent update was requested
- Corresponding sandbox was merged today
- Dark and Lost data on Antares were found.
- Several hundred files
- Files seem to have been lost before the CASTOR -> Antares migration
- Vector read issue.
- Tests are ongoing, looking good so far
- ~200 successful user jobs on the test WNs, not a single failure due to vector read
- Problems with accessing one file simultaneously.
- Glasgow patch is to be applied to RAL's gateways, sandbox is ready.
- Problems with IPv6 connectivity for LHCb VO-box
- Firewall changes requested.
-
13:25
→
13:28
VO Liaison LSST 3m Speaker: Timothy John Noble (Science and Technology Facilities Council STFC (GB))
-
13:30
→
13:31
VO Liaison Others 1m
-
13:31
→
13:32
AOB 1m
-
13:32
→
13:33
Any other Business 1m Speakers: Brian Davies (Lancaster University (GB)), Darren Moore (Science and Technology Facilities Council STFC (GB))