RAL Tier1 Experiments Liaison Meeting
Access Grid
RAL R89
-
-
13:00
Major Incidents Changes
-
1
Summary of Operational Status and Issues
Speakers: Brian Davies (Lancaster University (GB)), Darren Moore (Science and Technology Facilities Council STFC (GB)), Kieran Howlett (STFC RAL)
-
2
GGUS/RT Tickets
https://tinyurl.com/T1-GGUS-Open
https://tinyurl.com/T1-GGUS-Closed
-
3
Site Availability
https://lcgwww.gridpp.rl.ac.uk/utils/availchart/
https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T1_UK_RAL
http://hammercloud.cern.ch/hc/app/atlas/siteoverview/?site=RAL-LCG2&startTime=2020-01-29&endTime=2020-02-06&templateType=isGolden
-
13:05
Experiment Operational Issues
-
4
VO Liaison CMS
Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))
CMS using up to 12k cores - performance looks good. WN network usage spiked at 50G.
Tested xrootd endpoints under redirector - all seems to be working well.
CMS is still not using the redirector (or WebDAV) for job uploads from WNs. Sticking with gsiftp until the gateways etc. look more stable. More gateways are expected imminently.
Memory allocation type errors prompted Tom Birkett to increase the amount of memory available to the xrootd-proxy container on WNs.
Running some CMS/UK pre-tests for DC24. Waiting for the new gateways at RAL before doing T1?
-
5
VO Liaison ATLAS
Speakers: James William Walder (Science and Technology Facilities Council STFC (GB)), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
- RAL-LCG2 staging issue “File not found” (https://ggus.eu/index.php?mode=ticket_info&ticket_id=162827); on hold:
- https://its.cern.ch/jira/browse/EOS-5771
- Actions from the Antares team:
- A thread was opened in the EOS forum (https://eos-community.web.cern.ch/t/new-citrine-release-4-8-104/892/2) requesting a new EOS Citrine release, 4.8.104, that includes the GitLab commit (https://gitlab.cern.ch/dss/eos/-/commit/7c04411e6d6036ae6854dfdf9b0ab4eb319dcdd7) that fixes the issue.
- Waiting for the CTA project leader and the EOSCTA service manager to be back in the office (about two weeks).
- Once the new EOS Citrine release, 4.8.104, is out, it can be pushed to Antares (following some minimal functional testing).
- Rucio was broken for some time yesterday (~10:05-11:35).
- HammerCloud test failures.
- e.g., http://bigpanda.cern.ch/job?pandaid=5932541395 ==> modificationHost: slot1_2@lcg2662.gridpp.rl.ac.uk pilot:::1099 Failed to stage-in file: mc15_13TeV:EVNT.04972714._000037.pool.root.1 from RAL-LCG2-ECHO_DATADISK, '<' not supported between instances of 'int' and 'NoneType'")]:failed to transfer files using copytools=['rucio']
- ATLAS jobs failing with: "failed to close file descriptor: bad file descriptor": https://stfc.atlassian.net/browse/GS-131
- Propose to create a GGUS so that any progress can be more visibly tracked and discussed.
- When are we going to have the discussion @Tom Birkett?
- atlas:test file dumps: https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=485639 [since 12 July 2023].
- Changes made on 10 August 2023.
- No new dumps yet (checked with gfal-stat davs://webdav.echo.stfc.ac.uk:1094/atlas:test/dumps/dump_yyyymmdd).
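The HammerCloud stage-in failure quoted above ends in a Python TypeError. As a minimal sketch (not the pilot or Rucio code; the function names and the "no limit" policy are hypothetical), this shows how that message arises when an int is compared with a value that was never set, and one way a None guard avoids it:

```python
from typing import Optional


def within_limit(size: int, limit: Optional[int]) -> bool:
    """Return True if size is below limit.

    Hypothetical policy for illustration: an unset limit (None) means
    "no restriction", so the comparison is skipped entirely.
    """
    if limit is None:
        return True
    return size < limit


def reproduce_error() -> str:
    """Trigger the same class of error as in the log and return its message."""
    limit = None  # stands in for a value that was never filled in
    try:
        _ = 1024 < limit  # int compared with None raises TypeError in Python 3
    except TypeError as exc:
        return str(exc)
    return ""


if __name__ == "__main__":
    print(reproduce_error())
    print(within_limit(1024, None))
```

Where the None actually originates in the failing jobs is not stated in the log excerpt, so this only illustrates the error class.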
-
6
VO Liaison LHCb
Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
-
7
VO Liaison LSST
Speaker: Timothy John Noble (Science and Technology Facilities Council STFC (GB))
-
8
VO Liaison Others
-
13:31
AOB
-
9
Any Other Business
Speakers: Brian Davies (Lancaster University (GB)), Darren Moore (Science and Technology Facilities Council STFC (GB))
-