RAL Tier1 Experiments Liaison Meeting

Name: RAL Tier1 Experiments Liaison Meeting
Start: 2021-04-07T13:30:00+01:00
End: 2021-04-07T14:30:00+01:00
Location: RAL R89

Wednesday 7 Apr 2021, 13:30 → 14:30 Europe/London

Access Grid (RAL R89)

Description

Please attend via the following Zoom meeting:

https://ukri.zoom.us/j/98562731547?pwd=UU9Wb2xCL05tWmROT1h6SUlWdUJ3dz09

- 13:38 → 13:39
  
  Major Incidents Changes 1m
- 13:39 → 13:40
  
  Summary of Operational Status and Issues 1m
  
  Speakers: Brian Davies (Lancaster University (GB)), Darren Moore (Science and Technology Facilities Council STFC (GB))
- 13:40 → 13:41
  
  GGUS/RT Tickets 1m
  
  https://tinyurl.com/T1-GGUS-Open
  https://tinyurl.com/T1-GGUS-Closed
- 13:41 → 13:42
  
  Site Availability 1m
  
  https://lcgwww.gridpp.rl.ac.uk/utils/availchart/
  
  https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T1_UK_RAL
  
  http://hammercloud.cern.ch/hc/app/atlas/siteoverview/?site=RAL-LCG2&startTime=2020-01-29&endTime=2020-02-06&templateType=isGolden
- 13:42 → 13:43
  
  Experiment Operational Issues 1m
- 13:44 → 13:45
  
  VO-Liaison ATLAS 1m
  
  Minutes
  
  Speakers: James William Walder (Science and Technology Facilities Council STFC (GB)), Dr Tim Adye (Science and Technology Facilities Council STFC (GB))
  
  Updated Echo allocations for FY 21/22
  - https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=406939
  - Looks like no issues; will update the ATLAS space JSON today.
  
  GGUS-Ticket-ID: #151098 "IN PROGRESS" "NGI_UK" "High failure rate at RAL-LCG2_TEST"
  * An interaction between Docker and the pilot may be causing Docker to terminate unexpectedly.
    That is, after job 1 the pilot tries to remove any orphaned processes with a kill signal,
    and may be killing 'something' that terminates the Docker job (HTCondor receives an ExitReason = “died on signal 9 (Killed)”).
  - If confirmed, ... ?
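
As a rough illustration of the hypothesis in ticket #151098, the sketch below shows how a parent process sees a child that was removed with SIGKILL. It is not the pilot's or HTCondor's code; the helper names `describe_exit` and `run_and_kill` are hypothetical, and the example assumes a POSIX host where Python's `subprocess` reports death by signal as a negative return code.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable reason."""
    if returncode < 0:
        # On POSIX, subprocess reports death-by-signal as the negated signal number.
        sig = signal.Signals(-returncode)
        return f"died on signal {sig.value} ({sig.name})"
    return f"exited with status {returncode}"


def run_and_kill() -> int:
    """Start a long-running child, SIGKILL it, and return its return code."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    # Stand-in for a cleanup step that kills a process it did not mean to (assumption).
    child.send_signal(signal.SIGKILL)
    return child.wait()


if __name__ == "__main__":
    print(describe_exit(run_and_kill()))
```

A SIGKILL cannot be caught or handled by the target, so the child gets no chance to log anything; the only evidence is the exit reason recorded by whatever was waiting on it, which would be consistent with the batch system reporting signal 9 without further detail.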
- 13:46 → 13:47
  
  VO Liaison CMS 1m
  
  Minutes
  
  Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))
  
  Still running 3k cores. Hoping to increase that number this week. Job failures are ok and efficiency is 40-60% this week.
  
  SAM tests look better, with fewer 'missing' tests. ARC-CE01 had no test results for 24 hours after a reboot of that machine, but a second reboot seems to have fixed that.
  
  Talked to James A during the meeting and he agreed to increase the number of CMS jobs running on the newest hardware (Dell19 tranche). This number may have been reduced recently (since the end of February) because single-core jobs took over the slots, whereas CMS jobs only run multicore.
- 13:48 → 13:49
  
  VO Liaison LHCb 1m
  
  Speaker: Raja Nandakumar (Science and Technology Facilities Council STFC (GB))
- 13:52 → 13:53
  
  VO Liaison Others 1m
- 13:53 → 13:54
  
  Experiment Planning 1m
- 13:54 → 13:55
  
  Dune/protoDune 1m
- 13:55 → 13:56
  
  Euclid 1m
- 13:56 → 13:57
  
  SKA 1m
- 13:57 → 13:58
  
  AOB 1m
- 13:58 → 13:59
  
  Any Other Business 1m
  
  Speakers: Brian Davies (Lancaster University (GB)), Darren Moore (Science and Technology Facilities Council STFC (GB))
