RAL Tier1 Experiments Liaison Meeting

Europe/London
Access Grid (RAL R89)

Zoom Meeting ID
66811541532
Host
Alastair Dewhurst
    • 13:00 13:01
      Major Incidents Changes 1m
    • 13:01 13:02
      Summary of Operational Status and Issues 1m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore (Science and Technology Facilities Council STFC (GB))
    • 13:02 13:03
      GGUS/RT Tickets 1m

      https://tinyurl.com/T1-GGUS-Open
      https://tinyurl.com/T1-GGUS-Closed

    • 13:04 13:05
      Site Availability 1m

      https://lcgwww.gridpp.rl.ac.uk/utils/availchart/

      https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T1_UK_RAL

      http://hammercloud.cern.ch/hc/app/atlas/siteoverview/?site=RAL-LCG2&startTime=2020-01-29&endTime=2020-02-06&templateType=isGolden

    • 13:05 13:06
      Experiment Operational Issues 1m
    • 13:15 13:16
      VO Liaison CMS 1m
      Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))

      DNS issues are more serious than ever. From 7am Tuesday until now there have been some gridftp failures and a mixture of webdav failures and blank tests.

      New SAM tests have started to appear, and this is a particularly bad time for that to happen. The new tests are not yet contributing to the site status. Several of them are failing despite efforts in recent weeks to prepare for them (mostly extra requirements relating to Echo as an object store). The changes requested for these have been rolled out on all of the gateways as of today. https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=480520&results=33c566c2c67a948ee5b93ccb14d96eaa

      There are also new SAM tests for xrootd/AAA endpoints, which are all green except one for token support. NB: all failing 'token' type tests are expected to fail; I see these are failing at every Tier 1.

      Jobs: still running OK. Some failure spikes, but mainly in line with other T1s. Efficiency has been a little up and down over the last few days, but we are doing OK.

      Looking to change the FTS config (all instances) from the default to a couple of hundred min/max active transfers. Will copy the ATLAS numbers.

       

    • 13:16 13:17
      VO-Liaison ATLAS 1m
      Speakers: James William Walder (Science and Technology Facilities Council STFC (GB)), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
    • 13:20 13:21
      VO Liaison LHCb 1m
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
      1. Network issues significantly affected LHCb transfers last Wednesday and yesterday.
      2. Last Wednesday LHCb updated their xrootd client; it is now built against OpenSSL 3 and is incompatible with xrootd server v5.3.3, which is used on RAL's WNs.
        • changes were reverted last Friday
        • Urgent update was requested
          • Corresponding sandbox was merged today
      3. Dark and lost data were found on Antares.
        • Several hundred files
        • Files seem to have been lost before the CASTOR -> Antares migration
      4. Vector read issue.
        • Tests are ongoing, looking good so far
          • ~200 successful user jobs on the test WNs, not a single failure due to vector read
      5. Problems with simultaneous access to a single file.
        • Glasgow patch is to be applied to RAL's gateways, sandbox is ready.
      6. Problems with IPv6 connectivity for LHCb VO-box.
        • Firewall changes requested.

       

    • 13:25 13:28
      VO Liaison LSST 3m
      Speaker: Timothy John Noble (Science and Technology Facilities Council STFC (GB))
    • 13:30 13:31
      VO Liaison Others 1m
    • 13:31 13:32
      AOB 1m
    • 13:32 13:33
      Any other Business 1m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore (Science and Technology Facilities Council STFC (GB))