RAL Tier1 Experiments Liaison Meeting

Europe/London
Access Grid (RAL R89)

Zoom Meeting ID: 66811541532
Host: Alastair Dewhurst
    • 13:30 13:34
      Site Operations 4m
    • 13:34 13:35
      Experiment Operational Issues 1m
    • 13:35 13:40
      ATLAS Operations Report 5m
      Speakers: Brij Kishor Jashal (Rutherford Appleton Laboratory), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
    • 13:40 13:45
      CMS Operations Report 5m
      Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))

      A quiet week.

      There are still some intermittent failures on the CEs (missing tests, "execution timeout"), with a cluster of them this morning on several CEs (ignoring CE01, which is being updated and is in downtime).

      There was a small cluster of failures on MC Production jobs on 23rd Feb; 96% of the failures were attributed to exit code 81:

      • 81 - Job did not find functioning CMSSW on worker node.
    • 13:45 13:50
      LHCb Operations Report 5m
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)

      News: 

      • LHCb can now submit production ARM jobs
        • Glasgow is already being used. Are we interested?

       

      Operational issues:

      • Problems with sandbox store at CERN last Wednesday
        • Resulted in a spike of Rescheduled jobs
      • A couple of buggy productions submitted last week.
      • Transfer failures from Manchester and RHUL to RAL (GGUS:1001702)
        • It sometimes takes 10-15 seconds to establish an SSL connection to the ECHO gateways from Manchester WNs via IPv6
        • Investigation is ongoing
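One way to quantify the slow connections described above is to time the TCP connect and the TLS handshake separately, once over IPv6 and once over IPv4. The following is a minimal sketch, not the method used in the ticket. It assumes the gateway endpoint speaks TLS directly on connect (for example HTTPS/WebDAV); the host name in the comment is a placeholder, and `tls_handshake_seconds` and `slow_fraction` are hypothetical helper names.

```python
import socket
import ssl
import time


def tls_handshake_seconds(host, port, family=socket.AF_INET6, timeout=30.0):
    """Return (tcp_seconds, tls_seconds) for a single connection attempt."""
    # Resolve only addresses of the requested family (AF_INET6 or AF_INET).
    infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    af, socktype, proto, _canonname, sockaddr = infos[0]

    context = ssl.create_default_context()
    # Assumption: only timings matter here, so certificate checks are disabled.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    raw = socket.socket(af, socktype, proto)
    raw.settimeout(timeout)
    try:
        t0 = time.monotonic()
        raw.connect(sockaddr)
        t1 = time.monotonic()
        # wrap_socket performs the TLS handshake on an already connected socket.
        with context.wrap_socket(raw, server_hostname=host):
            t2 = time.monotonic()
    finally:
        raw.close()
    return t1 - t0, t2 - t1


def slow_fraction(handshake_seconds, threshold=10.0):
    """Fraction of samples at or above the threshold (in seconds)."""
    if not handshake_seconds:
        return 0.0
    slow = sum(1 for s in handshake_seconds if s >= threshold)
    return slow / len(handshake_seconds)


# Hypothetical usage from a worker node (placeholder host and port):
# samples = [tls_handshake_seconds("gateway.example.org", 443)[1] for _ in range(20)]
# print(slow_fraction(samples))
```

Running the same loop with `family=socket.AF_INET` would show whether the delay is specific to IPv6, and the split between the two timings would show whether it sits in the TCP connect or in the handshake itself.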
    • 13:50 13:55
      ALICE Operations Report 5m
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
    • 13:55 14:00
      LSST Operations Report 5m
      Speakers: Mathew Sims, Timothy Noble (Science and Technology Facilities Council STFC (GB))
      • Jobs submitted to RAL are stuck in the 'Activated' state. This is because CE01 is in downtime and only one CE is configured for some LSST PanDA queues. I have requested that this be fixed.
      • Issues with the IPv6 ranges in the S3 bucket policy prevented some jobs from running. This is now resolved by including the IPs in the policy.
      • Nearly 6 TB moved to RAL, bringing the total to 694 TB of 10133 TB (17% of pledge, or 7% of the Echo reported allocation). The only concern is that LANCS has reported that outputs from a dp2_prep run filled around 1.5 TB.
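As a quick check of the storage percentages quoted above, the sketch below recomputes the usage fraction from the two figures given in the report. It assumes that 10133 TB is the Echo reported allocation, since that is the denominator that reproduces the quoted 7%. The pledge figure itself is not stated in the report, so it is not computed here. `percent_of` is a hypothetical helper name.

```python
def percent_of(used_tb, capacity_tb):
    """Return used_tb as a percentage of capacity_tb."""
    return 100.0 * used_tb / capacity_tb


stored_tb = 694             # total stored at RAL, from the report
echo_allocation_tb = 10133  # assumed to be the Echo reported allocation

print(f"{percent_of(stored_tb, echo_allocation_tb):.1f}%")  # prints 6.8%
```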
    • 14:00 14:01
      Tier-1 Projects 1m
    • 14:02 14:07
      Antares Upgrade 5m
      Speakers: George Patargias, Thomas Byrne
    • 14:08 14:13
      XRootD Development 5m
      Speakers: Alexander Rogovskiy (Rutherford Appleton Laboratory), Jyothish Thomas (STFC)

      A new script for file dumps is available.
      It would be helpful if the VOs' requested formats could be added at:
      https://stfc.atlassian.net/wiki/spaces/X/pages/1116340236/VO+filedump+formats

    • 14:14 14:19
      Utilizing GPUs 5m
      Speakers: Jyoti Prakash Biswal (Rutherford Appleton Laboratory), Thomas Birkett
    • 14:20 14:25
      SSD Storage Evaluation 5m
    • 14:25 14:26
      AOB 1m
    • 14:27 14:36
      Summary of Operational Status and Issues 9m
      Speakers: Brian Davies (Science and Technology Facilities Council STFC (GB)), Darren Moore
    • 14:45 14:50
      Any other Business 5m
      Speakers: Brian Davies (Science and Technology Facilities Council STFC (GB)), Darren Moore