RAL Tier1 Experiments Liaison Meeting

Europe/London
Access Grid (RAL R89)

Zoom Meeting ID
66811541532
Host
Alastair Dewhurst
    • 1
      Site Operations
    • 13:34
      Experiment Operational Issues
    • 2
      ATLAS Operations Report
      Speakers: Brij Kishor Jashal (Rutherford Appleton Laboratory), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
    • 3
      CMS Operations Report
      Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))

      A quiet week.

      Still some intermittent failures on the CEs (missing tests, "execution timeout"), with a cluster of them this morning on several CEs (ignoring CE01, which is being updated and is in downtime).

      A small cluster of failures on MC Production jobs on 23rd Feb; 96% of the failures were attributed to exit code 81:

      • 81 - Job did not find functioning CMSSW on worker node.
    • 4
      LHCb Operations Report
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)

      News: 

      • LHCb can now submit production ARM jobs
        • Glasgow is already used. Are we interested?

       

      Operational issues:

      • Problems with sandbox store at CERN last Wednesday
        • Resulted in a spike of Rescheduled jobs
      • A couple of buggy productions submitted last week.
      • Transfer failures from Manchester and RHUL to RAL (GGUS:1001702)
        • It sometimes takes 10-15 seconds to establish an SSL connection to the ECHO gateways from Manchester WNs via IPv6
        • Investigation is ongoing
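
      The slow SSL connection setup over IPv6 reported above could be measured with something like the sketch below. It separates the TCP connect time from the TLS handshake time for a single IPv6 attempt. The host name, port and 10-second threshold are placeholders, not the real ECHO gateway details, and the default CA bundle is assumed to trust the gateway certificate.

```python
import socket
import ssl
import time


def time_tls_handshake(host, port=443, family=socket.AF_INET6, timeout=30.0):
    """Return (tcp_seconds, tls_seconds) for one connection attempt.

    host and port are placeholders; pass the real gateway details.
    """
    # Resolve only addresses of the requested family (IPv6 by default).
    infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    af, socktype, proto, _, sockaddr = infos[0]

    # Assumption: the system CA bundle trusts the server certificate.
    # For grid CAs, context.load_verify_locations(capath=...) may be needed.
    context = ssl.create_default_context()

    start = time.monotonic()
    with socket.socket(af, socktype, proto) as raw:
        raw.settimeout(timeout)
        raw.connect(sockaddr)
        connected = time.monotonic()
        # wrap_socket performs the TLS handshake on an already connected socket.
        with context.wrap_socket(raw, server_hostname=host):
            done = time.monotonic()
    return connected - start, done - connected


def slow_fraction(tls_seconds, threshold_s=10.0):
    """Fraction of handshake samples that took at least threshold_s seconds."""
    if not tls_seconds:
        return 0.0
    slow = sum(1 for t in tls_seconds if t >= threshold_s)
    return slow / len(tls_seconds)
```

      Running time_tls_handshake repeatedly from an affected WN, once with socket.AF_INET6 and once with socket.AF_INET, and feeding the handshake times to slow_fraction would show whether the delay is specific to IPv6 and whether it sits in the TCP connect or in the TLS handshake.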
    • 5
      ALICE Operations Report
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
    • 6
      LSST Operations Report
      Speakers: Mathew Sims, Timothy Noble (Science and Technology Facilities Council STFC (GB))
      • Jobs submitted to RAL are stuck in the 'Activated' state. This is because CE01 is in downtime and only one CE is configured for some LSST PanDA queues. I have requested that this be fixed
      • Some issues with IPv6 ranges in the S3 bucket policy prevented some jobs from running; now resolved by including the IPs in the policy
      • Nearly 6 TB moved to RAL, bringing the total to 694 TB of 10133 TB (17% of pledge, or 7% of the Echo-reported allocation) - the only concern is that LANCS has reported that outputs from a dp2_prep run filled around 1.5 TB
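
      For the S3 bucket policy issue noted above, one way such a problem can arise is a worker node connecting from an IPv6 address that none of the policy's allowed ranges cover. The sketch below checks a client address against a list of CIDR ranges. The ranges are documentation prefixes chosen for illustration, not the real policy contents, and the helper name is hypothetical.

```python
import ipaddress


def is_covered(client_ip, allowed_cidrs):
    """Return True if client_ip falls inside any of the allowed CIDR ranges."""
    addr = ipaddress.ip_address(client_ip)
    for cidr in allowed_cidrs:
        net = ipaddress.ip_network(cidr)
        # An IPv6 client can never match an IPv4 range, and vice versa.
        if addr.version == net.version and addr in net:
            return True
    return False


# Illustrative ranges only (RFC 5737 / RFC 3849 documentation prefixes).
IPV4_ONLY = ["192.0.2.0/24"]
DUAL_STACK = IPV4_ONLY + ["2001:db8:100::/48"]

print(is_covered("192.0.2.17", IPV4_ONLY))         # IPv4 client, IPv4 range
print(is_covered("2001:db8:100::17", IPV4_ONLY))   # IPv6 client, no IPv6 range
print(is_covered("2001:db8:100::17", DUAL_STACK))  # IPv6 range added
```

      A check like this, run against the ranges actually listed in a policy, makes it easy to spot dual-stack worker nodes whose IPv6 addresses are missing.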
    • 14:00
      Tier-1 Projects
    • 7
      Antares Upgrade
      Speakers: George Patargias, Thomas Byrne
    • 8
      XRootD Development
      Speakers: Alexander Rogovskiy (Rutherford Appleton Laboratory), Jyothish Thomas (STFC)

      New script for filedumps.
      It would be helpful if VOs could add their requested formats to:
      https://stfc.atlassian.net/wiki/spaces/X/pages/1116340236/VO+filedump+formats

    • 9
      Utilizing GPUs
      Speakers: Jyoti Prakash Biswal (Rutherford Appleton Laboratory), Thomas Birkett
    • 10
      SSD Storage Evaluation
    • 14:25
      AOB
    • 11
      Summary of Operational Status and Issues
      Speakers: Brian Davies (Science and Technology Facilities Council STFC (GB)), Darren Moore
    • 12
      Any Other Business
      Speakers: Brian Davies (Science and Technology Facilities Council STFC (GB)), Darren Moore