RAL Tier1 Experiments Liaison Meeting

Europe/London
Access Grid (RAL R89)

Zoom Meeting ID: 66811541532
Host: Alastair Dewhurst
    • 13:30 13:31
      Experiment Operational Issues 1m
    • 13:35 13:40
      ATLAS Operations Report 5m
      Speakers: Brij Kishor Jashal (Rutherford Appleton Laboratory), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
    • 13:40 13:45
      CMS Operations Report 5m
      Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))

      Monday: issues with Echo because the system was busy rebalancing onto new hardware. This caused a red day for SAM tests.

      Some periods of failing SAM tests for the AAA machines in the last 2 weeks. Each time, Katy took no action and they 'fixed themselves'.

      Antares downtime today for a CTA version upgrade. New EOS front-end nodes are now fully installed and in production. These are apparently dual-stack, but I'm not seeing any green yet on the 'connection' SAM test that checks for this. To be checked: whether any of the tests are using the new nodes.
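
      A quick way to cross-check the dual-stack claim independently of SAM is to look at which address families a node's hostname resolves to. The sketch below is illustrative only: the helper names are made up for this example, port 1094 (the usual XRootD port) is an assumption about how the nodes are reached, and a DNS answer only shows that A and AAAA records are published, not that IPv6 traffic actually gets through.

```python
import socket


def address_families(addrinfo):
    """Return the IP families present in getaddrinfo()-style results."""
    families = set()
    for family, _type, _proto, _canonname, _sockaddr in addrinfo:
        if family == socket.AF_INET:
            families.add("IPv4")
        elif family == socket.AF_INET6:
            families.add("IPv6")
    return families


def is_dual_stack(hostname, port=1094):
    """True if the hostname resolves to both an IPv4 and an IPv6 address.

    The default port is an assumption; it does not affect the DNS lookup
    itself, only the socket addresses that are returned.
    """
    info = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return address_families(info) == {"IPv4", "IPv6"}


# Usage (needs network access; substitute a real front-end hostname):
#   is_dual_stack("<eos-front-end-hostname>")
```

      Resolution is kept separate from classification so that the classification step can be checked without network access.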

    • 13:45 13:50
      LHCb Operations Report 5m
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)

      Operational issues:

      • Upload and download failures to ECHO (GGUS:)
        • Caused by ECHO issues, looks much better now
          • There was a small spike of failed uploads this morning, which may be related to a different issue
      • LHCb drained yesterday.
        • Lack of jobs, not our fault.
      • Faulty LHCbDIRAC release deployed this morning.
        • May cause drain again.
    • 13:50 13:55
      ALICE Operations Report 5m
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)

      Get test from Antares preprod is failing due to lack of free space (?). Does not affect production (since it is a preprod instance), but still interesting.

    • 13:55 14:00
      LSST Operations Report 5m
      Speakers: Mathew Sims, Timothy Noble (Science and Technology Facilities Council STFC (GB))
      • Investigating job slowness compared to other sites in LSST (RAL 3:10:12, IN2P3 1:17:53, LANCS 1:39:05)
        • IO from Echo?
          • Current thinking from LSST is that reading files from Echo takes twice as long as at other sites
            • 0.6s per file at RAL, over 0.3s at others. Not much, but over 20,000 files it adds up! (And this is a small test data set, not a real one)
              • LANCS mentioned in today's storage meeting that they are creating a pool just for LSST with 4+2 to try to improve performance for LSST
        • CPU / memory bound?
      • Request to all sites: is there a shared space at the site in which to create an SQLite database within a DAG (job of jobs) to track progress within the DAG?
        • Informed requester of the infrastructure at RAL and they seem happy that the local scratch for jobs and ECHO would be fine
      • Will begin movement of ComCam data to RAL this week 
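
      As a rough sanity check on the IO hypothesis above, the per-file read times can be scaled up to 20,000 files and compared with the wall-clock gaps between sites. This is only a back-of-envelope sketch: it treats the "over 0.3s" figure as exactly 0.3s, and it assumes the quoted job times come from the same ~20,000-file test, which the notes do not state.

```python
from datetime import timedelta


def parse_hms(text):
    """Parse an 'H:MM:SS' string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# Job wall-clock times as quoted in the report.
job_times = {
    "RAL": parse_hms("3:10:12"),
    "IN2P3": parse_hms("1:17:53"),
    "LANCS": parse_hms("1:39:05"),
}

FILES = 20_000
RAL_READ_MS = 600    # 0.6s per file at RAL
OTHER_READ_MS = 300  # assumption: "over 0.3s" taken as exactly 0.3s

# Extra time spent reading at RAL if every file costs 0.3s more.
extra_io = timedelta(milliseconds=FILES * (RAL_READ_MS - OTHER_READ_MS))
print("Estimated extra read time at RAL:", extra_io)  # 1:40:00

for site in ("IN2P3", "LANCS"):
    gap = job_times["RAL"] - job_times[site]
    print(f"RAL is slower than {site} by {gap}")
# RAL is slower than IN2P3 by 1:52:19
# RAL is slower than LANCS by 1:31:07
```

      Under those assumptions the estimated extra read time (1:40:00) is of the same order as the observed gaps, which is consistent with the IO explanation but does not rule out a CPU or memory contribution.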
    • 14:00 14:01
      Tier-1 Projects 1m
    • 14:15 14:25
      Antares Upgrade 10m

      New EOS nodes
      Repack Progress

      Speakers: George Patargias, Thomas Byrne
    • 14:25 14:35
      XRootD Development 10m
      Speakers: Alexander Rogovskiy (Rutherford Appleton Laboratory), Jyothish Thomas (STFC)

      Error 500 issue narrowed down to Glasgow IPv6 routing issues. scitag forced the transfers to IPv6, which is why they were failing. Work is in progress on the Glasgow side to resolve this. lcgfts3 has been configured to disable IPv6 for that site and could be used as an emergency measure if needed.

      XRootD 5.8.4 is out; it contains no significant improvements that would require an urgent update at RAL.

    • 14:35 14:45
      Utilizing GPUs 10m
      Speakers: Jyoti Prakash Biswal (Rutherford Appleton Laboratory), Thomas Birkett
    • 14:45 14:50
      SSD Storage Evaluation 5m
    • 14:50 14:55
      Echo deployment 5m
    • 15:00 15:01
      AOB 1m
    • 15:01 15:10
      Summary of Operational Status and Issues 9m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore
    • 15:10 15:15
      Any other Business 5m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore