RAL Tier1 Experiments Liaison Meeting

Europe/London
Access Grid (RAL R89)

Zoom Meeting ID: 66811541532
Host: Alastair Dewhurst
    • 13:30 13:31
      Experiment Operational Issues 1m
    • 13:35 13:40
      ATLAS Operations Report 5m
      Speakers: Brij Kishor Jashal (Rutherford Appleton Laboratory), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
    • 13:40 13:45
      CMS Operations Report 5m
      Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))

      Various network issues occurred over the weekend and are continuing today. DI are addressing them with updates this week. All SAM test failures are due to these issues.

      Job performance has been fine. 

      We requested that CMS remove RAL FTS entirely (it was serving a few small T3s and a handful of other sites, mostly as a secondary FTS choice). The reason is that our FTS is not on LHCONE and therefore cannot contact CERN CTA tape.

    • 13:45 13:50
      LHCb Operations Report 5m
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)

      Operational issues:

      • Network outages (GGUS 683080, closed)
        • Affected LHCb on the 19th, 21st and last night
        • Some interventions are planned for today as well.
      • Some issues with the new LHCbDIRAC are still present and affect production as well (mostly jobs; see the attached plots).


      Off-topic: how should an alarm be raised outside office hours? Does Opsgenie still work as expected?

    • 13:50 13:55
      ALICE Operations Report 5m
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
    • 13:55 14:00
      LSST Operations Report 5m
      Speakers: Mathew Sims, Timothy Noble (Science and Technology Facilities Council STFC (GB))

      LSST jobs were not running well at RAL, specifically jobs that create and read the Quantum Graph (QG, the pipeline instructions). This prompted a review of the code on the LSST side, which found that, on a local file system, the read was accessing the QG over 100,000 times during a job.

      This has now been patched on the LSST side and was due to be tested last night.

      LSST jobs have dropped off several times over the last week because of job cancellations. The most recent drop began at 16:15 yesterday; jobs started again at 03:45, but the issue was not fully resolved until around 10:30 this morning.

      Final jobs at RAL are currently not working, but I have organised a meeting with the CM team to resolve the issues and understand their FinalStep process.

    • 14:00 14:01
      Tier-1 Projects 1m
    • 14:15 14:25
      Antares Upgrade 10m

      New EOS nodes
      Tape Robotics downtime

      Speakers: George Patargias, Thomas Byrne
    • 14:25 14:35
      XRootD Development 10m
      Speakers: Alexander Rogovskiy (Rutherford Appleton Laboratory), Jyothish Thomas (STFC)

      The mainline (xrootd.echo.stfc.ac.uk) gateways now have SciTag packet marking enabled.

      Network issues affected traffic across all 3 days.

    • 14:35 14:45
      Varnish For ATLAS 10m
      Speakers: Alastair Dewhurst (Science and Technology Facilities Council STFC (GB)), Brij Kishor Jashal (Rutherford Appleton Laboratory)
    • 14:45 14:46
      AOB 1m
    • 14:46 14:55
      Summary of Operational Status and Issues 9m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore
    • 14:55 15:00
      Any other Business 5m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore