RAL Tier1 Experiments Liaison Meeting

Europe/London
Access Grid (RAL R89)

Zoom Meeting ID
66811541532
Host
Alastair Dewhurst
    • 13:30 13:31
      Experiment Operational Issues 1m
    • 13:35 13:45
      VO-Liaison ATLAS 10m
      Speakers: Brij Kishor Jashal (TIFR, RAL, IFIC), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
    • 13:45 13:55
      VO Liaison CMS 10m
      Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))

      Katy missed the last 3 weeks (O&C week, CHEP, and sick leave).

      SAM tests look good, other than the previously noted occasional irregular failure on the CE tests. AAA saw some failures in the middle of last week, probably related to reboots made a few days before, which is typical behaviour. Restarting services in various orders appeared to fix it.

      Job performance: A week ago, low-performance jobs were running at many sites; these were spotted and supposedly fixed. In the last 24 hours I again see low CPU efficiency at several T1s. Many T1s appear to be running large numbers of Analysis jobs, which are pulling the CPU efficiency down.
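
      As a rough illustration of how a large share of Analysis jobs can lower a site's aggregate figure, the sketch below computes CPU efficiency as CPU time divided by (wall time × allocated cores). The job records, field names, and numbers are hypothetical and are not taken from CMS monitoring.

```python
def cpu_efficiency(cpu_seconds, wall_seconds, cores=1):
    """CPU efficiency of one job: CPU time used / slot time allocated."""
    if wall_seconds <= 0 or cores <= 0:
        raise ValueError("wall_seconds and cores must be positive")
    return cpu_seconds / (wall_seconds * cores)


def site_efficiency(jobs):
    """Aggregate efficiency over many jobs, weighted by allocated slot time."""
    total_cpu = sum(j["cpu_seconds"] for j in jobs)
    total_slot = sum(j["wall_seconds"] * j["cores"] for j in jobs)
    return total_cpu / total_slot if total_slot else 0.0


# Hypothetical mix: one efficient 8-core production job and
# one single-core analysis job that mostly waits on I/O.
jobs = [
    {"type": "production", "cpu_seconds": 25920, "wall_seconds": 3600, "cores": 8},
    {"type": "analysis", "cpu_seconds": 1080, "wall_seconds": 3600, "cores": 1},
]

for job in jobs:
    eff = cpu_efficiency(job["cpu_seconds"], job["wall_seconds"], job["cores"])
    print(f"{job['type']}: {eff:.2f}")
print(f"site aggregate: {site_efficiency(jobs):.2f}")
```

      Because the aggregate is weighted by slot time, the more slots are occupied by low-efficiency jobs, the further the site figure drops.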

      Mini-UK data challenge part 1 planned for week of 9th Dec (at least ATLAS and CMS). Intention is to mostly test Tier2s. Tier 1 will likely be used as a source, but the idea is not to push Echo significantly at this point. 

    • 13:55 14:05
      VO Liaison LHCb 10m
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
    • 14:10 14:20
      VO Liaison ALICE 10m
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
    • 14:20 14:30
      VO Liaison LSST 10m
      Speaker: Timothy John Noble (Science and Technology Facilities Council STFC (GB))

      LSST limit currently set at 10. Checked with Wen Guan: they are now testing, on the Dev PanDA, the functionality of not submitting pilots when there is nothing in the queue. Once this is deployed on prod, we can change the limit and track behaviour.
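
      The intended behaviour (no pilots when the queue is empty, otherwise capped by the site limit) can be sketched as follows. This is only an illustration under assumed inputs; the function and parameter names are hypothetical and do not reflect the actual PanDA implementation.

```python
def pilots_to_submit(queued_jobs, running_pilots, limit=10):
    """Return how many new pilots to submit in this cycle.

    queued_jobs    -- jobs waiting in the queue (assumed input)
    running_pilots -- pilots already submitted or running at the site
    limit          -- site-wide pilot cap (10 is the current LSST setting)
    """
    if queued_jobs <= 0:
        # Nothing waiting: do not submit any pilots.
        return 0
    headroom = max(limit - running_pilots, 0)
    # Never submit more pilots than there are queued jobs or free headroom.
    return min(queued_jobs, headroom)


print(pilots_to_submit(queued_jobs=0, running_pilots=3))
print(pilots_to_submit(queued_jobs=25, running_pilots=3))
print(pilots_to_submit(queued_jobs=2, running_pilots=3))
```

      With this logic in place, raising the limit only changes the cap; an empty queue still results in zero submissions.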

       

      Finalising the messaging stack to enable RAL to be integrated into the site tests.

      A ticket is in with Grid Services to add Peter Love and Jennifer Adelman-McCarthy (CMS / LSST) to the UIs so that they can access the LSST VO box. The ticket has not yet been addressed, as it has bounced between Jose, Tom, and James.

      Talking to the DB team on 15th Nov about the scale and needs of the LSST Data Butler (10-100 TB).

    • 14:30 14:40
      VO Liaison APEL 10m
      Speaker: Thomas Dack
    • 14:45 14:55
      WP-D - GPU, Data Management, Other 10m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore
    • 15:00 15:01
      Major Incidents Changes 1m
    • 15:05 15:15
      Summary of Operational Status and Issues 10m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore

      Callouts over weekend:

      No call-outs over the weekend

      Callouts:

      There was one Echo call-out on 28/10/24: echo-manager01.gridpp.rl.ac.uk/ha-check_ceph_xrootd_webdav_service_callout. It resulted from a load spike of FTS requests from ATLAS.

      Antares

      Nothing of note to report.

      Echo

      The Echo team are continuing with their Nautilus → Pacific upgrade. To date, no issues have been reported. Details of this intervention can be found at https://stfc.atlassian.net/wiki/pages/resumedraft.action?draftId=605061121

      The tracking ticket for this intervention is here:

      Batch Farm

      Nothing of note to report

      General 

      The UK ARGUS service supported at the Tier1 at RAL will be decommissioned on 2nd December 2024. This follows CERN’s notification, back in July, of the deprecation of their instance (https://twiki.cern.ch/twiki/bin/view/LCG/WLCGOpsMeetingWeek240729).

      After this date, the ARGUS service hosted at argusngi.gridpp.rl.ac.uk will be unreachable.

      If sites are using their ARGUS instances for user mappings, we recommend migrating to the native solutions in NorduGrid ARC-CE, HTCondor-CE or similar.
      Please direct any questions or concerns to the RAL Tier1 Grid Services team.

      GGUS 

      6 of 6 Tickets
      Ticket-ID  Type  VO    Site      Priority     Resp. Unit       Status       Last Update  Subject                                      Scope
      168905           lhcb  RAL-LCG2  less urgent  NGI_UK           in progress  2024-11-06   Access to the vobox                          WLCG
      168888           cms   RAL-LCG2  urgent       NGI_UK involved  on hold      2024-11-05   Request for Dual Stack Support on ...        WLCG
      167369           cms   RAL-LCG2  urgent       NGI_UK involved  in progress  2024-10-14   CVMFS stratum-1 server at RAL                WLCG
      165969           cms   RAL-LCG2  urgent       NGI_UK           on hold      2024-10-14   enabling IAM token access to ARC-CEs at ...  WLCG
      165818           none  RAL-LCG2  urgent       NGI_UK           in progress  2024-09-02   ATTENTION - new TOKEN configuration for ...  EGI
      165323           none  RAL-LCG2  less urgent  NGI_UK           on hold      2024-11-04   RAL-LCG2: Enabling tokens on ...             EGI

    • 15:20 15:21
      AOB 1m
    • 15:22 15:32
      Any other Business 10m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore