RAL Tier1 Experiments Liaison Meeting

Europe/London
Access Grid (RAL R89)

Zoom Meeting ID
66811541532
Host
Alastair Dewhurst
    • 13:30 13:31
      Experiment Operational Issues 1m
    • 13:35 13:45
      VO-Liaison ATLAS 10m
      Speakers: Brij Kishor Jashal (TIFR, RAL, IFIC), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
    • 13:45 13:55
      VO Liaison CMS 10m
      Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))

      Reported on access to Premix library files from Echo (the files are present because of the multihop to Antares).

      Discussion started on whether CMS wants to use the RAL-FTS again (relating to the request from CMS to upgrade RAL-FTS for tokens). Current thinking is to create a more coherent strategy in which CERN and FNAL act as backup for each other, without necessarily using RAL.

      SAM tests for CE/tokens are still not appearing consistently. This is believed to still be a problem on the test side. Tom?

      Job performance similar to other Tier 1s.

      Mini-DC for UK Tier 2s is running all this week. Writes to Tier 2 sites at DC24 levels worked well. Pushing up the rates to Brunel shows a limitation somewhere.

    • 13:55 14:05
      VO Liaison LHCb 10m
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)

      News: 

      • Heavy Ion data has been distributed; RAL received 750 TB out of ~2.3 PB
      • That brings RAL tape usage to ~90% (for the extended allocation).



      Operational issues:

      • Upload failures on RAL (WNs) -> CERN channel
        • no updates
      • Issues on 4 WNs last Sunday
        • 4 WNs became problematic; transfers to/from these WNs were failing
          • Possibly some limit was exceeded
          • Difficult to investigate without reproducing the issue
          • It has not recurred (yet) since its first occurrence.
    • 14:10 14:20
      VO Liaison ALICE 10m
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
    • 14:20 14:30
      VO Liaison LSST 10m
      Speaker: Timothy John Noble (Science and Technology Facilities Council STFC (GB))
      • Merry Christmas! (My and Mat's last liaison meeting this year)
      • gsissh access to the LSST VOBox has now been confirmed by Peter Love 
      • Message topic created for RAL; it is now being listened to for the data being sent to RAL for ingestion
      • Migration of the LSST Butler Database from the VOBox to a Database Service database: data seems to be intact; however, there appear to be some connection issues between the BatchFarm (BF) and the database:
        psycopg2.OperationalError: connection to server at "dbspgha03.fds.rl.ac.uk" (130.246.184.29), port 5432 failed: could not receive data from server: Connection timed out

        These timeout errors (with quite a long timeout and multiple retries) are holding jobs open for long periods, causing a pile-up of the LSST test jobs.
        https://vande.gridpp.rl.ac.uk/next/d/-F03nhnMk/batch-single-vo?orgId=1&var-VO=lsst&var-rp=1_month&var-prefix=mean_&from=now-24h&to=now&refresh=15m
      • Connectivity between the BF and the DB has been confirmed from the nodes to the DB through various methods (ping, nc); further investigation is ongoing.
      • In January we will start creating an LSST testing suite for our monitoring of the job submission pipelines
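
A minimal sketch of how the database timeouts reported above might be probed and bounded from a worker node. It is not the configuration in use at RAL. It assumes the jobs connect through psycopg2, as the traceback suggests. The function names, the timeout values and the `dbname`/`user` values in the usage comment are hypothetical. The connection keywords are standard libpq parameters, which `psycopg2.connect` accepts as keyword arguments.

```python
import socket


def tcp_probe(host: str, port: int, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection opens within timeout_s (similar to `nc -z`)."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def fail_fast_conn_params(host: str, dbname: str, user: str, port: int = 5432) -> dict:
    """Build libpq connection keywords that bound how long a job can wait.

    The numeric values are illustrative assumptions, not tuned settings.
    """
    return {
        "host": host,
        "port": port,
        "dbname": dbname,
        "user": user,
        "connect_timeout": 10,      # seconds allowed for establishing the connection
        "keepalives": 1,            # enable TCP keepalives on the client side
        "keepalives_idle": 30,      # idle seconds before the first keepalive probe
        "keepalives_interval": 10,  # seconds between keepalive probes
        "keepalives_count": 3,      # unanswered probes before the link is declared dead
    }


# Hypothetical usage on a worker node (requires psycopg2; not executed here):
#   import psycopg2
#   if tcp_probe("dbspgha03.fds.rl.ac.uk", 5432):
#       params = fail_fast_conn_params("dbspgha03.fds.rl.ac.uk", "butler", "lsst")
#       conn = psycopg2.connect(**params)
```

A successful probe only shows that the port accepts new TCP connections, like the ping and nc checks already done. The reported error is raised while waiting for data from the server, so bounding that wait may at least stop jobs from being held open for long periods while the root cause is investigated.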
    • 14:30 14:40
      VO Liaison APEL 10m
      Speaker: Thomas Dack
    • 14:45 14:55
      WP-D - GPU, Data Management, Other 10m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore
    • 15:00 15:01
      Major Incidents Changes 1m
    • 15:05 15:15
      Summary of Operational Status and Issues 10m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore
    • 15:20 15:21
      AOB 1m
    • 15:22 15:32
      Any other Business 10m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore