RAL Tier1 Experiments Liaison Meeting

Europe/London
Access Grid (RAL R89)

Zoom Meeting ID: 66811541532
Host: Alastair Dewhurst
    • 13:00 13:01
      Major Incidents Changes 1m
    • 13:01 13:02
      Summary of Operational Status and Issues 1m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore (Science and Technology Facilities Council STFC (GB)), Kieran Howlett (STFC RAL)
    • 13:02 13:03
      GGUS/RT Tickets 1m

      https://tinyurl.com/T1-GGUS-Open
      https://tinyurl.com/T1-GGUS-Closed

    • 13:04 13:05
      Site Availability 1m

      https://lcgwww.gridpp.rl.ac.uk/utils/availchart/

      https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T1_UK_RAL

      http://hammercloud.cern.ch/hc/app/atlas/siteoverview/?site=RAL-LCG2&startTime=2020-01-29&endTime=2020-02-06&templateType=isGolden

    • 13:05 13:06
      Experiment Operational Issues 1m
    • 13:15 13:16
      VO Liaison CMS 1m
      Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))

      Katy is on holiday.

      SAM tests are failing on the gsiftp tests, which is very annoying because we no longer use gsiftp in production. Katy was planning to remove the test, but it proved useful during periods of DNS failure. Now the test is failing intermittently on the 'write' step. The error is:

      Error in transfer, clean destination file gsiftp://gridftp.echo.stfc.ac.uk:2811/cms:/store/unmerged/SAM/se_gsiftp_20230321_145709_etf-28.cpy
      Copy check of se_gsiftp_20230321_145709_etf-28.wrt at gridftp.echo.stfc.ac.uk:2811 failed: TRANSFER globus_ftp_client: the server responded with an error 500 500-Command failed. : globus_xio: Unable to connect to 130.246.176.246:52791 500-globus_xio: System error in connect: No route to host 500-globus_xio: A system call failed: No route to host 500 End.

      Is this DNS? Does anyone want to investigate? CMS is currently in drain because of this, and it's tempting to 'fix' the problem by removing the gsiftp endpoint for CMS.
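
      As a starting point for that investigation, here is a minimal sketch (an assumption, not an existing Tier1 tool) that separates a name-resolution failure from a connect-time failure such as "No route to host". The function names `classify_error` and `probe` are hypothetical. Note that the logged error already quotes an IP address rather than a hostname, which may suggest the failure happened at connect time rather than during DNS lookup, but that remains to be confirmed.

```python
import errno
import socket


def classify_error(exc):
    """Map a socket exception to a coarse failure category."""
    if isinstance(exc, socket.gaierror):
        return "dns_failure"   # the name could not be resolved
    if isinstance(exc, socket.timeout):
        return "timeout"       # no answer, e.g. packets silently dropped
    if isinstance(exc, ConnectionRefusedError):
        return "refused"       # host reachable, nothing listening on the port
    if isinstance(exc, OSError) and exc.errno in (
        errno.EHOSTUNREACH,
        errno.ENETUNREACH,
    ):
        return "no_route"      # routing or firewall reject, not DNS
    return "other"


def probe(host, port, timeout=5.0):
    """Resolve host, then try a TCP connect; return (category, address)."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return classify_error(exc), None
    # Only the first resolved address is tried in this sketch.
    family, socktype, proto, _, sockaddr = infos[0]
    try:
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
        return "ok", sockaddr[0]
    except OSError as exc:
        return classify_error(exc), sockaddr[0]


if __name__ == "__main__":
    # Control-channel endpoint taken from the SAM error above.
    print(probe("gridftp.echo.stfc.ac.uk", 2811))
```

      The sketch only probes the control port; the port quoted in the error (52791) looks like a dynamically assigned data-channel port, so probing it after the fact is unlikely to be meaningful. Running the probe repeatedly from the affected host would show whether the failure is intermittent at the TCP level.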

      New Antares tests: George found a way to keep the 3 'read test' files permanently on buffer; the 'open access' test still needs to be made green.

      New token tests (do not contribute to SAM status yet): JW made them pass on ceph-dev-gw4, which is very good news.

      The 20/21 tranches of WNs are planned to move to LHCONE today.

    • 13:16 13:17
      VO-Liaison ATLAS 1m
      Speakers: James William Walder (Science and Technology Facilities Council STFC (GB)), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
    • 13:20 13:21
      VO Liaison LHCb 1m
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
      • GGUS tickets
        • Ticket about low number of jobs
          • Due to batch farm drain
          • Would it be possible to have a warning DT for such operations next time?
        • Slow stat/checksum calls
          • A patch was applied on the test GW and is being tested
        • Vector read
          • Tests are still successful: no vector read failures so far
          • We have ~1100 successful user jobs executed on test WNs
          • Code review is ongoing
      • Operational issues
        • There were a lot of failed uploads yesterday evening; this seems to be due to gateway issues
        • Corrupted file(s) were found
    • 13:25 13:28
      VO Liaison LSST 3m
      Speaker: Timothy John Noble (Science and Technology Facilities Council STFC (GB))
    • 13:30 13:31
      VO Liaison Others 1m
    • 13:31 13:32
      AOB 1m
    • 13:32 13:33
      Any other Business 1m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore (Science and Technology Facilities Council STFC (GB))