RAL Tier1 Experiments Liaison Meeting

Europe/London
Access Grid (RAL R89)

Zoom Meeting ID
66811541532
Host
Alastair Dewhurst
    • 14:00 14:01
      Major Incidents Changes 1m
    • 14:05 14:06
      Summary of Operational Status and Issues 1m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore, Kieran Howlett (STFC RAL)
    • 14:10 14:11
      Experiment Operational Issues 1m
    • 14:15 14:16
      VO Liaison CMS 1m
      Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))

      As mentioned last week, the cap on CMS running cores was removed. In the last few days CMS has taken over the farm! However, for the last day or more farm utilisation has been below 100%, so I guess this is exactly how it should work. Other VOs should send us more work! ATLAS appears to have pending jobs, but its running cores are still low.

      As for the jobs that are running, performance is very bad (failures and efficiency). Some of the failures may be coming from the CMS side. However, the LHCOPN 100G link is currently saturated, and I think this is the cause of the poor efficiency.

      The new SAM tests for the WebDAV protocol on Antares, added following the introduction of the Tape REST API, were failing. For some reason the WebDAV test suite differs from the root test suite, which of course has been green for some time. The new list of tests included a PROPFIND test on the /store/ directory, and this was the failing one. George added the DN to give read permission, and the test has been green since Saturday.

      We are also following up on the tape load tests, which have been failing for many months, probably since they started.

      A couple of issues with Echo storage relating to the new version of XRootD: Jyothish rolled back and applied some other fixes. See also Alex's issues this week.

      Katy created docs for installation and usage of Shoveler. It's been running in test for 2 years! I am hoping that Jyothish(?) will find time to bring this into production soon. Katy is planning to validate the numbers for CMS and, in collaboration with other VOs, write a CHEP paper on this.
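
      For reference, the kind of request the failing test issues can be sketched as below. This is only an illustration, not the SAM test itself: the host and port are placeholders (the real Antares endpoint is not given in these notes), and the request is built but deliberately not sent.

```python
import urllib.request

# Placeholder endpoint: the real Antares WebDAV host and port are assumptions.
url = "https://webdav.example.org:1094/store/"

# PROPFIND is the WebDAV method used to read the properties of a resource.
request = urllib.request.Request(
    url,
    method="PROPFIND",
    headers={"Depth": "0"},  # Depth: 0 asks about the directory itself only
)

# Sending is left out on purpose. A real probe would also have to present
# the test's credentials, which is presumably where the DN permission applies.
print(request.get_method(), request.full_url)
```

      A server that does not grant read permission on the directory would be expected to reject such a request, which is consistent with the test turning green once the DN was added.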

      (CHEP abstract deadline is this Friday! ...but will be extended by one week.)

       

    • 14:20 14:21
      VO-Liaison ATLAS 1m
      Speakers: Dr Brij Kishor Jashal (RAL, TIFR and IFIC), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
    • 14:25 14:26
      VO Liaison Others 1m
    • 14:30 14:31
      VO Liaison LHCb 1m
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)

      LHCb:

      • Checksum setup was incorrect on the preprod farm last week.
        • As a result, all downloads were failing
      • Issues with CNAF-RAL transfers
        • CNAF storage uses the key "authentication" in its HTTP headers; our XRootD version only accepts "Authentication" (case-sensitive)
        • An XRootD upgrade was attempted yesterday but caused even more problems (though the issue with the keys was solved)
        • Rolled back to the old version
      • After the prefetch change roll-out, some WGprod jobs are still causing the XRootD proxy to run out of memory
        • The scale of the problem is not clear yet
      • Copying to RAL from RRCKI is finished
        • Some final cleanup may be necessary
      • Lists of files affected by xrootd bug are ready
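
      The CNAF-RAL header problem above is a case-sensitivity mismatch. HTTP header names are case-insensitive by specification, so a lenient lookup avoids it. The sketch below is a generic illustration, not XRootD's actual code; the function name and the sample headers are hypothetical.

```python
def find_header(headers, name):
    """Return the value of the first header matching `name`, ignoring case."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return None


# Hypothetical headers, as a client that sends lowercase names might emit them.
incoming = {"authentication": "example-token", "Content-Length": "0"}

strict = incoming.get("Authentication")             # case-sensitive: key is missed
lenient = find_header(incoming, "Authentication")   # case-insensitive: key is found
```

      The strict lookup returns nothing for the lowercase key, while the lenient one finds the value whichever way the sender capitalises the name.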

       

      ALICE:

      • Ticket CS-147 needs a response
    • 14:35 14:38
      VO Liaison LSST 3m
      Speaker: Timothy John Noble (Science and Technology Facilities Council STFC (GB))

      Data still ingesting from Lancaster: 15267 datasets ingested out of 19852

          Just ran into a broken pipe issue, so will need to modify the ingest code to skip datasets already ingested, as the butler ingest-raw command does not filter out those already done

              Should be re-ingesting again this afternoon, with completion this Friday if the issues do not persist.

         Once the RAWs are ingested, they can be configured into a collection in the Butler and the DC2 tests can be completed at RAL
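
      The skip-already-ingested change mentioned above could look roughly like this. It is a minimal sketch that does not call the Butler at all: it assumes the set of already ingested datasets can be obtained as a list of names, and every function and file name here is hypothetical.

```python
def pending_datasets(all_datasets, already_ingested):
    """Return the datasets still to ingest, preserving the original order."""
    done = set(already_ingested)  # set membership keeps the filter fast
    return [name for name in all_datasets if name not in done]


def progress_percent(done, total):
    """Percentage of datasets ingested so far."""
    return 100.0 * done / total


# Hypothetical file names standing in for the real raw datasets.
all_datasets = [f"raw_{i:05d}.fits" for i in range(10)]
already = all_datasets[:7]

todo = pending_datasets(all_datasets, already)
remaining = 19852 - 15267                    # counts quoted in these notes
percent = progress_percent(15267, 19852)
```

      With the counts quoted above, 4585 datasets remain and the ingest is about 76.9% complete.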

       

      Starting movement of raw data to RAL using LSST Rucio

       

       

    • 14:40 14:41
      VO Liaison APEL 1m
      Speaker: Thomas Dack
    • 14:45 14:46
      AOB 1m
    • 14:50 14:51
      Any other Business 1m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore