RAL Tier1 Experiments Liaison Meeting

Europe/London
Access Grid (RAL R89)

Zoom Meeting ID
66811541532
Host
Alastair Dewhurst
    • 14:00 → 14:01
      Major Incidents Changes 1m
    • 14:05 → 14:06
      Summary of Operational Status and Issues 1m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore, Kieran Howlett (STFC RAL)
    • 14:10 → 14:11
      Experiment Operational Issues 1m
    • 14:15 → 14:16
      VO Liaison CMS 1m
      Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))

      A smooth week for CMS, good performance and green tests. 

      A small number of repeatedly failing production tape transfers (file not on Echo) - Katy to investigate.

      There is a new AAA proxy machine - Katy to add to monitoring in Vande (also Shoveler once we have the production instance).

      Jyothish carried out upgrades on the old AAA machines, ceph-gw10/11.

      Asked George to remove the dependence of the cms-rucio-services machine on VOMS infrastructure. CMS wish to remove this dependence (ATLAS did so this week), at the very latest by the end of the month. This machine also needs upgrades and a reboot.

      Any schedule yet for tokens on the batch farm?

      Tentative plans for UK transfer tests (at DC24 levels?) after data-taking ends, in November or December. Perhaps additional tape tests in spring, before 2025 data-taking, if that is when Antares gets connected to the OPN.

      DC24 report is here: https://zenodo.org/records/11401878

       

    • 14:20 → 14:21
      VO-Liaison ATLAS 1m
      Speakers: Dr Brij Kishor Jashal (RAL, TIFR and IFIC), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
    • 14:25 → 14:26
      VO Liaison Others 1m
    • 14:30 → 14:31
      VO Liaison LHCb 1m
      Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
    • 14:35 → 14:38
      VO Liaison LSST 3m
      Speaker: Timothy John Noble (Science and Technology Facilities Council STFC (GB))

      In /etc/lsst/prolog.sh, LSST_RUN_TEMP_SPACE is exported as "/tmp/lsst/sandbox". This will not work for Rubin jobs: for Rubin, LSST_RUN_TEMP_SPACE must be a space shared by jobs running on different worker nodes. Unlike ATLAS jobs, where every single job is independent, Rubin jobs are not independent. The first Rubin job in a workflow creates a directory and writes some information to LSST_RUN_TEMP_SPACE, other jobs then read and write information there, and at the end the final job reads all the information in the workflow directory in LSST_RUN_TEMP_SPACE.
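
      As an illustration (not the actual RAL configuration), a minimal prolog.sh sketch with LSST_RUN_TEMP_SPACE pointing at shared storage instead of node-local /tmp; the mount path below is a hypothetical placeholder:

      # /etc/lsst/prolog.sh (sketch only; the shared mount path is a placeholder)
      # Rubin workflows need LSST_RUN_TEMP_SPACE to be visible from every worker node,
      # so it must sit on shared storage (e.g. a CephFS mount), not node-local /tmp.
      export LSST_RUN_TEMP_SPACE="/cephfs/lsst/run_temp"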

       

      SLAC has a shared POSIX file system mounted on all nodes.
      Rubin can support different storage back ends; S3 is supported, so the Echo S3 gateway (s3.echo) could therefore be used.
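
      As a rough sketch of the S3 route (the endpoint URL and bucket name below are placeholders, not confirmed RAL values), standard S3 tooling pointed at the Echo gateway could be used to check access before wiring it into Rubin:

      # Assumes S3 credentials for Echo are already configured for the AWS CLI;
      # the endpoint URL and bucket name are illustrative placeholders.
      aws s3 ls --endpoint-url https://s3.echo.stfc.ac.uk s3://rubin-test/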

       

      Fabio Hernandez  1:50 PM
      At our site, our Slurm compute nodes have some local storage capacity. We use that capacity for jobs to use as working storage; once the job finishes, the data in that area is deleted. There is another area that is needed by PanDA, and I think that is what you refer to. That area stores some information needed by a campaign to update the Butler registry database with the data produced and stored by the jobs in the Butler data store. The data stored in that area is relatively small, but it does need to be available to the so-called "Final Job" to do its data registration work. In our case, that area resides in CephFS and is mounted by all the Slurm compute nodes. It seems to me that this is described in some document. Let me try to find out where.
       
      Fabio Hernandez
      I think this is the document: https://panda.lsst.io/admin/site_environments.html
      The relevant piece is the area named LSST_RUN_TEMP_SPACE.
    • 14:40 → 14:41
      VO Liaison APEL 1m
      Speaker: Thomas Dack
    • 14:45 → 14:46
      AOB 1m
    • 14:50 → 14:51
      Any other Business 1m
      Speakers: Brian Davies (Lancaster University (GB)), Darren Moore