RAL Tier1 Experiments Liaison Meeting
Access Grid
RAL R89
-
-
14:00
Major Incidents Changes
-
1
Summary of Operational Status and Issues
Speakers: Brian Davies (Lancaster University (GB)), Darren Moore, Kieran Howlett (STFC RAL)
-
14:10
Experiment Operational Issues
-
2
VO Liaison CMS
Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))
A smooth week for CMS, with good performance and green tests.
A small number of repeatedly failing production tape transfers (file not on Echo) - Katy to investigate.
There is a new AAA proxy machine - Katy to add it to monitoring in Vande (and to Shoveler once we have the production instance).
Jyothish did upgrades for the old AAA machines ceph-gw10/11.
Asked George to remove the dependence of the cms-rucio-services machine on the VOMS infrastructure. CMS wish to remove this (ATLAS did so this week), at the very latest by the end of the month. This machine also needs upgrades and a reboot.
Any schedule yet for tokens on the batch farm?
Tentative plans for UK transfer tests (at DC24 levels?) after data-taking, in November or December. Perhaps additional tape tests in spring, before data-taking in 2025, if that is when Antares gets connected to the OPN.
DC24 report is here: https://zenodo.org/records/11401878
-
3
VO Liaison ATLAS
Speakers: Dr Brij Kishor Jashal (RAL, TIFR and IFIC), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
-
4
VO Liaison Others
-
5
VO Liaison LHCb
Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
-
6
VO Liaison LSST
Speaker: Timothy John Noble (Science and Technology Facilities Council STFC (GB))
Setting, in /etc/lsst/prolog.sh,
export LSST_RUN_TEMP_SPACE="/tmp/lsst/sandbox"
will not work for Rubin jobs. For Rubin jobs, LSST_RUN_TEMP_SPACE is a space shared between jobs on different worker nodes. Rubin jobs differ from ATLAS jobs, where every single job is independent; Rubin jobs are not independent. The first Rubin job in a workflow creates a directory and writes some information to LSST_RUN_TEMP_SPACE, then the other jobs read and write information in LSST_RUN_TEMP_SPACE. At the end, the final job reads all the information in the workflow directory in LSST_RUN_TEMP_SPACE. SLAC has a shared POSIX file system mounted on all nodes.
Rubin can support different storage backends; S3 is supported, so the S3 interface to Echo could be used.
Fabio Hernandez (1:50 PM): At our site, our Slurm compute nodes have some local storage capacity. We use that capacity as working storage for jobs; once a job finishes, the data in that area is deleted. There is another area that is needed by PanDA, and I think that is what you refer to. That area stores some information needed by a campaign to update the Butler registry database with the data produced and stored by the jobs in the Butler data store. The data stored in that area is relatively small, but it does need to be available to the so-called "Final Job" to do its data registration work. In our case, that area resides in CephFS and is mounted by all the Slurm compute nodes. I think this is described in this document: https://panda.lsst.io/admin/site_environments.html. The relevant piece is the area named LSST_RUN_TEMP_SPACE.
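A minimal sketch of a Rubin-compatible prolog, assuming a shared POSIX file system (e.g. a CephFS mount) is visible at the same path from every worker node; the mount point /mnt/shared-fs below is an illustrative placeholder, not an agreed RAL path:
# /etc/lsst/prolog.sh (sketch only; /mnt/shared-fs is a hypothetical shared mount)
# LSST_RUN_TEMP_SPACE must resolve to the same shared storage on every worker
# node, so that later jobs in a workflow (including the final job) can read
# what earlier jobs wrote there.
export LSST_RUN_TEMP_SPACE="/mnt/shared-fs/lsst/run_temp"
mkdir -p "$LSST_RUN_TEMP_SPACE"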
-
7
VO Liaison APEL
Speaker: Thomas Dack
-
14:45
AOB
-
8
Any other Business
Speakers: Brian Davies (Lancaster University (GB)), Darren Moore
-
14:00