RAL Tier1 Experiments Liaison Meeting
Access Grid
RAL R89
-
-
1
Site Operations
-
13:34
Experiment Operational Issues
-
2
ATLAS Operations Report
Speakers: Dr Brij Kishor Jashal (Rutherford Appleton Laboratory), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
-
3
CMS Operations Report
Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))
Production jobs and SAM tests: BAU. Nice and quiet over the Easter weekend.
Just a couple of periods of SAM test failures on the AAA system - general improvements for this service still to do.
Issue with /store/unmerged/ on Echo not being fully cleaned up: CMS keeps files awaiting Merge jobs in this 'directory'. These files are not managed by Rucio. We run Cleanup jobs, which delete unmerged files and typically work well at RAL. However, some files remain, and CMS has a mechanism to delete these after a certain period, once it has checked that they are no longer needed. This mechanism relies on ls of directories, which does not work on Echo. The test is always green though! Files have built up over the years; we can delete the majority of them. We are considering a long-term solution.
DC27: 50% of HL-LHC challenge. Proposed for last week of Feb and first week of March.
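Regarding the /store/unmerged/ cleanup above: since ls of directories does not work on Echo, one conceivable approach is to select candidates from a storage dump instead. The following is only a minimal sketch under stated assumptions, not the CMS mechanism and not a decided solution. The dump format (path plus modification time), the 14-day retention window, and the `protected` prefix list are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

UNMERGED_PREFIX = "/store/unmerged/"


def stale_unmerged(entries, now, max_age=timedelta(days=14), protected=()):
    """Return sorted unmerged paths older than max_age and not protected.

    entries   -- iterable of (path, mtime) pairs taken from a storage dump
                 (hypothetical format)
    now       -- reference time, timezone-aware
    max_age   -- hypothetical retention window
    protected -- hypothetical path prefixes that are still needed
    """
    cutoff = now - max_age
    protected = tuple(protected)
    return sorted(
        path
        for path, mtime in entries
        if path.startswith(UNMERGED_PREFIX)
        and mtime < cutoff
        and not path.startswith(protected)
    )


if __name__ == "__main__":
    now = datetime(2025, 4, 23, tzinfo=timezone.utc)
    dump = [
        ("/store/unmerged/old.root", now - timedelta(days=30)),
        ("/store/unmerged/new.root", now - timedelta(days=1)),
    ]
    for path in stale_unmerged(dump, now):
        print(path)
```

The function only proposes candidates; any actual deletion would still need the "no longer needed" check that CMS applies.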
-
4
LHCb Operations Report
Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
- Quiet Easter break, no noticeable issues @RAL.
- Corrupted Echo files found (GGUS:1002197)
- Affected files re-replicated from other sites (where possible)
- Corruption reason for most of the user files (~270) understood -- user uploading std.out of the jobs
- Resource confirmation request (GGUS:1002186)
- Any news on the SRR update?
-
5
ALICE Operations Report
Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
-
6
LSST Operations Report
Speakers: Thomas Birkett, Timothy Noble (Science and Technology Facilities Council STFC (GB))
- Mapping of LSST files in Echo using the dumps: LSST has left behind a lot of files not tracked by Rucio. We are producing file lists so that we can check when files are no longer needed and delete them
- Modified repo (hsc_pdr2_multisite) from US side - IngestD did not ingest the changes
- Investigation led to an x509 issue
- Cloud issue over the weekend delayed investigation
- Issue with moving x509 into the pod - copied with the correct permissions
- Issue corrected and should not recur
- After ingestion of the data new jobs:

- 20 TB of RAW data moved to RAL from IN2P3

- LSST running data movement tests - RAL showing nearly 10 GB/s for the test, twice the mean speed of LANCS
- A tenth of the speed of IN2P3

- FTS transfers analysed for small files (200 bytes)
42 seconds total transfer time from SLAC to RAL
21 of those seconds were spent getting the checksum from SLAC
RAL checksum times were a fraction of that, or the same as SLAC
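
The mapping of LSST files in Echo against Rucio, noted at the top of this report, amounts to a set difference between a storage dump and the list of files Rucio tracks. The following is a minimal sketch of that comparison, assuming both inputs are plain lists of paths, one per entry. The function name and input format are hypothetical and do not describe the tooling actually used at RAL.

```python
def untracked_files(dump_paths, rucio_paths):
    """Return sorted paths present in the storage dump but not tracked by Rucio.

    Both arguments are iterables of path strings (hypothetical format);
    surrounding whitespace and empty entries are ignored.
    """
    tracked = {p.strip() for p in rucio_paths if p.strip()}
    on_storage = {p.strip() for p in dump_paths if p.strip()}
    return sorted(on_storage - tracked)


if __name__ == "__main__":
    # Illustrative paths only, not real LSST file names.
    dump = ["lsst/raw/exp1.fits\n", "lsst/raw/exp2.fits\n", "lsst/tmp/old.fits\n"]
    rucio = ["lsst/raw/exp1.fits\n", "lsst/raw/exp2.fits\n"]
    for path in untracked_files(dump, rucio):
        print(path)
```

The resulting list is only a set of candidates; whether a file is still needed has to be confirmed with LSST before deletion.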
-
14:00
Tier-1 Projects
-
7
Antares Upgrade
Speakers: George Patargias, Thomas Byrne
-
8
XRootD Development
Speakers: Alexander Rogovskiy (Rutherford Appleton Laboratory), Jyothish Thomas (STFC)
-
9
Utilizing GPUs
Speakers: Dr Brij Kishor Jashal (Rutherford Appleton Laboratory), Thomas Birkett
-
14:25
AOB
-
10
Summary of Operational Status and Issues
Speakers: Brian Davies (Science and Technology Facilities Council STFC (GB)), Darren Moore, Thomas Birkett
-
11
Any other Business
Speakers: Brian Davies (Science and Technology Facilities Council STFC (GB)), Darren Moore
-