EOSCTA Post-Migration Issues
LHCb
- Failing XRootD TPC transfers to RAL are causing noise which makes it difficult for Julien to evaluate his solution for evict after HTTP prepare.
- Oliver will ask Chris if he can switch off checksum validation temporarily, at least until Julien has checked that his solution works satisfactorily.
NA62
- Vova: check if !d works correctly on directories with no CTA workflow enabled.
- Vova: follow up with EOS devs to ensure that p and !d permissions work correctly when there is no w permission.
- The problem with transfers from T2s (INC2748013) was not present in CASTOR because it was more permissive (firewall open to sites outside LHCONE). User is happy to write the code to transfer the file in 2 hops via EOS PUBLIC, but says there is not enough storage. Solution: ask how much storage they need for this and ask EOS team to provide a space for these intermediate hops. Note: this use case will no doubt come up again with other experiments.
- Querying whether a file is on tape using the gfal2 Python API: resolved by upgrading to the latest packages.
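
For reference, a minimal sketch of how such a tape-residency query might look with the gfal2 Python bindings. It assumes the storage endpoint reports file locality through the "user.status" extended attribute; the helper names (is_on_tape, query_status) are illustrative and do not come from the NA62 code.

```python
# Locality values that gfal2 can report via the "user.status" extended
# attribute when a copy of the file exists on tape.
TAPE_STATES = {"NEARLINE", "ONLINE_AND_NEARLINE"}


def is_on_tape(status):
    """Return True if the locality string indicates that a tape copy exists."""
    return status.strip().upper() in TAPE_STATES


def query_status(url):
    """Ask the storage endpoint for the locality of `url` (needs gfal2 installed)."""
    # Imported lazily so that is_on_tape() can be used without gfal2.
    import gfal2

    ctx = gfal2.creat_context()
    # Assumption: the endpoint supports the "user.status" attribute.
    return ctx.getxattr(url, "user.status")
```

A caller would combine the two, for example is_on_tape(query_status(url)), where url is the full path of the file on the CTA instance. A status of ONLINE alone means the file is only on disk and has not yet been archived.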
CTA PUBLIC Migration from CASTOR
DUNE
- Ready to migrate after Easter.
- Vlado repacked the 6 tapes.
- Steve and Vova will write OTGs for CASTOR/CTA.
n_TOF
- Vova: check up on the exact use case for "nsfind"/"eos find". We need to know roughly how many files are in the directories scanned and how frequently this operation will be performed.
- We don't have a solution to the "early-file review" problem, where n_TOF examine the detector signals of DAQ files (like in offline analysis). Ideally n_TOF would like a copy of archived raw data to stay around in the CTA spinner space, but this is not possible without development effort. n_TOF have two alternative solutions, which they plan to evaluate later this year:
  - Modify FileDisplay to re-construct files directly from the crates, providing a really early aggregated overview of acquired data.
  - Implement FTS back-end in RawFileMerger (instead of Xroot) to upload the file in 2 places with a single stream.
- Sylvain said, "We can schedule the evaluation for the first solution sooner, it'll give a better idea of the workload on our side."
- FTS multi-hop was also proposed to them but they don't want to have to manage clean-up of the intermediate file when the early file review is finished.
- Given this blocker, it seems unlikely that it will be possible to migrate n_TOF before their physics schedule starts in the summer.
- Michael: arrange a meeting with n_TOF representatives to discuss the likely schedule and ensure that they are aware of the consequences of not migrating (recalls will be slow due to limited tape drives).
Status Updates: Experiments
- Vova has pinged COMPASS to ask if they can start testing recalls to the spinner space.
- Vova will contact CLIC in April.
- AMS are blocked on authentication problems. To be followed up.
- Michael will respond to RQF1772720 and follow up on ILC. Migration of ILC depends on the repack of public_user.
Status Updates: Backup
- Ideally we want to migrate CASTOR backup use cases by September, as Steve will have to merge the CASTOR disk pools and Giuseppe advised not to merge backup disk pools with the other ones.
- Repack/migration of data from CASTOR to CTA is not a blocker as we could migrate the users and leave most of the files in CASTOR until they expire. In principle only a small amount of data actually has to be migrated.
- Encryption is a prerequisite, however.
Status Updates: Other Use Cases
- ~30 tapes with LEP tapepool data still to be repacked.
- Migration of LEP tapepool can be scheduled when Vlado is back after Easter holidays.
CTA PUBLIC New Use Cases
- References to FTS Pilot in KB articles should be replaced with FTS Public (Michael/Vlado)
- We will pass Vlado's Codi document to Spacal and FASER and let them test the instructions. We will then edit it according to their feedback and publish the final version in the KB.
RAL
- Michael will follow up on Alastair's question about NA62 and FTS multi-hop.
- We would like to have a schedule from RAL on the critical milestones and where they will need help from us.
CTA Software
- Michael will start performance tests on Postgres, initially using the instance that the DB team provided to Julien. When we reach the limit of performance we will set up our own test instance which we can tune ourselves.
- We will test queue sizes of at least 100 million entries.