CTA deployment meeting

Europe/Zurich
31/1-012 (CERN)

Michael Davis (CERN)

Schema Validation Tool

We agreed that the Liquibase solution does what we want for the time being. The risk of software abandonment or lock-in is low, because we can easily replace Liquibase with something else.

Actions

  • Cédric: add a check of the schema version before performing the upgrade.
  • Cédric: add an extra column to record whether the DB is in an intermediate state ("upgrading"). Do not allow CTA to start if this is set to true.
  • Cédric: add the procedure for upgrading the schema to the tape operations website.
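
The first two actions can be sketched as follows. This is only a minimal illustration, not the agreed implementation: it uses Python's built-in sqlite3 module as a stand-in for the production database, and the table name schema_info, its columns, and all function names are hypothetical. It does not use Liquibase.

```python
import sqlite3


def create_catalogue(conn, major, minor):
    """Create the (hypothetical) schema_info table with a single status row."""
    conn.execute(
        "CREATE TABLE schema_info ("
        " major INTEGER NOT NULL,"
        " minor INTEGER NOT NULL,"
        " upgrading INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "INSERT INTO schema_info (major, minor) VALUES (?, ?)", (major, minor)
    )
    conn.commit()


def read_state(conn):
    """Return ((major, minor), upgrading) from the status row."""
    major, minor, upgrading = conn.execute(
        "SELECT major, minor, upgrading FROM schema_info"
    ).fetchone()
    return (major, minor), bool(upgrading)


def check_can_start(conn):
    """Refuse to start if the schema is in an intermediate state."""
    version, upgrading = read_state(conn)
    if upgrading:
        raise RuntimeError("schema is marked as upgrading; refusing to start")
    return version


def upgrade(conn, expected_from, to_version, apply_changes):
    """Check the current version, then upgrade with the flag set throughout."""
    version, upgrading = read_state(conn)
    if upgrading:
        raise RuntimeError("a previous upgrade did not complete")
    if version != expected_from:
        raise RuntimeError(
            f"schema is at {version}, expected {expected_from}; not upgrading"
        )
    # Commit the flag first so that it survives a failure part-way through.
    conn.execute("UPDATE schema_info SET upgrading = 1")
    conn.commit()
    apply_changes(conn)
    # Record the new version and clear the flag only after all changes succeed.
    conn.execute(
        "UPDATE schema_info SET major = ?, minor = ?, upgrading = 0", to_version
    )
    conn.commit()
```

In this sketch the flag is committed before any schema change is applied, so an upgrade that fails half-way leaves the flag set and check_can_start keeps refusing until someone intervenes.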

 

ATLAS Recall Exercise

CTA v1.0-3 is deployed, with EOS and FTS running on XRootD 4.11.0.

Note that we have to use ctaprod for the recall exercise because the source and target DBs for migration have to be on the same Oracle instance, and ctaprod is the only DB schema available on the CASTOR production instance.

Update after the meeting: the ATLAS migration was done on Friday/Saturday. Julien ran tests over the weekend. We are ready for the ATLAS recall exercise starting on Monday.

 

Other EOS+CTA Testing

The concurrent archival/retrieval/deletion "mutex" test and the Rucio+FTS multi-hop test are in progress.

 

Repack

Five tapes were repacked one-at-a-time.

The next test changed the mount policy to use three drives in parallel. There was an intervention on the library, so the drives had to be shut down. When they were brought back up, repack restarted but did not complete properly. This was identified as a logic problem in the queuing system when no drives are available.

Note: This problem is a general one which does not only affect repack.

Vlado will continue with tests when drives are available again (after the ATLAS recall exercise): 10 tapes × 3 drives; 10 tapes × 5 drives; etc.

David is working on CTA repack automation scripts.

Actions

  • Fix queuing logic when no drives are available (see issues #736 and #737)

 

EOS Issues

  1. Issue with TPC identified as an XRootD bug introduced in v4.11.1. The workaround is to revert to v4.11.0 until a fix is available.
  2. Different XRootD versions and resulting confusion: the EOS team has a solution; we need to ensure CTA is consistent with what they are doing.
  3. FST is inheriting from a class in the private area of XRootD, which caused an ABI incompatibility. In principle production software should not rely on private classes. To be resolved.
  4. QuarkDB: there is apparently a version available which does not depend on XRootD. However, we do not want to be the guinea-pigs.
  5. EOS immutable files: it is imperative that we set the immutable file attribute on tape-backed directories. The issue with immutable files must be fixed in EOS before we go into production.

 

AOB

  • Julien will ask for more CI runners so that our pipelines can complete in a reasonable time.
    • 14:00 - 14:10
      Schema Validation Tool 10m

      Cédric: Update and proposals

    • 14:10 - 14:20
      ATLAS recall exercise 10m

      The ATLAS reprocessing campaign has been delayed by a couple of days due to a bug in the reprocessing software. The new tentative starting date is beginning of next week. We expect an update on Friday.

      What remains to be done on our side?

    • 14:20 - 14:30
      Other EOS+CTA Testing 10m

      Status update on the following test activities:

      1. Test concurrent archival, retrieval, deletion ("the mutex test")
      2. Rucio+FTS multi-hop test
      3. ATLAS archival stress testing (will be done after recall exercise)
    • 14:30 - 14:40
      Repack 10m

      Vlado: report on status of repack test

    • 14:40 - 14:50
      Brief status updates 10m
      1. The FST delete on close has been implemented and a test is in CI. To be merged to master.
      2. cta-admin tapefile ls is now in CTA master. archivefile ls is deprecated and will be removed in due course.
      3. Steve has a solution for the immutable files issue.
      4. Michael is meeting with Melissa on Monday to discuss the CTA logo.
      5. Reminder that the CTA repository is public and must not contain any third-party software.
    • 14:50 - 15:00
      AOB 10m