FTS update

Meeting with LHCb

LHCb has a requirement to transfer data from CTA to Tier-1s (reprocessing use case). These transfers are orchestrated by LHCb.

This would require 4-6 T1s to upgrade their infrastructure. The following options are on the table, ordered from most to least desirable from LHCb's POV:

Eddie is part of a working group tracking the schedule for dCache upgrades at the T1s. He reports that the following T1s have already upgraded: FZK, NDGF, IN2P3-CC and SARA. All T1 sites are obliged to provide an alternative to GridFTP by the end of 2019.

The new version of gfal (with "query prepare") has been released and is in pilot; it will be deployed in production in around one week. The "check m-bit" feature will be in a future release.

Actions

CTA test status

Repack test

The bug that stopped repack was fixed.

If the EOS instance runs out of disk space, the file being written is truncated during the flush to disk, but the CLOSEW event is executed anyway. In this case the file is written to tape but fails verification afterwards because the size/checksum do not match. Ideally we would prevent the file from being written to tape in the first place.
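The truncation failure described above could be caught before the file is queued for tape by re-checking its size and checksum at archival time. A minimal sketch in Python, assuming adler32 checksums (as used by EOS); the helper names are illustrative, not part of any real EOS or CTA API:

```python
import zlib

def adler32_hex(data: bytes) -> str:
    """Compute the adler32 checksum of a byte string as 8 hex digits."""
    return format(zlib.adler32(data) & 0xFFFFFFFF, "08x")

def safe_to_archive(data: bytes, expected_size: int, expected_adler32: str) -> bool:
    """Reject a file for tape archival if it was truncated (size mismatch)
    or corrupted (checksum mismatch) during the flush to disk."""
    if len(data) != expected_size:
        return False
    return adler32_hex(data) == expected_adler32

payload = b"event data"
ok = safe_to_archive(payload, len(payload), adler32_hex(payload))             # True
truncated = safe_to_archive(payload[:4], len(payload), adler32_hex(payload))  # False
```

With such a check, a truncated file would fail fast at the disk layer instead of being written to tape and rejected afterwards.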

Actions

DB Schema Versioning

The DB schema for CTA v1.0 will be finalised on 6 Dec 2019. This includes the columns needed for future features like RAO+LTO.

Cédric is working on DB schema versioning.

The CTA catalogue will be wiped after the ATLAS recall test in January. The migration will be done onto a fresh install of the DB schema.
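Schema versioning of this kind typically records the installed major/minor version in a dedicated table, so that a client can refuse to run against an incompatible schema. A minimal sketch using SQLite; the table and column names here are illustrative assumptions, not the actual CTA catalogue schema:

```python
import sqlite3

def init_schema(conn: sqlite3.Connection, major: int, minor: int) -> None:
    """Create a single-row table recording the installed schema version."""
    conn.execute(
        "CREATE TABLE SCHEMA_VERSION (MAJOR INTEGER NOT NULL, MINOR INTEGER NOT NULL)"
    )
    conn.execute(
        "INSERT INTO SCHEMA_VERSION (MAJOR, MINOR) VALUES (?, ?)", (major, minor)
    )

def check_schema(conn: sqlite3.Connection, expected_major: int) -> bool:
    """A client is compatible if the major version matches; minor bumps are
    assumed to be backwards-compatible additions (e.g. new columns)."""
    major, _minor = conn.execute("SELECT MAJOR, MINOR FROM SCHEMA_VERSION").fetchone()
    return major == expected_major

conn = sqlite3.connect(":memory:")
init_schema(conn, 1, 0)                            # fresh install of the v1.0 schema
compatible = check_schema(conn, expected_major=1)  # True
```

Wiping the catalogue and migrating onto a fresh install then amounts to creating the v1.0 schema from scratch and loading the migrated data, rather than upgrading in place.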

Actions

CTA version 1.0

Everything to be included in CTA version 1.0 should be committed to master by Friday 6 December. The process of tagging/creating RPMs etc. will be done w/c 9 December.

Testing: Oliver mentioned that various suites of test scripts and tools exist for DPM and gfal.

Actions

Documentation

Documentation has been reorganised into developer docs (LaTeX/PDF) and operator docs (mkdocs).

Monitoring

There are 3 main use cases for monitoring:

  1. Management information: historical info, e.g. evolution of the size of data/number of files stored on tape
  2. Day-to-day operational information (tape operators)
  3. Performance management (developers)

The data collection for case (1) needs to be in place before we start taking physics data, i.e. before we migrate ATLAS.
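For use case (1), the raw archive events only need to be rolled up into a daily time series of cumulative files and bytes on tape. A minimal sketch of that aggregation; the `(date, size)` record layout is a hypothetical stand-in for whatever the collection pipeline emits:

```python
from collections import defaultdict

def daily_totals(archive_events):
    """Aggregate (date, size_bytes) archive events into the cumulative
    number of files and bytes stored on tape per day."""
    per_day = defaultdict(lambda: [0, 0])  # date -> [files added, bytes added]
    for date, size in archive_events:
        per_day[date][0] += 1
        per_day[date][1] += size
    total_files = total_bytes = 0
    series = []
    for date in sorted(per_day):
        files, nbytes = per_day[date]
        total_files += files
        total_bytes += nbytes
        series.append((date, total_files, total_bytes))
    return series

events = [("2019-12-01", 100), ("2019-12-01", 50), ("2019-12-02", 25)]
series = daily_totals(events)
# → [("2019-12-01", 2, 150), ("2019-12-02", 3, 175)]
```

Keeping the cumulative series rather than raw events is enough for the "evolution of data/number of files on tape" plots that management information requires.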

There are several things missing from our current monitoring:

Daniele Lanza's monitoring (which goes to HDFS?) and Aurelian's collectd sensor: these were implemented for CASTOR and will need to be ported to CTA?

Actions

CASTOR PUBLIC

It was agreed that the CASTOR backup hardware decommissioning will not influence the migration schedule for CTA.

There are several things which need to be done before we can migrate the backup services to CTA:

To be revisited next year.

AOB