CTA Development Priorities
Discussion on Fri 9 October. Present: Cédric, Julien, Michael, Steve.
We agreed the following priorities for the coming weeks for tasks related to repack:
These features have been implemented by Cédric. They should be deployed and tested.
2. Tape Server: Add configuration options for the maintenance process.
To give greater control over the highly distributed maintenance process:
a. Add an option to disable repack operations on a per-tape-server basis, so we can control which tape servers are handling repack jobs.
b. Add an option to disable the maintenance process entirely, so we can run maintenance jobs on a smaller cluster of tape servers. This means that when we are tracing errors, we only have to check the logs of one or two dozen tape servers instead of searching across a couple of hundred.
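As a rough illustration of how the two options above could interact, here is a minimal Python sketch. The class, field and function names are hypothetical and are not existing CTA configuration keys. It also assumes that repack operations run as part of the maintenance process, so a server with maintenance disabled handles no repack jobs either.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TapeServerConfig:
    """Hypothetical per-tape-server settings (names are illustrative only)."""
    hostname: str
    run_maintenance: bool = True  # option (b): False disables the maintenance process
    run_repack: bool = True       # option (a): False disables repack on this server


def maintenance_hosts(configs):
    """Tape servers whose logs need checking when tracing maintenance errors."""
    return [c.hostname for c in configs if c.run_maintenance]


def repack_hosts(configs):
    """Tape servers handling repack jobs.

    Assumption: repack runs inside the maintenance process, so both
    flags must be enabled for a server to take repack work.
    """
    return [c.hostname for c in configs if c.run_maintenance and c.run_repack]
```

With a fleet of a couple of hundred tape servers, setting run_maintenance to False on all but a dozen or two would shrink the list returned by maintenance_hosts to just those machines.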
3. Remove "superseded" files in favour of the recycle bin.
This is a simplification: instead of two ways of handling deleted and repacked files, we settle on one mechanism, the recycle bin. "Superseded" should be removed.
Files in the recycle bin will not appear in tape listings using the normal tools ("cta-admin tapefile ls" etc.) and there will be no option to list them.
This task includes a separate operator tool to list files in the recycle bin and to delete them. (Q. Should operators be able to manually delete files from the recycle bin or should this be done automatically when tapes are reclaimed?)
In future we will want a tool to allow reinjection of deleted files from the recycle bin, but this is not an immediate priority, as it can be done manually by a developer if necessary. We will develop the tool when the need arises.
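The intended behaviour of the recycle bin can be sketched with a toy in-memory model. This is not the CTA catalogue schema: ToyCatalogue, TapeFile and all method names are hypothetical. It only illustrates that deleted and repacked files go through one mechanism, and that the normal listing and the operator listing are separate.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TapeFile:
    archive_file_id: int
    vid: str  # volume ID of the tape holding this copy


@dataclass
class ToyCatalogue:
    """Hypothetical in-memory stand-in for the catalogue."""
    live: list = field(default_factory=list)
    recycle_bin: list = field(default_factory=list)

    def move_to_recycle_bin(self, tape_file):
        """Single path for both deleted and repacked files."""
        self.live.remove(tape_file)
        self.recycle_bin.append(tape_file)

    def list_tape_files(self, vid):
        """Normal listing: recycle bin entries never appear here."""
        return [f for f in self.live if f.vid == vid]

    def list_recycle_bin(self, vid):
        """Separate operator listing of recycle bin entries for one tape."""
        return [f for f in self.recycle_bin if f.vid == vid]
```

A future reinjection tool would be the inverse of move_to_recycle_bin. It is left out here, in line with the decision to defer it.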
4. Investigate if we can run a separate repack instance sharing the same catalogue.
There are a number of problems caused by mixing repack with normal operations, because there is no separation of repack queues from the normal retrieve queues. It would be a big development effort to add separate repack queues.
An alternative proposal is to run a separate repack instance, which has its own object store and dedicated tape drives. Only the catalogue and the tapes would need to be shared with the "main" production CTA instance.
This proposal needs to be investigated to confirm it is feasible, to propose a solution to contention for tapes between the main instance and repack instance, and to identify any other problems that may arise.
In parallel, Steve will work on solving the garbage collector problems identified in this week's tests.
Q. Should operators be able to manually delete files from the recycle bin or should this be done automatically when tapes are reclaimed?
This is already the case: if a user deletes all the files from a tape and the tape is then reclaimed, the files are deleted from the recycle bin. Cédric believes we should do the same for repacked tapes.
In this case we don't need a tool for operators to delete files from the recycle bin manually. (At least, let's not invest effort in that until someone comes up with a use case. Just listing the files in the recycle bin is enough for now.)
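The automatic clean-up described in this answer could look roughly like the following Python sketch. The function name and the data layout (lists of (vid, archive_file_id) pairs) are assumptions made for illustration, not the actual reclaim implementation.

```python
def reclaim_tape(vid, live_files, recycle_bin):
    """Return the recycle bin contents after reclaiming tape `vid`.

    Both arguments are lists of (vid, archive_file_id) pairs.
    Hypothetical rule: a tape can only be reclaimed when it holds no
    live files; reclaiming then drops all of its recycle bin entries,
    whether they got there by user deletion or by repack.
    """
    if any(v == vid for v, _ in live_files):
        raise ValueError(f"tape {vid} still holds live files")
    return [(v, file_id) for v, file_id in recycle_bin if v != vid]
```

Because reclaiming empties the tape's recycle bin entries as a side effect, operators would only need a listing tool, as concluded above.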