Repack Status and plans for September production:

FTS:

 

Input from Michael:

My updates to action list:

    1. Namespace split-up: agreed with Cedric 05/06/2019. See summary in last week's slides. Giuseppe to validate that files under /castor/cern.ch/atlas/atlascerngroupdisk are safely stored in EOS and don't need to be migrated. DONE

    2. Georgios is doing a preliminary analysis which will eventually produce a set of metrics allowing us to build an economic model of colocation (measuring the cost/benefit of different optimisation strategies).

New items:

    1. Compile EOS version with required changes: gRPC API, new checksum protobuf format. This is done in EOS v4.5.0. (Also includes XRootD 4.10 and prepare request tracking, though these features are not required for migration) DONE

    2. Merge CTA schema changes into master. I have rebased my branch on master, made the required changes, and will complete testing today. I will coordinate with Julien on merging back into master, as he plans to do a release before the merge. Aim to have this done by Monday 1/7/2019.

    3. Review final DB schema for migration. Deadline 3/7/2019 (before Giuseppe goes on holiday).

    4. Create DB migration tools for ATLAS instance: alter schema and convert checksums and uid/gid to new format.

    5. Update EOS namespace injection tools to use new gRPC API.

    6. Small-scale metadata migration to validate all tools and workflow for the migration, including handling failure modes.

    7. Milestone: CASTOR DB to be moved to new hardware. Propose to move the CTA ATLAS DB during the same maintenance window. The date is to be set by the DB team (I believe it will be around 15 July); Giuseppe is coordinating with them.

    8. Milestone: Week 22-26 July: Full-scale ATLAS migration test (metadata only, no tapes). This is a functional and performance test. It will allow us to accurately estimate the time needed to do the real migration and to consider if we need to make any further optimisations.
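Item 4 above is essentially a per-row transformation of the ATLAS catalogue. As a minimal sketch of the kind of conversion involved, the snippet below parses a CASTOR-style adler32 hex checksum string into a typed numeric value; the target in reality is the new checksum protobuf format, and all names here (ChecksumType, Checksum, convertAdler32) are assumptions for illustration, not the actual migration-tool code.

```cpp
#include <cstdint>
#include <string>

// Hypothetical target representation; the real migration writes the new
// checksum protobuf format, whose exact layout is not reproduced here.
enum class ChecksumType { ADLER32 };

struct Checksum {
  ChecksumType type;
  std::uint32_t value;
};

// CASTOR stores adler32 checksums as hex strings (with or without a "0x"
// prefix); parse one into a numeric value for the new format.
inline Checksum convertAdler32(const std::string &hex) {
  const std::string digits = (hex.rfind("0x", 0) == 0) ? hex.substr(2) : hex;
  return Checksum{ChecksumType::ADLER32,
                  static_cast<std::uint32_t>(std::stoul(digits, nullptr, 16))};
}
```

The uid/gid remapping in the same item would be an analogous per-row lookup applied during the schema conversion.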


 

Update from Eric on backstop/backpressure status (timelines for September to be added):

Feature | Area | Status
Disk system list (C++ struct): description of a disk system: name, regex to match file URLs, URL to query the free space. | Catalogue | Preliminary
Disk system list management: storing and managing the disk system list. | Catalogue, frontend | To be done, pending definitive C++ struct
Support in retrieve request: attach the disk system name. | Objectstore | Preliminary
Support in retrieve queue: keep track of the disk system name for queued requests. | Objectstore | To do
Space allocation tracking object: keep track of the space committed but not yet used, per disk system. | Objectstore | To do
Support in queuing: classify requests, add info to the queue. | Scheduler | Preliminary
Support in popping (the main part): integrate querying of the space tracker, and possibly the disk system itself, and requeue requests in case of failure. | Scheduler | To do
Support in retrieve mounts: keep track of (temporarily) full disk systems. | Objectstore+scheduler | To do
Support in mount scheduling: skip mounts for which no space was found (sleep the mount for 15 minutes). | Objectstore+scheduler | To do
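The first table row explicitly calls for a C++ struct describing a disk system, and the "popping" row describes a free-space check with requeue on failure. The sketch below is a hedged illustration of both under assumed names (DiskSystem, matchesDiskSystem, canPop); it is not the actual CTA definition, whose final shape is still pending per the table.

```cpp
#include <cstdint>
#include <regex>
#include <string>

// Hypothetical sketch of the disk system description from the table above;
// field names are assumptions, not the definitive CTA struct.
struct DiskSystem {
  std::string name;              // e.g. "eosctaatlas" (illustrative)
  std::string fileRegexp;        // regex matched against destination file URLs
  std::string freeSpaceQueryURL; // endpoint used to query current free space
};

// Decide whether a retrieve request's destination URL belongs to this disk
// system (used to attach the disk system name at queuing time).
inline bool matchesDiskSystem(const DiskSystem &ds, const std::string &dstURL) {
  return std::regex_search(dstURL, std::regex(ds.fileRegexp));
}

// Backpressure decision at popping time: pop the retrieve request only if the
// disk system's free space, minus space already committed to queued requests,
// still covers the file size; otherwise the request would be requeued.
inline bool canPop(std::uint64_t freeSpace, std::uint64_t committedSpace,
                   std::uint64_t fileSize) {
  return freeSpace >= committedSpace + fileSize;
}
```

The committed-but-unused space in canPop is what the "space allocation tracking object" row would maintain per disk system.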

 

Action list:

Who | What | By when
Eric | Agree with ATLAS on the list of "activities" and configure via cta-admin. Deploy "activities" on ATLAS. | 27/5
Cédric | Implement repacking taking into account disabled tapes and drive dedications. | 30/5
Julien | Ensure the CTA team is copied on exchanges with ATLAS and other experiments. | 24/5
Julien | Talk to procurement and network people (to ensure all network infrastructure is in place when the nodes arrive). | 30/5
Michael | Ensure that Georgios gets in touch with Luc to advance discussions on modelling colocation hints and assessing their usefulness. | 30/5
Julien/Andrea | Explicit stager_rm follow-up. | 13/6
Andrea | Agree Rucio->FTS metadata format for colocation hints and storage classes. | 13/6
Eric | Propose and discuss with the FTS team a format for receiving colocation hints (in addition to storage classes and activities) from FTS. | 13/6
Julien | Identify the right hardware to run the migration. | 13/6