CTA deployment meeting

Europe/Zurich
600/R-001 (CERN)


Migration status:

  • The slides presented are found here. See also the related GitLab ticket.
  • When running in parallel, the DB metadata import runs at 2 kHz, which is amply sufficient.
  • Two flags are missing, FROMCASTOR and RDONLY (tickets to be created). As a reminder, we will never append data to CASTOR tapes.
  • The final review of the DB schema will be done by Michael / Giuseppe. DB upgrade scripts à la CASTOR will come later; the priority is to consolidate the final DB schema.
  • All CTA team members will be around during the week of 22-26 July, when the full-scale metadata migration test will run.
  • The full-scale test will consist of:
    1. Importing all FULL ATLAS CASTOR tapes into a test EOSCTA instance (~90M files, 6500 tapes).
    2. Running mass file retrievals on a subset of these tapes (~100 tapes). (These tapes can be set to EXPORTED on CASTOR if we want to avoid recall failures in the case of concurrent requests, but such errors should be harmless.)
    3. Inviting ATLAS to run retrievals.
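
As a rough cross-check of the figures above, the import time for the full-scale test can be estimated from the quoted rate. This is a back-of-envelope sketch only: it assumes the 2 kHz figure is a sustained aggregate rate in files per second and that it holds for the whole ATLAS import, which the minutes do not state. The function and variable names are illustrative.

```python
# Back-of-envelope estimate of the metadata import duration.
# Figures taken from the minutes: ~90M files, 6500 tapes, 2 kHz import rate.
# Assumption: 2 kHz is a sustained aggregate rate in files per second.

FILES_TO_IMPORT = 90_000_000  # ~90M ATLAS files
TAPES = 6_500                 # FULL ATLAS CASTOR tapes
IMPORT_RATE_HZ = 2_000        # files imported per second


def import_duration_hours(n_files: int, rate_hz: float) -> float:
    """Return the time in hours needed to import n_files at rate_hz files/s."""
    return n_files / rate_hz / 3600


hours = import_duration_hours(FILES_TO_IMPORT, IMPORT_RATE_HZ)
files_per_tape = FILES_TO_IMPORT / TAPES

print(f"Estimated import time: {hours:.1f} h")
print(f"Average files per tape: {files_per_tape:,.0f}")
```

Under these assumptions the import would take about 12.5 hours, which fits comfortably within the test week.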

Pending tasks, priorities for deployment

  • Operational scripts and dependencies (Vlado):
    • Vlado will present status and plans on 4/7. The following items are expected for September:
      • labelling and supply pool management
      • completion of drive (collectd) and central (tapelog) alarm monitoring
      • displays enhancing or replacing the still-missing tapeops Grafana plots (link) and the plots not yet available from the CASTOR dashboard (link)
    • Vlado recalls the need for drive dedications (to be implemented), which allow, among other use cases, keeping problematic drives out of production.
  • Backstop:
    • implementation ongoing by Eric (link) with top priority.
  • GC:
    • mail from Steve:
      • The CTA GC work as a whole is not yet finished. In finer detail:
        • The FST GC now accounts for delays in stagerrm.
        • The FST GC Python code is now unit tested using Python's unittest module.
        • The FST GC is currently being modified so that it can work on Julien's multi-VO FST disk server boxes.
        • Further testing of the FST GC has to be completed before it can be declared done.
  • FTS:
    • Storage Classes, Activities, Hints: These will be propagated to EOSCTA via FTS. Andrea is following up with ATLAS (Cédric) to finalise their parameter syntax; no major work is involved here.
    • Buffer cleanup via XROOT eviction: XROOT protocol part completed by Andy/Michal. FTS support will be testable next week; Steve will check how to propagate this to "eos stagerrm".
    • FTS multi-hop is supported now and ready to be used by ATLAS (Andrea will follow up).
    • FTS "m" bit check: Completion expected during July.
  • Repack:
    • Cédric / Eric will present status and plans at the next deployment meeting in view of September production usage.

 

Action list (not reviewed during the meeting, to be updated by each action owner):

Actions
Who | What | By when
Eric | Agree with ATLAS on the list of "activities" and configure it via cta-admin. Deploy "activities" on ATLAS. | 27/5
Michael | Complete (with Cédric S.) the namespace split-up. | 30/5
Cédric | Implement repacking, taking into account disabled tapes and drive dedications. | 30/5
Julien | Ensure the CTA team is copied in exchanges with ATLAS and other experiments. | 24/5
Julien | Talk to procurement and network people (to ensure all network infrastructure is in place when the nodes arrive). | 30/5
Michael | Ensure that Georgios gets in touch with Luc to advance discussions on modelling collocation hints and assessing their usefulness. | 30/5
Julien/Andrea | Explicit stager_rm follow-up. | 13/6
Andrea | Agree on the Rucio->FTS metadata format for collocation hints and storage classes. | 13/6
Eric | Propose and discuss with the FTS team the format for receiving collocation hints (in addition to storage classes and activities) from FTS. | 13/6
Julien | Identify the right hardware to run the migration. | 13/6

 
