CTA deployment meeting

Europe/Zurich
31/S-023 (CERN)

Status of operational procedures and scripts

  • See slides presented here.
    • Supply pool logic: Ready to be put in production.
      • Pool starvation (due to unavailability of libraries and/or empty supply pools) is detected and error logs are generated; an alarm still needs to be added.
      • Authentication via Rundeck: solved
      • Will be deployed the week of 23/7
    • Daily TSMOD report:
      • missing (Vlado to create ticket)
    • Labelling:
      • CTA metadata checks done. LBP support is not available yet. As discussed after the meeting, Viktor will implement the CERN-independent labelling functionality by modifying the current CASTOR one (see gitlab ticket).
    • Tape (un)mount: done
    • Tape media-check, drive-test, eject:
      • WIP. Timelines to be provided by David
    • Alarm system:
      • WIP. Timelines to be provided by David
    • Tape pool and queue monitoring:
      • Including fine-grained queue monitoring as currently available via CASTOR stager activity plots (see below). Timelines to be provided by Aurélien
  • Other operational GUIs/interfaces:
    • TOMS (link):
      • Tape services and drives are and will remain operational for both CASTOR and CTA. Tape pool configuration for CTA is done via cta-admin; no GUI is needed.
    • CASTOR dashboard - name server (link):
      • Needs discussion with Giuseppe. May require additions such as running counters for tracking per-VO file/volume information.
    • CASTOR stager activity (link):
      • Several plots are already covered by existing CTA grafana plots provided by Julien. Fine-grained queue monitoring for migrations and recalls can now be obtained via cta-admin and should therefore be integrated into the tape pool and queue monitoring (Aurélien).
    • Data Volume on tape (link):
      • This plot is based on tapelog accounting and is used by DHO members. Germán to take over.
    • Tape stats by density history (link):
      • Plot based on tapelog and VMGR weekly dump information. Requires an equivalent to nslisttape -s. Germán to take over.
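
The pool starvation check noted under "Supply pool logic" could be sketched as follows. This is an illustration only, not the production script: the SupplyPool record, its fields, the find_starved_pools function and the logger name are all hypothetical, and the real logic may use different inputs.

```python
import logging
from dataclasses import dataclass

# Hypothetical logger name; the real scripts may log elsewhere.
logger = logging.getLogger("supply_pool_check")


@dataclass
class SupplyPool:
    """Hypothetical record describing one supply pool."""
    name: str
    library: str      # library holding the pool's tapes
    free_tapes: int   # tapes still available for refilling


def find_starved_pools(pools, available_libraries):
    """Return (pool name, reason) pairs for pools that cannot supply tapes.

    A pool counts as starved if its library is unavailable or if it has
    no free tapes left. An error is logged for each starved pool; raising
    an alarm would be a separate, later step.
    """
    starved = []
    for pool in pools:
        if pool.library not in available_libraries:
            reason = "library unavailable"
        elif pool.free_tapes == 0:
            reason = "supply pool empty"
        else:
            continue
        logger.error("Pool starvation: %s (%s)", pool.name, reason)
        starved.append((pool.name, reason))
    return starved
```

Returning the list of starved pools, rather than only logging, would let a future alarm hook reuse the same check.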

Status for full-scale metadata migration test:

  • See attached input from Michael (link)
  • DB schema will be reviewed at next deployment meeting.
  • While the primary target is to measure metadata import performance (~90M ATLAS files) onto a new EOS+CTA environment, a functional test involving recalls from actual ATLAS tapes should be performed as well. This requires re-allocating at least one drive per (CTA-visible) DGN to a separate CTA instance; one SSD disk server should be enough (action on Julien).
    • Imported CASTOR tapes (FULL tapes) will appear as "fromCASTOR" and will not be writable by CTA. They will still be visible and accessible from CASTOR without requiring any change on the CASTOR side.
  • Questions from Michael:
    • Storage classes and tape pools will be imported 1-1 from CASTOR. It is easy to merge them afterwards in the real production environment.
    • Archive routes will be set up via cta-admin; no automation is required.
    • Symbolic links: Does ATLAS have any? Michael to check.
    • Extended attributes: Julien will provide guidance to Michael on which attributes / attribute links need to be set.
    • Note that, while OK for ATLAS, enforcing that storage class information is sent for each filename is unlikely to work for all VOs, so support for inheriting storage class information will be required. This needs to be discussed with Eric.
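
One way to picture storage class inheritance is a lookup that walks up the directory tree until it finds a directory with an explicitly assigned class. The sketch below is only an illustration of that idea and makes no claim about how EOS or CTA implement it: the function name, the dictionary of explicit assignments, and the paths and class names in the usage example are all hypothetical.

```python
from pathlib import PurePosixPath


def resolve_storage_class(path, explicit_classes):
    """Return the storage class for `path`, inheriting from ancestors.

    `explicit_classes` maps directory (or file) paths to storage class
    names. The nearest ancestor with an explicit entry wins; None is
    returned if no ancestor has one.
    """
    current = PurePosixPath(path)
    while True:
        key = str(current)
        if key in explicit_classes:
            return explicit_classes[key]
        if current == current.parent:  # reached the top of the tree
            return None
        current = current.parent


# Hypothetical assignments, for illustration only.
classes = {"/eos/atlas": "atlas_default", "/eos/atlas/raw": "atlas_raw"}
print(resolve_storage_class("/eos/atlas/raw/run1/file.root", classes))
```

With such a scheme, a client would only need to set the storage class once per directory tree instead of sending it with every file.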

Discussion with Massimo / ALICE

  • See attached document (link).
  • While data rates to tape as presented to the GDB are ~7GB/s (link), Massimo assumes 30GB/s.
  • What would be the equivalent of df in EOS? See answer by Andreas (link).
  • We should tell ALICE to check for tape residency.
  • To be continued at the next meeting.

 

    • 14:00 14:20
      Status of operational procedures and scripts 20m

      The following items are expected for September:
      - labelling and supply pool management
      - completion of drive (collectd) and central (tapelog) alarm monitoring
      - displays enhancing/replacing the still-missing tapeops Grafana plots and the plots not yet available from the CASTOR dashboard

      Speakers: Aurelien Gounon (CERN), David Fernandez Alvarez (Universidad de Oviedo (ES)), Vladimir Bahyl (CERN)
    • 14:20 14:40
      Status for full-scale metadata migration test (22-26 July) 20m
      • DB schema
      • Import scripts
      • CTA SW deployment
      • Test procedure updates (as of 20/6):
        The full-scale test will consist of:
        • importing all FULL ATLAS CASTOR tapes into a test EOSCTA instance (~90M files, 6500 tapes)
        • running mass file retrievals on a subset of these tapes (~100 tapes). (These tapes can be set to EXPORTED in CASTOR if we want to avoid recall failures in the case of concurrent requests, but such errors should be harmless.)
        • inviting ATLAS to run retrievals
      Speakers: Giuseppe Lo Presti (CERN), Michael Davis (CERN)
    • 14:40 15:00
      ALICE clarifications 20m
    • 15:00 15:20
      Review of actions, AOB, items for next meeting 20m