CTA deployment meeting

Europe/Zurich
31/S-023 (CERN)

ATLAS next steps: (following discussion with Alessandro on 22/5)

  • Activities:
    • Andrea: ATLAS have 13 activities configured in FTS, but only one ("staging") is used for retrieving. We discussed with Alessandro setting up (at least) two: "grid" and "t0".
    • The activity is sent to EOS in extended attribute format, but it is never set on the file because it is transient information.
    • Eric will circulate an example of what the URL looks like.
    • Eric will contact ATLAS to agree on the list of activities and the exact details. The timeline for completing the interface with ATLAS is still next Monday. The back-end still requires work: integration into mount scheduling and the CLI.
  • Storage classes:
    • Alessandro to send us the names of the initial ~10 storage classes for ATLAS so that we can start configuring them
    • Eric will communicate the exact URL parameters to use for setting the class for each file to be transferred.
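
Illustration only: until Eric circulates the real URL format, the idea of carrying the activity and the storage class as URL parameters can be sketched as below. The parameter names `activity` and `storage_class`, the helper `with_transfer_hints`, the example host and the example class name are placeholders invented for this sketch, not the agreed interface.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; the real ones are still to be communicated.
ACTIVITY_PARAM = "activity"
STORAGE_CLASS_PARAM = "storage_class"


def with_transfer_hints(url, activity=None, storage_class=None):
    """Return url with the optional hints appended to its query string."""
    parts = urlsplit(url)
    # Keep any parameters that are already present on the URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    if activity is not None:
        query.append((ACTIVITY_PARAM, activity))
    if storage_class is not None:
        query.append((STORAGE_CLASS_PARAM, storage_class))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Placeholder host, path and class name, for illustration only.
    print(with_transfer_hints("https://example.invalid/eos/file",
                              activity="t0",
                              storage_class="example_class"))
```

Once the names are agreed, only the two constants should need to change.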

ATLAS instance status:

  • ATLAS staging test:
    • Julien has agreed with Cédric S. to use CTA instead of CASTOR for the next tape carousel test. ~200TB of data will be created and then recalled. File creation starting ~now, recalls starting Monday, exercise to be finished on Wednesday.
    • Julien: All required HW, media and configuration (e.g. GC) are ready: everything is under control.
  • ATLAS data flow:
    • Julien has agreed with D. Cameron to subscribe EOSCTAATLAS to ~1% of Rucio data creation. This will bring the long-awaited continuous data flow from ATLAS.
  • Wiping older files:
    • ATLAS will clean older files in Rucio and expect us to do the same ourselves in CTA, as this is the fastest mechanism. However, this will be neither appropriate nor desirable once the switchover is completed. Julien to follow up with ATLAS on how deletions should take place. Julien to check with CMS whether files can be removed to ease the schema upgrade.
  • Communications with ATLAS:
    • The CTA team should be copied in exchanges with ATLAS members. Julien to ensure that the CTA team is copied in exchanges with ATLAS and other experiments.
  • Hardware delivery:
    • Julien expects hardware delivery in September. Having a sufficiently large buffer size is critical for deployment. Julien to talk to Bernd / procurement to obtain precise timelines.

Migration status:

  • ATLAS have discussed migration on their side and have concluded that their best approach for CTA migration is to replace the current CASTOR instance prefix with the CTA one inside Rucio. This means that data that is not in CASTOR will not be migrated inside Rucio and will need to be cleaned up in CTA.
  • Full test import (Michael): Target is to complete migration scripts and validate a full metadata test import in mid-June.
  • Namespace cleanup (Michael): atlas_user fileclass: 800K files still need to be split between Rucio and user data. Michael to follow up with Cédric S. and ensure this is completed.

 

FTS (Andrea, Eddie):

  • FTS multi-hop: Andrea has set up a test instance for multi-hop validation. ATLAS will come up with a corresponding plan.
  • Check for tape residency: Will be ready by end of June.
  • Check for free disk space: Unlikely to happen before October (and therefore after the migration). The upcoming tests will tell us more about how critical this functionality is (bearing in mind that disk space will increase significantly with the new hardware expected for September).

 

Collocation hints:

  • We presented our tape read efficiency investigations to ATLAS (cf link, slides 15ff). Our conclusion is that multiplexing enables substantial latency and performance gains.
  • We are happy to continue research in this area (spearheaded by Georgios), but as agreed with Alessandro, working on this research activity is orthogonal to the ongoing CTA deployment.
  • Action on Michael: Ensure that Georgios gets in touch with Luc to advance discussions on modelling collocation hints and assessing their usefulness.

 

Actions follow-up:

  • Repacking of disabled tapes: Currently, CTA does not honour the disabled flag when recalling from tapes. However, disabling a tape means that neither read nor write access shall be granted to users. The CTA behaviour therefore has to be changed, and a solution needs to be found for repack. Eric, Cédric and Vlado to discuss and agree on this.
  • CTA web page: We urgently need a web page for the CERN CTA instances with a short description and links to monitoring (action on Julien). 
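
A minimal sketch of the intended disabled-tape behaviour, assuming repack is handled as an explicit exemption. That exemption is only one possible outcome of the discussion between Eric, Cédric and Vlado; the names `Requester` and `may_mount` are hypothetical and do not correspond to CTA code.

```python
from enum import Enum


class Requester(Enum):
    USER = "user"      # ordinary read (recall) or write (archive) request
    REPACK = "repack"  # operator-driven repack of the tape


def may_mount(tape_disabled: bool, requester: Requester,
              allow_repack_on_disabled: bool = True) -> bool:
    """Decide whether a tape may be mounted for the given requester.

    Assumption: a disabled tape is closed to users for both reading and
    writing, while repack may be exempted (this is the point still open).
    """
    if not tape_disabled:
        return True
    return requester is Requester.REPACK and allow_repack_on_disabled


if __name__ == "__main__":
    for requester in Requester:
        print(requester.value, may_mount(True, requester))
```

Setting `allow_repack_on_disabled=False` models the stricter alternative in which a disabled tape cannot be mounted at all.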

Actions

Who | What | By when
German | Storage classes definition with Alessandro D.G. | 25/5
Eric | Agree with ATLAS on list of "activities" and configure via cta-admin. Deploy "activities" on ATLAS | 27/5
Eric | Deploy FIFO queueing on ATLAS | 3/6
Julien | CTA web site: add CERN instance description and links to monitoring (urgent) | 24/5
Michael | Complete (with Cédric S.) namespace split-up | 30/5
Vlado, Eric | Agree how to repack disabled tapes once disabling is honoured by CTA | 30/5
Eric | Contact ATLAS in order to agree on list of activities and exact details | 30/5
Eric | Specify exact URL parameters to use for setting the class for each file to be transferred | 27/5
Julien | Follow up with ATLAS on how deletions should take place. Check with CMS whether files can be removed for easing schema upgrade. | 27/5
Julien | Ensure CTA team is copied in exchanges with ATLAS and other experiments. | 24/5
Julien | Talk to Bernd / procurement to obtain precise timelines. | 30/5
Michael | Ensure that Georgios gets in touch with Luc to advance discussions on modelling collocation hints and assessing their usefulness. | 30/5
    • 14:00 14:40
      ATLAS next steps 40m

      Activities:
      - Final decision on how ATLAS should pass "activity" and how this is propagated via FTS -> EOS -> CTA (Eric, Andrea)
      - Initial "activity" list proposal by Alessandro: "grid", "t0" (to be confirmed with ATLAS DDM/T0 - Eric)

      Storage classes:
      - Agreed on O(10) storage classes with Alessandro (who will send us the names)
      - Final decision on how ATLAS should pass storage classes and how this is propagated via FTS -> EOS -> CTA (Eric, Andrea)

      Migration status (Michael)
      - Conclusion on NS split with Cédric S.

      ATLAS instance (Julien)
      - Setup status
      - ATLAS cleanup and data sending status

      Switchover checklist
      - September is approaching! Are we ready?

      Collocation hints:
      - How to approach the "R&D track"

    • 14:40 15:00
      Review of actions, AOB, items for next meeting 20m