Fallout from ATLAS Data Carousel

  • GC fix (Steve):
    • The fix for the FST (aka "slow") GC is progressing. The "slow" GC removes files exceeding a configurable age, in no particular order. (The MGM "fast" GC is LRU, and the migration GC evicts files immediately after migration.)
    • Steve has discussed with Elvin/Luca how to identify files that have been completely read (i.e. opened for reading, then closed). This would require substantial work (the detection needs to be done in the MGM, but the cleanup has to be issued on the FST).
  • Retrieve backstop:
    • The proposal was discussed; Eric created a GitLab ticket with full details (#533, link)
  • FTS cache/buffer removal:
    • Requires an XROOT "un-prepare" method to do buffer cleanup (equivalent to "eos stagerrm").
    • Once "un-prepare" is available, the FTS development latency is estimated at 1-2 weeks.
    • In principle, FTS would send this to all storage back-ends, on the assumption that it is ignored by non-CTA storage systems. This needs careful examination by FTS experts.
      • Note that providing storage-specific back-ends is becoming increasingly relevant for FTS, as VOs such as ATLAS would like to pass metadata (storage class information, retrieve activities, collocation hints) via FTS to the back-ends.
    • How does FTS process file requests? While it can receive 100K transfer requests at once, it processes them in order, in chunks of 5000 files. What happens with files that are pending due to e.g. a problematic tape or library? A timeout (900s) kicks in and the files are retried at a later stage.
  • Others:
    • The FTS production instance was used for recalls (instead of the -dev one), which caused the number of streams to drop after failures. Julien will check with the FTS team to ensure the right FTS version is used for CTA archival and retrieval activities.
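
The difference between the age-based "slow" GC and the LRU "fast" GC noted under "GC fix" can be illustrated with a minimal sketch. This is not EOS code: the CachedFile record and both function names are hypothetical, and "age" is assumed here to be measured from the time the file landed on disk, which the notes do not specify.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    """Hypothetical record for a file held in the disk buffer."""
    path: str
    created: float      # assumed: time the file landed on disk (seconds)
    last_access: float  # time of the most recent read (seconds)
    size: int           # bytes


def slow_gc_candidates(files, now, max_age):
    """Age-based policy: every file older than max_age, with no ordering applied."""
    return [f for f in files if now - f.created > max_age]


def fast_gc_candidates(files, bytes_to_free):
    """LRU policy: least recently accessed files first, until enough space is freed."""
    selected, freed = [], 0
    for f in sorted(files, key=lambda f: f.last_access):
        if freed >= bytes_to_free:
            break
        selected.append(f)
        freed += f.size
    return selected


files = [
    CachedFile("/a", created=0.0, last_access=90.0, size=10),
    CachedFile("/b", created=50.0, last_access=60.0, size=10),
    CachedFile("/c", created=95.0, last_access=96.0, size=10),
]
aged = slow_gc_candidates(files, now=100.0, max_age=30.0)
lru = fast_gc_candidates(files, bytes_to_free=15)
```

In this toy input the age-based policy selects /a and /b regardless of how recently they were read, while the LRU policy picks /b before /a because /b was accessed longer ago.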

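The request handling described under "FTS cache/buffer removal" (in-order processing in chunks of 5000 files, a 900s timeout, retry at a later stage) can be sketched as follows. This is only one reading of the notes, not the actual FTS scheduler: process_in_chunks, try_stage and max_rounds are hypothetical names, and the timeout is passed to the callable rather than actually waited on.

```python
from collections import deque

CHUNK_SIZE = 5000   # files per chunk, as quoted in the notes
TIMEOUT_S = 900     # per-file timeout in seconds, as quoted in the notes


def process_in_chunks(requests, try_stage, chunk_size=CHUNK_SIZE, max_rounds=3):
    """Process requests in submission order, chunk by chunk.

    try_stage(request, timeout) is a hypothetical callable that returns True
    when the file was staged within the timeout. Files that time out are
    queued and retried in a later round.
    """
    pending = deque(requests)
    done = []
    for _ in range(max_rounds):
        if not pending:
            break
        retry = deque()
        while pending:
            n = min(chunk_size, len(pending))
            chunk = [pending.popleft() for _ in range(n)]
            for req in chunk:
                if try_stage(req, TIMEOUT_S):
                    done.append(req)
                else:
                    retry.append(req)
        pending = retry
    return done, list(pending)


# Toy run: files on a "bad" tape only succeed on the second attempt.
attempts = {}


def fake_stage(req, timeout):
    attempts[req] = attempts.get(req, 0) + 1
    return not req.startswith("bad") or attempts[req] > 1


requests = [f"ok{i}" for i in range(7)] + ["bad0", "bad1", "ok7"]
done, failed = process_in_chunks(requests, fake_stage, chunk_size=4)
```

In the toy run the files behind the problematic tape do not block the rest of the queue: they time out once, are moved to the retry queue, and complete in the second round.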

Action list:

Who | What | By when
Eric | Agree with ATLAS on the list of "activities" and configure via cta-admin. Deploy "activities" on ATLAS. | 27/5
Michael | Complete (with Cédric S.) the namespace split-up. | 30/5
Cédric | Implement repacking, taking into account disabled tapes and drive dedications. | 30/5
Julien | Ensure the CTA team is copied in exchanges with ATLAS and other experiments. | 24/5
Julien | Talk to procurement and network people (to ensure all network infrastructure is in place when the nodes arrive). | 30/5
Michael | Ensure that Georgios gets in touch with Luc to advance discussions on modelling collocation hints and assessing their usefulness. | 30/5
Julien/Andrea | Explicit stager_rm follow-up. | 13/6
Andrea | Agree on the Rucio->FTS metadata format for collocation hints and storage classes. | 13/6
Eric | Propose and discuss with the FTS team a format for receiving collocation hints (in addition to storage classes and activities) from FTS. | 13/6
Julien | Identify the right hardware to run the migration. | 13/6


    • 10:30 - 10:50
      Fallout from ATLAS Data Carousel 20m

      Status and timelines for:
      - GC fix
      - CTA retrieve backstop
      - FTS cache removal
      - Next steps from ATLAS

    • 10:50 - 11:10
      Review of actions, AOB, items for next meeting 20m