CTA deployment meeting

Europe/Zurich
600/R-001 (CERN)

Michael Davis (CERN)

Why we may need RAIN on SSDs

  1. Handling files bigger than a single SSD
  2. We avoid having to use the converter to change file layouts (though we still have to use it to change spaces)
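
As a rough illustration of point 1, a back-of-the-envelope sketch (all numbers below are made up for the example, not our configuration): with an N-stripe RAIN layout, each filesystem only has to hold roughly 1/N of the file plus parity overhead, so a file larger than any single SSD can still be stored.

```python
# Illustrative only: how a RAIN layout spreads one large file over several SSDs.
# All numbers are assumptions for the sake of the example, not our real setup.

file_size_tb = 12      # hypothetical file larger than any single SSD
ssd_size_tb = 4        # hypothetical SSD capacity
data_stripes = 8       # hypothetical number of RAIN data stripes
parity_stripes = 2     # hypothetical number of RAIN parity stripes

# Each data stripe holds ~1/data_stripes of the file; parity adds overhead
# but is also spread over separate filesystems.
per_stripe_tb = file_size_tb / data_stripes
total_stored_tb = file_size_tb * (data_stripes + parity_stripes) / data_stripes

print(f"per-SSD footprint: {per_stripe_tb:.2f} TB (fits on a {ssd_size_tb} TB SSD)")
print(f"total space used across the group: {total_stored_tb:.2f} TB")
```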

Debrief on the ALICE CTA+RAIN tests

RAIN layout

Most of the problems we experienced were not specific to RAIN; they were caused by using the EOS converter. One RAIN-specific problem: when recalling from tape to RAIN, ls -y gives d10::t1, indicating one file copy with 10 stripes, not 10 replicas of the file.
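
For reference, a minimal sketch of how that tag can be read (this only interprets the d<stripes>::t<tapecopies> pattern quoted above; the exact output format of ls -y is not reproduced here):

```python
import re

def parse_layout_tag(tag: str) -> dict:
    """Interpret a 'd<N>::t<M>' tag as described in the note above:
    N = number of disk stripes of a single file copy (not N replicas),
    M = number of tape copies. Purely illustrative parsing."""
    m = re.fullmatch(r"d(\d+)::t(\d+)", tag)
    if m is None:
        raise ValueError(f"unrecognised layout tag: {tag!r}")
    return {"disk_stripes": int(m.group(1)), "tape_copies": int(m.group(2))}

print(parse_layout_tag("d10::t1"))  # {'disk_stripes': 10, 'tape_copies': 1}
```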

What are the use cases for conversion?

The EOS devs did not think we would use the converter. But there are several cases where we do need to use it:

  1. changing the disk layout (single to RAIN)
  2. changing the space (SSD to spinner)
  3. group rebalancing (changing the number of groups)

See also EOS converters and file identifiers on the EOSCTA docs site.
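
Purely as an illustration of when a conversion would be triggered, a hedged sketch (the placement model and names below are hypothetical, not EOS data structures):

```python
from dataclasses import dataclass

@dataclass
class Placement:
    layout: str   # e.g. "replica" or "rain"
    space: str    # e.g. "ssd" or "spinner"
    group: str    # scheduling group, e.g. "default.0"

def conversions_needed(current: Placement, target: Placement) -> list:
    """Hypothetical helper listing which of the use cases above applies
    when moving a file from its current placement to a target one."""
    reasons = []
    if current.layout != target.layout:
        reasons.append("layout change (e.g. single replica -> RAIN)")
    if current.space != target.space:
        reasons.append("space change (e.g. SSD -> spinner)")
    if current.group != target.group:
        reasons.append("group rebalancing")
    return reasons

print(conversions_needed(Placement("replica", "ssd", "default.0"),
                         Placement("rain", "spinner", "default.7")))
```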

Note: some new EOS developments and recent discussions with the EOS team mean that we will be able to further limit the cases where we need to use the converter. To be reviewed.

Problems we need to investigate

  • When we created a layout with 2 replicas, this created 2 tape copies. We need to investigate what happens when we archive with RAIN: how many tape copies are created?
  • Space policy: there are several ways to specify which space a file will land in: the default space; a space specified in the URL; a per-directory space specified by sys.forced.space (see the sketch after this list).
  • Performance issues: we have observed a long latency when closing a file after writing. Was this caused by bad or mismatched disk servers, or by something else?
  • We need a solution to the problem of copying files from tape to RAIN. Do we need SSDs? If so, do we need to use the converter? (Julien would like to isolate the performance of the tape drives from the performance of the disk system.)
  • Querying of free space: do filesystem statistics work as we expect on a RAIN layout?
  • Will Mihai’s rewrite of the converter code still change the disk file ID? (Answer: the converter code was refactored, but it still changes the file ID.)
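
To make the space-policy question concrete, a minimal sketch of one possible resolution order (the precedence below, per-directory attribute over URL over default, is an assumption still to be verified, not confirmed EOS behaviour):

```python
from typing import Optional

def resolve_space(default_space: str,
                  url_space: Optional[str] = None,
                  forced_space: Optional[str] = None) -> str:
    """Hypothetical resolution of the space a new file lands in.
    Assumed precedence (to be verified against EOS behaviour):
    per-directory sys.forced.space > space given in the URL > default space."""
    if forced_space:      # sys.forced.space set on the parent directory
        return forced_space
    if url_space:         # space passed as part of the write URL
        return url_space
    return default_space  # instance default

# Example: a directory with sys.forced.space=ssd would win over everything else.
print(resolve_space("default", url_space="spinner", forced_space="ssd"))  # ssd
```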

Problems we know how to solve

Conversion changes disk file IDs, but we rely on disk file IDs to get file metadata from the EOS namespace. Two possible solutions:

  • EOS keeps an index mapping archive file IDs to disk file IDs, so we can forget about the disk file ID.
  • EOS can change the file IDs and send us an event so we can update our catalogue with the new disk file ID.

The EOS devs' preference was solution #2. However, even if this is a synchronous event, we could still get out of sync if the operation fails after we update our DB but before the sync message gets back to the MGM. Such cases should be fixed by a retry; in some rare cases we may have to search back through the logs to get the correct disk file ID.
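
A minimal sketch of the failure window and retry described above (the function, event and interface names are hypothetical, not the actual CTA or EOS interfaces, and this is only one reading of where the retry sits):

```python
import logging

def handle_fileid_change(event, catalogue, mgm, max_retries=3):
    """Hypothetical handler for an EOS 'disk file ID changed' event
    (solution #2 above). Names and interfaces are illustrative only.

    Failure mode from the notes: we update our catalogue, but the sync
    message never reaches the MGM, so the two sides disagree. A retry
    should cover most such cases; the rare remainder would need a manual
    search back through the logs for the correct disk file ID."""
    catalogue.update_disk_file_id(event.archive_file_id, event.new_disk_file_id)
    for attempt in range(max_retries):
        try:
            mgm.acknowledge(event)   # tell the MGM we are in sync
            return
        except ConnectionError:
            logging.warning("ack failed (attempt %d), retrying", attempt + 1)
    # Out of retries: flag for manual reconciliation from the logs.
    logging.error("catalogue updated but MGM not acknowledged for archive id %s",
                  event.archive_file_id)
```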

Problems we need to find a solution to

  • Conversion fires the DELETE workflow. (FIXED)
  • When a new file is created by the converter, we lose the tape filesystem tag (65535), so prepare/evict does not work. (FIXED)
  • When will Mihai's converter work be merged into the master branch? (Currently in testing; it will be merged shortly.)
  • When recalling onto RAIN, we get d10::t1.
  • Conversion is configured on the destination space (along with the number of threads), and the MGM has to be restarted to process the conversion jobs.
    We are not sure where the converter has to be switched on (on the target space?).
  • How do we get the list of failed conversion jobs (without having to parse the converter log file)?

What is the roadmap for EOS converter in the next few weeks?

There are minutes attached to this event.
    • 14:00–14:10
      EOSCTA ALICE recall campaign 10m

      We need to revise our plan in view of the following developments:

      • Latchezar says the data staging plan we discussed is no longer valid: "do not stage the Pb-Pb data, as we will need other pieces of p-p data staged first... not all will fit on the CTA buffer"
      • This means that we no longer have a "stage once and leave ALICE to it" scenario: the Pb-Pb data will need to be staged in later. Will ALICE take care of this themselves?
      • There is another set of data which needs to be staged on the CASTOR stager. Is this also RAW data, and does it overlap with the datasets above?
    • 14:10–14:30
      Debrief on what we have learned in the last 2 weeks 20m

      Brainstorming note on CodiMD

      • List EOS behaviours/new features which break CTA for (a) archival use cases (single replica/RAIN), (b) retrieval use cases (single replica/RAIN), and (c) other undesired behaviour (e.g. triggering the DELETE workflow when a file is converted!)
      • Categorize by problems we know the solution for and problems we need to find a solution for.
      • Shortlist problems to go to next week's X-Section meeting
      • Strategy to address wider issue of communication within the group.
    • 14:30–14:35
      AOB 5m
      • Paul is ready to begin his first set of RAO tests as soon as a drive can be made available.