CTA deployment meeting

Europe/Zurich
600/R-001 (CERN)

Michael Davis (CERN)
    • 14:00 14:10
      ALICE Post-migration 10m
      • Namespace reconciliation: Michael provided Costin with a list of 2 million files in AliEn which are not in CTA.
      • ALICE probe status (#105 and #666).
      • Implicit prepare status (implemented, needs a CTA release).
      • Garbage collector status (#905), needs feedback from Andreas. GC will be needed mid-November.
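The namespace reconciliation above amounts to a set difference between two file listings. A minimal sketch, assuming both namespaces can be dumped as plain lists of paths; the function name and the example paths are hypothetical and not taken from the actual AliEn or CTA tooling:

```python
def missing_from_cta(alien_paths, cta_paths):
    """Return paths present in the AliEn listing but absent from the CTA listing."""
    cta = set(cta_paths)  # set lookup keeps the comparison linear in the listing size
    return sorted(p for p in set(alien_paths) if p not in cta)


# Hypothetical listings, for illustration only.
alien_listing = [
    "/alice/data/run1/f1.root",
    "/alice/data/run1/f2.root",
    "/alice/data/run2/f3.root",
]
cta_listing = ["/alice/data/run1/f1.root"]

missing = missing_from_cta(alien_listing, cta_listing)
print(missing)
```

For listings of a few million entries this fits comfortably in memory; larger dumps would probably be compared as sorted files instead.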
    • 14:10 14:20
      Getting CMS into Production 10m

      TO DO

      • Test EOS Backup (Elvin's daemon and user backup tools on EOS CMS) (#113)
      • Repack tapes in cms_user (following list in CASTOR /user directories split across multiple file classes). This is not a blocker if it is not finished prior to migration: we could remove the tapes needing repack from r_cms_user and migrate them after the repack.
      • Test archive monitoring (either FTS or homebrew using xrdfs query prepare)
      • OTGs: CASTOR and CTA

      Schedule

      • 28 Oct 2020: CTA recall test (not a scale test, recall some datasets from various eras of the experiment)
      • 9 Nov 2020: End-to-end functional "replay" tests at T0 (pre-MWGR tests, with CASTOR)
      • 18 Nov 2020: MWGR#4 with CASTOR
      • w/c 23 Nov 2020: Put CASTOR CMS into recall-only mode
      • 30 Nov 2020: Migrate to CTA
      • 07 Dec 2020: CMS in production
      • End Dec 2020? : CMS retires PhEDEx
    • 14:20 14:30
      Getting LHCb into Production 10m
      • See Putting EOSCTALHCB into Production (CodiMD)
      • Successful 200 TB test with 10 tape drives and one buffer server for 2.5 GB/s of constant archival throughput.
      • Successful transfer using Dirac to move a file from EOS to GridKa via HTTP (FTS monitoring)!
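As a rough cross-check of the archival test figures above, the per-drive rate and the time needed to write the full volume follow from simple arithmetic. The sketch assumes decimal units (1 TB = 10^12 bytes) and that the whole 200 TB was written at the quoted constant rate; the minutes state neither the duration nor the per-drive rate, so both are derived values:

```python
TOTAL_BYTES = 200 * 10**12       # 200 TB test volume (decimal units assumed)
THROUGHPUT_BPS = 2.5 * 10**9     # 2.5 GB/s aggregate archival throughput
DRIVES = 10                      # tape drives used in the test

# Average rate each drive must sustain, in MB/s.
per_drive_mb_s = THROUGHPUT_BPS / DRIVES / 10**6

# Time to archive the full volume at the aggregate rate, in hours.
duration_hours = TOTAL_BYTES / THROUGHPUT_BPS / 3600

print(f"per drive: {per_drive_mb_s:.0f} MB/s")
print(f"duration:  {duration_hours:.1f} h")
```

Under these assumptions each drive averages about 250 MB/s and the test takes a little over 22 hours.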

      TO DO/Schedule

      • w/c 2 Nov : 200 TB recall test
      • Test HTTP TPC with CTA (FTS multi-hop with one "hop" as QoS change, one hop as the transfer)
      • DAQ test should be possible in November (tentatively 9 Nov 2020)
    • 14:30 14:40
      Getting PUBLIC into Production 10m

      NA62

      • NA62 repack is proceeding
      • NA62 offline integration: Vova has created the VM with AFS+EOS to allow the tests to proceed
      • NA62 has been migrated onto EOSCTA PPS (#72)
      • w/c 9 Nov : NA62 recall tests (30 TB) on EOSCTA PPS

      COMPASS

      • Metadata check (checksum bulk query): they will do it the slow way for now
      • Wed 4 Nov 2020 : DAQ tests on CTA.
      • EOSCTA instance for these tests, see #69 COMPASS tests on EOSCTAPUBLIC PPS

      AMS

      • Meeting with AMS took place 29/09/2020
      • They have an endpoint
      • Need them to test it

      NA61/SHINE

      • Trying to schedule a meeting

      Identifying hot/cold data

      • Vova is looking at CASTOR logfiles to try and build a profile of when data was last accessed and by whom.
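The access profile described above could be built along the following lines. This is only a sketch: the CASTOR log format is not given in these minutes, so the input is assumed to have already been parsed into hypothetical `(path, user, ISO timestamp)` records, and the function names are illustrative:

```python
from datetime import datetime


def last_access_profile(records):
    """Map each path to (time of most recent access, user who made that access)."""
    profile = {}
    for path, user, timestamp in records:
        when = datetime.fromisoformat(timestamp)
        if path not in profile or when > profile[path][0]:
            profile[path] = (when, user)
    return profile


def cold_files(profile, cutoff):
    """Return paths whose most recent access is older than the cutoff."""
    return sorted(p for p, (when, _) in profile.items() if when < cutoff)


# Hypothetical, already-parsed access records.
records = [
    ("/castor/exp/a.dat", "alice", "2015-03-01T10:00:00"),
    ("/castor/exp/a.dat", "bob", "2020-06-15T08:30:00"),
    ("/castor/exp/b.dat", "carol", "2012-11-20T17:45:00"),
]

profile = last_access_profile(records)
cold = cold_files(profile, cutoff=datetime(2018, 1, 1))
print(cold)
```

The cutoff separating hot from cold data is a policy choice; the date used here is arbitrary.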
    • 14:40 14:50
      Repack 10m

      Repack of public_user

      • Many experiments have data under their part of the namespace in the public_user fileclass. This prevents us from migrating small experiments one-by-one.

      • public_user needs to be split into "real" user files, small/medium experiment files and legacy experiment data

      • Need to create new tapepools with archive routes etc.
      • Around 1,500 tapes need to be repacked

      CTA Repack: status update

      1. Deploy and test: (a) Injection of recovered files when repacking broken tapes, (b) Change mount policy on-the-fly.
      2. Tape Server: Add configuration options to give greater control over the highly-distributed maintenance process.
      3. Remove "superseded" files in favour of the recycle bin.
      4. Investigate if we can run a separate repack instance sharing the same catalogue.
    • 14:50 14:55
      AOB 5m