CTA deployment meeting

Europe/Zurich
31/S-027 (CERN)

Michael Davis (CERN)
    • 16:00 16:10
      ATLAS recall exercise 10m
      • The 2017 and 2018 data have been recalled.
      • Issues that arose during the test:
        • no blocking errors, but the system still needs babysitting to understand and resolve numerous problems
        • gfal bug created a lot of noise (now fixed)
        • root cause of some errors was lost in the noise
        • instrumentation will be improved for the 2019 recalls. We are currently benefiting from the pause between the processing of successive runs (a luxury we won't have in production)
        • some diagnostic and devops tools still missing (e.g. "cta-admin showqueues" does not show popped jobs)
      • We are diagnosing problems and providing help to the rest of the group (EOS and FTS teams). In many cases we are reading the source code and contributing the fix.
    • 16:10 16:20
      Putting EOSCTAATLAS into production 10m

      Milestones:

      • "CTA Release v1.1": 31 January
      • Complete recall test. CASTOR will be restored as the ATLAS endpoint to allow pending calibration data to be written to tape.
      • Write stress test: 24 February
        • do we need multi-hop for this? To be checked with Cédric.
      • Online integration test: 2 March
      • One-week "cool-off" period with no writes to CASTOR, to ensure all files have made it to tape and to check that no further data is being written
      • ATLAS goes into production and CASTOR files are migrated: date provisionally 16 March (check this does not clash with ATLAS TDAQ milestone tests)
    • 16:20 16:30
      Communication 10m
      • Logo
      • Website
      • Upcoming talks:
        • EOS workshop: next week
        • ATLAS software week: 10-14 Feb
        • ITUM: 17 Feb
        • IT/ATLAS coordination meeting
    • 16:30 16:40
      Plans and staffing needs 10m
      • Responsibilities and knowledge sharing:
        • CTA software: Frontend / Catalogue / Tape Server / Objectstore
        • Devops: hardware / systems integration / monitoring
      • ALICE reprocessing: contention in the disk cache
      • Postgres
      • Staff resources
    • 16:40 16:50
      AOB 10m