What would it mean to go to production during the Covid-19 lockdown?
- Steve suggested that we could create a tool to import tapes written in CTA into CASTOR. This would give us the option of a full rollback. Not too hard to do and it reduces the risk of proceeding with the migration.
- Spectra library delivery is delayed, but this is not a blocker. We can load media into IBM Lib1.
- For DCS operators, there is not much change to their tools. They would need to learn cta-admin. Training could be done using videoconferencing/collaboration tools.
- Commissioning/acceptance testing with ATLAS SFO was eventually completed, but took longer than expected due to communication difficulties and the limited availability of experts. There were no problems on the CTA side, but the rate of archive requests from SFO was much lower than we had hoped for. Julien wrote a detailed report.
- Previously we had 2 CTA instances; now we have 3 (Production, PPS and Migration). PPS will be used for write tests, and Migration will be used for migration tests for each experiment prior to migrating them to Production.
- The main thing is that the CTA instance configuration must be stable so we can share the information. We are converging on the final production configuration.
- Administering the CTA tape side is easy; administering the EOS buffer is a different story. Julien raised the idea that operating the CTA EOS instance could be shared with the experiment operations team. After discussion, we preferred the idea that a couple of people in the EOS disk operations team could train on CTA and give us support. In return, Julien could help their team with issues such as hardware specs and procurement.
Conclusions
- There are no blocking issues to stop us going to production. Obviously, in the current teleworking situation, communication is more difficult and everything takes a little longer. We can proceed, but on a more conservative schedule.
- We need ATLAS to confirm that they are satisfied with the results of the SFO tests.
- We will proceed with the rest of the commissioning tests we had planned. These are less complicated in that we don't depend on anyone else.
- It is important that everyone knows how the instances are configured. Once Julien has the final production instance configuration he will document it.
- Approach the EOS disk operations team to see if a couple of people could be trained in CTA operations.
- The "import CTA tapes to CASTOR" tool seems like a good idea that would let us all sleep easier.
Communication
- Zoom has been more reliable than Vidyo. However, Julien had a poor connection and was not able to dial in to the meeting using the French phone number.
- Is it possible to get a CERN phone for Julien?
ATLAS Repack
- As the schedule for moving ATLAS to production will be put back by at least a few weeks, Vlado will push ahead with repacking ATLAS on CASTOR as quickly as possible. The goal is to repack all of ATLAS within one month.
CTA Testing Status
- The ATLAS SFO test was completed. We need to verify with ATLAS that they are now satisfied and that we can go to production.
- Julien will continue with the remaining tests: simultaneous write/recall/delete; dual-copy tape pool; Tier-1 export; and "What happens when the buffer is full".
- In parallel, Julien will proceed with setting up CMS testing.
- GC is ready for ALICE. We need to fix issue #631 before we can start testing.
- No data has been sent to the PUBLIC test instance from nTOF or NA62. This is not on our critical path so we will not push them.