System testing cloud services using
EOS + CTA development use-case
Julien Leduc, IT Storage group, CERN
Data Archiving at CERN
- Ad aeternum storage
- 7 tape libraries, 83 tape drives, 20k tapes
- Current use: 180 PB
- Current capacity: 0.6 EB
- Exponentially growing
Data Archiving at CERN Evolution
- EOS + tapes...
- EOS is CERN's strategic storage platform
- tape is the strategic long-term archive medium
- EOS + tapes = ♥
- Meet CTA: CERN Tape Archive
- Streamline data paths, software and infrastructure
- CTA is glued to the back of EOS
- EOS manages CTA tape files as replicas
- CTA contains a catalogue of all tape files
- CTA provides optimised, preemptive scheduling
CTA development timeline
- End 2016: First functional prototype release
- April 2017: First release for additional copy use cases
- 2018: Production-ready version
Easy migration path from CASTOR to EOS+CTA: only metadata needs to be migrated; the CASTOR tape format will be reused.
CTA + EOS developments
This involves tightly coupled development of both software projects in the initial phase, and extensive testing to catch regressions quickly.
CASTOR integration tests
- Easy situation:
- all components are within one git repository
- Puppet deploys development instances on VMs
- Limited external dependencies per instance: 1 database, 1 virtual tape library
CASTOR integration tests
- But several issues:
- deploying a developer instance from scratch takes a loooonnng time...
- code changes in CASTOR often require Puppet manifest changes
- real tape hardware tests are way further down the road in separate hostgroups, environments...
- which implies ad hoc developer tests...
CTA+EOS integration tests
- Complex situation:
- 2 distinct software projects
- More external dependencies per instance: 1 database, 1 virtual tape library, 1 objectstore
CTA+EOS integration tests
- How to fix everything?
- I am lazy and impatient
- no manual operation → CI
- make it fast
- Must allow similarly easy beta testing deployments for administrators/users (simple and bulletproof)
- How to test real tape hardware?
CTA CI
Implemented in CERN Gitlab instance
- Build software: CTA RPMs available as artifacts
- Build and publish a generic Docker image in gitlab registry
- Contains all required RPMs for instantiation (CTA artifacts, specific EOS version, specific XROOTD version)
- Run system tests in a custom Kubernetes cluster
Basic kubernetes concepts
kubernetes resources
System tests on dedicated kubernetes clusters
- One Puppet-deployed Kubernetes cluster per developer, on one VM
- Kubernetes resources per cluster:
- 1 Oracle database (+ unlimited SQLite accounts)
- 1 Ceph objectstore (+ unlimited local objectstores)
- 10 Virtual tape libraries: 2 tape drives, 10 tapes
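The inventory above can be pictured as a pool of consumable resources that each test claims exclusively. What follows is a minimal Python sketch of that idea only: the `Resource` class, the `type` label and the resource names are hypothetical, and the first-free-match rule is a simplification, not the Kubernetes binder or the CTA project's code.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Resource:
    """One consumable resource, e.g. a virtual tape library (hypothetical model)."""
    name: str
    labels: Dict[str, str]
    claimed_by: Optional[str] = None  # namespace of the test currently holding it


def build_pool() -> List[Resource]:
    """Inventory from the slide: 1 Oracle database, 1 Ceph objectstore, 10 virtual tape libraries."""
    pool = [
        Resource("oracle-db", {"type": "database"}),
        Resource("ceph-objectstore", {"type": "objectstore"}),
    ]
    pool += [Resource(f"vtl-{i:02d}", {"type": "library"}) for i in range(10)]
    return pool


def claim(pool: List[Resource], selector: Dict[str, str], namespace: str) -> Optional[Resource]:
    """Bind the first free resource whose labels contain every selector pair."""
    for res in pool:
        matches = all(res.labels.get(k) == v for k, v in selector.items())
        if res.claimed_by is None and matches:
            res.claimed_by = namespace
            return res
    return None  # nothing free: the test has to wait or fall back


def release(pool: List[Resource], namespace: str) -> None:
    """Free every resource held by a namespace, e.g. when the test is torn down."""
    for res in pool:
        if res.claimed_by == namespace:
            res.claimed_by = None
```

With this model, two tests running side by side each get their own virtual tape library, while a second claim on the single Oracle database returns `None` until the first test releases it.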
Instantiating a test
- Create a Kubernetes Namespace
- Instantiate all Services in the namespace
- Consumable resources are implemented as Persistent Volumes
- Issue a Persistent Volume Claim with selector
- Instantiate associated Configuration in the Namespace
- Instantiate all the Pods with their associated containers to implement all the services
- Wait for all the pods to be ready
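The first steps of this sequence can be sketched as plain Kubernetes manifests built in Python. The object kinds and field names (`Namespace`, `PersistentVolumeClaim`, `spec.selector.matchLabels`) are standard Kubernetes API fields; the claim name `claimlibrary`, the label `type: library`, the `1Mi` request and the helper names are assumptions made for illustration, not values taken from the CTA repository.

```python
import json
from typing import Dict, List


def namespace_manifest(name: str) -> Dict:
    """A Namespace isolating one test instance."""
    return {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": name}}


def library_claim_manifest(namespace: str, selector: Dict[str, str]) -> Dict:
    """A PersistentVolumeClaim that can only bind to volumes carrying the selector labels."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": "claimlibrary", "namespace": namespace},  # hypothetical name
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": "1Mi"}},  # placeholder size
            "selector": {"matchLabels": selector},
        },
    }


def manifests_for_test(namespace: str) -> List[Dict]:
    """Namespace first, then the claim on a tape library resource."""
    return [
        namespace_manifest(namespace),
        library_claim_manifest(namespace, {"type": "library"}),  # hypothetical label
    ]


if __name__ == "__main__":
    # Wrap the objects in a v1 List so they form a single JSON document.
    bundle = {"apiVersion": "v1", "kind": "List", "items": manifests_for_test("systest-1")}
    print(json.dumps(bundle, indent=2))
```

The printed JSON could, for example, be piped to `kubectl apply -f -`; Services, Configuration and Pods would be added to the same list in the order given above.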
Real tape drive tests
- Deploy Puppet manifest on real hardware
- Add physical tape library resources in hiera
- Increase timeouts for system tests
Voilà!
We can deploy the same kubernetes instance on real tape hardware and run exactly the same system tests.
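One way to run identical tests on virtual and physical libraries while increasing timeouts is to scale every wait by a single factor. Below is a minimal Python sketch under that assumption; the environment variable `SYSTEST_TIMEOUT_SCALE` and the helper names are hypothetical and not part of CTA.

```python
import os
import time
from typing import Callable


def timeout_scale() -> float:
    """Multiplier applied to every test timeout (hypothetical variable, default 1)."""
    return float(os.environ.get("SYSTEST_TIMEOUT_SCALE", "1"))


def wait_until(
    condition: Callable[[], bool],
    base_timeout: float,
    poll_interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll condition until it holds or the scaled timeout expires.

    Returns True if the condition became true in time, False otherwise.
    sleep and clock are injectable so the helper can be tested without waiting.
    """
    deadline = clock() + base_timeout * timeout_scale()
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval)
```

A test written as `wait_until(file_is_on_tape, base_timeout=60)` would then stay unchanged between environments (`file_is_on_tape` being whatever check the test defines); only the scale factor set on the real-hardware cluster differs.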
THE END
- Very powerful approach: addresses and federates all our use cases
- Fast, flexible, isolated and self-contained in the software repository
TO DO
- Evangelise
- Write and structure more system tests
- Bulletproof reproducibility for regression tests
- Evaluate possible production use ☺