Jan van ELDIK (CERN)
This paper presents work, both completed and planned, to streamline the deployment, operation and re-tasking of Castor2 instances. We present a summary of what has recently been done to reduce the human intervention needed to bring systems into operation, including the automation of Grid host certificate requests and deployment in conjunction with the CERN Trusted CA, and automated configuration using Quattor. We provide an overview of the software developed for monitoring operations so that various types of problems are quickly identified and remedied. Many of these tasks have been automated in a portable manner so that they can be used by other sites running Castor. To aid in taking diskservers out of production, whether for hardware interventions or to re-task a machine to another instance, we present the development of a program that takes machines out of production while ensuring that their data is reliably replicated.
Castor, castorReconcile, disknodeShutdown, operations management