
Cat-A Interview - "Core Computing: CRAB Developer and Support Operator"

Job Description

CRAB is the tool used by CMS physicists to run custom data analysis applications on the distributed computing Grid. It is based on a client-server architecture, with very thin clients communicating with a central server over a RESTful HTTP interface. CRAB turns a high-level description of an analysis workflow into a set of grid jobs, runs them with automatic error recovery, and retrieves, moves, and catalogues their output on the user's behalf. The CRAB service acts as the interface between the end users and the CMS global Submission Infrastructure, which is based on a world-wide HTCondor pool. CRAB is a mature product, and the current emphasis is on higher automation and on long-term stability and maintenance. As part of that process, the CRAB components are being migrated from manually operated servers to a set of containers dynamically orchestrated via Docker and Kubernetes.
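
As a purely illustrative sketch of this thin-client/REST pattern (the endpoint, payload fields, and task parameters below are hypothetical and are not the actual CRAB client API), a minimal submission call in Python could look like:

    import json
    import urllib.request

    # Hypothetical high-level task description that the server
    # would expand into a set of grid jobs.
    task = {
        "workflow": "my_analysis",
        "dataset": "/SomeDataset/Run-Era/MINIAOD",
        "splitting": "FileBased",
        "units_per_job": 10,
    }

    # Hypothetical endpoint; the real CRAB REST interface differs.
    request = urllib.request.Request(
        "https://crab-server.example.cern.ch/api/task",
        data=json.dumps(task).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

    with urllib.request.urlopen(request) as response:
        print(json.load(response))  # e.g. a task identifier to track later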

The operator will receive training in the relevant tools and procedures, work in close collaboration with the CMS CRAB experts, and be able to reach out as needed to other experts inside CMS and/or CERN IT. This job has a development side and an operations side. On the development side, the operator will be asked to take responsibility for maintenance of the current code, addressing known defects and contributing to code cleanup, refactoring, and simplification for easier long-term maintenance. The operator will take care of creating new, validated releases in GitHub and ensure that they are properly built for deployment. On the operations side, the operator will be responsible for correct deployment and operation of the services and will collaborate in troubleshooting and solving problems. CRAB reports extensive information to the CERN data analytics cluster, which is currently exploited for monitoring dashboards. Depending on the operator's skills and interests, that data can also be accessed in Elasticsearch or HDFS via Kibana, Grafana, or Spark to build more refined analytics and help guide operations and future directions.
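
As a hedged illustration of the kind of analytics this enables (the HDFS path and field names below are made up and do not reflect the actual layout of the CRAB monitoring data), a simple Spark aggregation over job reports might look like:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("crab-monitoring-sketch").getOrCreate()

    # Hypothetical HDFS location of CRAB job reports.
    jobs = spark.read.json("hdfs:///project/monitoring/crab/jobs/")

    # Example aggregation: how many jobs ended with each exit code per site.
    (jobs.groupBy("Site", "ExitCode")
         .count()
         .orderBy("count", ascending=False)
         .show(20, truncate=False))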

The operator will gain a good working knowledge of CRAB as a tool and become familiar with the CRAB internal architecture and its code base. It will be possible, and useful, to contribute improvements and patches to the CRAB code. The operator will also become familiar with the underlying HTCondor infrastructure and take part in the challenge of operating the largest worldwide HTCondor pool in existence, becoming an expert in large-scale, high-throughput job execution on Grid, Cloud, and other kinds of resources.
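
As a rough sketch of interacting with such a pool, assuming the htcondor Python bindings are available and using a hypothetical collector address (the real CMS global pool configuration differs), one could count running jobs per schedd like this:

    import htcondor

    # Hypothetical collector address; not the real CMS global pool.
    collector = htcondor.Collector("collector.example.cern.ch")

    # Count running jobs on each schedd registered in the pool.
    for ad in collector.locateAll(htcondor.DaemonTypes.Schedd):
        schedd = htcondor.Schedd(ad)
        running = schedd.query(
            constraint="JobStatus == 2",  # 2 = running
            projection=["ClusterId", "ProcId", "Owner"],
        )
        print(ad["Name"], len(running))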
