HTCondor Workshop Autumn 2020

Name: HTCondor Workshop Autumn 2020
Start: 2020-09-21T14:00:00+02:00
End: 2020-09-25T18:30:00+02:00
Location: (teleconference only)

Europe/Paris timezone

Support

hepix-2020condorworkshop-support@hepix.org

HTCondor monitoring at ScotGrid Glasgow

24 Sept 2020, 17:40

20m

https://cern.zoom.us/j/97987309455

HTCondor user presentations Workshop session

Emanuele Simili (University of Glasgow)

Our Tier2 cluster (ScotGrid, Glasgow) uses HTCondor as its batch system, combined with ARC-CE as the front end for job submission and ARGUS for authentication and user mapping.
On top of this, we have built a central monitoring system based on Prometheus that collects and aggregates metrics and displays them on custom Grafana dashboards. In particular, we extract job information by regularly parsing the output of 'condor_status' on the condor_manager, scheduler, and worker nodes.
A collection of graphs gives a quick overview of cluster performance and helps identify emerging issues. Logs from all nodes and services are also collected on a central Loki server and retained over time.
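
As an illustration of the parsing step described above, here is a minimal sketch, not the ScotGrid implementation. It assumes slot data comes from `condor_status -autoformat State Activity` (one "State Activity" pair per line) and that the result is exposed in the Prometheus text format, for example through a file read by node_exporter's textfile collector. The metric name `condor_slots` and both function names are hypothetical.

```python
from collections import Counter


def count_slots(af_output: str) -> Counter:
    """Count slots per (State, Activity) pair.

    Expects text shaped like the output of
    `condor_status -autoformat State Activity` (an assumption).
    """
    counts = Counter()
    for line in af_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        counts[(fields[0], fields[1])] += 1
    return counts


def to_prometheus(counts: Counter, metric: str = "condor_slots") -> str:
    """Render the counts in the Prometheus text exposition format."""
    lines = [f"# TYPE {metric} gauge"]
    for (state, activity), n in sorted(counts.items()):
        lines.append(f'{metric}{{state="{state}",activity="{activity}"}} {n}')
    return "\n".join(lines) + "\n"


# Canned sample standing in for real condor_status output.
sample = "Claimed Busy\nClaimed Busy\nUnclaimed Idle\n"
print(to_prometheus(count_slots(sample)), end="")
```

In a real deployment the sample string would presumably be replaced by the captured output of the command, run periodically on each node, with the rendered text written where Prometheus can scrape it.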

Desired slot length	15
Speaker release	Yes

Emanuele Simili (University of Glasgow)

David Britton, Samuel Cadellin Skipsey, Gordon Stewart (University of Glasgow), Gareth Douglas Roy (University of Glasgow (GB))

HTCondor @ ScotGrid Glasgow Cluster Monitoring

Recording
