CHEP 2018 Conference, Sofia, Bulgaria

Name: CHEP 2018 Conference, Sofia, Bulgaria
Start: 2018-07-09T08:00:00+03:00
End: 2018-07-13T13:00:00+03:00
Location: Sofia, Bulgaria

9–13 Jul 2018

Sofia, Bulgaria

Quasi-online accounting and monitoring system for distributed clouds

11 Jul 2018, 11:30

15m

Hall 10 (National Palace of Culture)

Presentation, Track 7 (T7): Clouds, virtualization and containers

Speaker: Randy Sobie (University of Victoria (CA))

The HEP group at the University of Victoria operates a distributed cloud computing system for the ATLAS and Belle II experiments. The system uses private and commercial clouds in North America and Europe that run OpenStack, OpenNebula or commercial cloud software. It is critical that we record accounting information to give credit to cloud owners and to verify our use of commercial resources. Specifically, we want to record the number of CPU-hours consumed by each virtual machine (VM).
To obtain the required information, we run a fast benchmark at boot time to estimate the HEPSpec06 rating of the node. Our first system wrote the benchmark result and CPU times (obtained from /proc/stat) to a log file every 15 minutes, and the last entry for a VM was used to determine its CPU-hours. This system has worked well, but the information about a VM is only available after the VM is deleted and, in some cases, VMs can exist for many weeks. Hence, the final accounting information can be delayed by as much as the lifetime of the VM.
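As an illustration of why the last log entry is enough, the sketch below converts the aggregate "cpu" line of /proc/stat into CPU-hours. The counters on that line are cumulative since boot, so a single late sample carries the total for the VM. This is a minimal sketch and not the system described in the talk: the function names, the choice of user + nice + system as "busy" time, the fixed tick rate, and the per-core benchmark score are all assumptions made for the example.

```python
# Sketch only. The field order follows the Linux proc(5) description of the
# aggregate "cpu" line; everything else (names, busy-time definition, the
# benchmark score) is hypothetical.
USER_HZ = 100  # common tick rate; on a real node use os.sysconf("SC_CLK_TCK")

FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the aggregate CPU counters (in clock ticks) from /proc/stat text."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return dict(zip(FIELDS, (int(v) for v in parts[1:9])))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_hours(ticks, hz=USER_HZ):
    """CPU-hours since boot, counting user + nice + system as busy (assumption)."""
    busy = ticks["user"] + ticks["nice"] + ticks["system"]
    return busy / hz / 3600.0


def benchmark_weighted_hours(ticks, score_per_core, hz=USER_HZ):
    """Scale CPU-hours by a benchmark score, e.g. an estimated HEPSpec06 value."""
    return cpu_hours(ticks, hz) * score_per_core


# Example with a made-up log entry: 720000 busy ticks at 100 Hz is 2 CPU-hours.
sample = "cpu  360000 0 360000 7200000 1000 0 0 0\ncpu0 360000 0 360000 7200000 1000 0 0 0\n"
ticks = parse_cpu_line(sample)
hours = cpu_hours(ticks)
```

On a running VM the input would come from reading /proc/stat directly; the sample string here only stands in for one logged entry.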
We have introduced a new system that continuously collects the information and uploads it into an Elasticsearch database. The information is processed and published as soon as it is available, as tables and plots in Kibana and ROOT. We have found the system to be useful beyond gathering accounting information: it can also be used for monitoring and diagnostic purposes. For example, we can use it to detect whether payload jobs are stuck in a waiting state for external information.
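One way such a diagnostic could work is sketched below, under stated assumptions: successive cumulative CPU samples from a VM are compared, and the VM is flagged when its busy fraction stays below a threshold for several consecutive intervals. The abstract does not say how the detection is actually done, so the threshold, the interval count, the function names, and the index name "vm-accounting" are all hypothetical. The last helper formats documents as newline-delimited JSON in the shape expected by the Elasticsearch bulk API (an action line followed by a source line); it only builds the request body and does not contact a server.

```python
import json


def busy_fraction(prev, curr):
    """Fraction of elapsed ticks spent busy between two cumulative samples."""
    busy = sum(curr[k] - prev[k] for k in ("user", "nice", "system"))
    total = sum(curr[k] - prev[k] for k in curr)
    return busy / total if total > 0 else 0.0


def looks_stuck(samples, threshold=0.05, min_intervals=3):
    """True if each of the last `min_intervals` intervals is below `threshold`.

    Both defaults are illustrative guesses, not values from the talk.
    """
    if len(samples) < min_intervals + 1:
        return False
    recent = samples[-(min_intervals + 1):]
    return all(busy_fraction(a, b) < threshold for a, b in zip(recent, recent[1:]))


def to_bulk_body(index, docs):
    """Build a bulk-API request body: one action line plus one source line per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


# Made-up samples: a VM that is almost entirely idle or waiting on I/O.
waiting_vm = [
    {"user": 10 * i, "nice": 0, "system": 0, "idle": 500 * i, "iowait": 490 * i}
    for i in range(5)
]
flagged = looks_stuck(waiting_vm)
body = to_bulk_body("vm-accounting", [{"vm": "example-vm", "stuck": flagged}])
```

A low busy fraction is only a hint that a payload is waiting; a real deployment would likely combine it with job-level information before raising an alert.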
We will report on the design and performance of the system, and show how it provides important accounting and monitoring information on a large distributed system.

Authors: Randy Sobie (University of Victoria (CA)), Rolf Seuster (University of Victoria (CA)), Marcus Ebert (University of Victoria), Colson Driemel (University of Victoria), Colin Roy Leavett-Brown (University of Victoria (CA)), Kevin Casteels (University of Victoria (CA)), Michael Paterson (U), Frank Berghaus (University of Victoria (CA))

Presentation materials: Sobie-CHEP.pdf
