CHEP 2016 Conference, San Francisco, October 8-14, 2016

Name: CHEP 2016 Conference, San Francisco, October 8-14, 2016
Start: 2016-10-10T08:00:00-07:00
End: 2016-10-14T18:00:00-07:00
Location: San Francisco Marriott Marquis

Real-time complex event processing for cloud resources

13 Oct 2016, 15:30

1h 15m

Poster Track 2: Offline Computing Posters B / Break

Speaker: Cristovao Cordeiro (CERN)

The ongoing integration of clouds into the WLCG raises the need for detailed health and performance monitoring of the virtual resources, in order to prevent degraded service and interruptions caused by undetected failures. At scale, the diversity of existing monitoring tools can lead to a metric overflow: operators must manually collect and correlate data from several monitoring tools and frameworks, which leaves tens of different metrics per virtual machine to be interpreted and analysed continuously.

In this paper we present a standalone, ESPER-based application that processes complex monitoring events coming from various sources and automatically interprets the data in order to issue alarms on the status of the resources, without interfering with the resources themselves or with the data sources. We describe how this application has been used with both commercial and non-commercial cloud activities, allowing operators to be alerted quickly and to react to VMs and clusters running with low CPU load and low network traffic, among other anomalies. The reaction is either to recycle the misbehaving VMs or to fix the submission of the LHC experiments' workflows. Finally, we present the pattern analysis mechanisms in use, as well as the surrounding Elastic and REST API interfaces where the alarms are collected and served to users.
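
The abstract does not show the application's rules, so the following is only a minimal Python sketch of the kind of check it describes: flagging a VM whose CPU load and network traffic both stay low over a sliding time window. It does not use the Esper API; the event fields (`vm_id`, `cpu_load`, `net_kbps`), the `IdleVmDetector` class, the window length and both thresholds are hypothetical names and values chosen for illustration.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class MetricEvent:
    """One monitoring sample for a VM (field names are hypothetical)."""
    vm_id: str
    timestamp: float  # seconds since epoch
    cpu_load: float   # fraction of CPU in use, 0.0 to 1.0
    net_kbps: float   # network traffic in kB/s


class IdleVmDetector:
    """Alarm when a VM's windowed CPU and network averages are both low."""

    def __init__(self, window_s=600.0, min_samples=3,
                 cpu_threshold=0.1, net_threshold=5.0):
        # All defaults are illustrative assumptions, not values from the paper.
        self.window_s = window_s
        self.min_samples = min_samples
        self.cpu_threshold = cpu_threshold
        self.net_threshold = net_threshold
        self._windows = defaultdict(deque)  # vm_id -> recent events
        self._alarmed = set()               # VMs with an alarm already raised

    def process(self, event):
        """Feed one event; return an alarm dict for a new alarm, else None."""
        window = self._windows[event.vm_id]
        window.append(event)
        # Evict samples that fell out of the sliding time window.
        while event.timestamp - window[0].timestamp > self.window_s:
            window.popleft()
        if len(window) < self.min_samples:
            return None

        avg_cpu = mean(e.cpu_load for e in window)
        avg_net = mean(e.net_kbps for e in window)
        idle = avg_cpu < self.cpu_threshold and avg_net < self.net_threshold

        if not idle:
            # The VM recovered, so a later idle period can alarm again.
            self._alarmed.discard(event.vm_id)
            return None
        if event.vm_id in self._alarmed:
            return None  # avoid repeating the same alarm on every sample
        self._alarmed.add(event.vm_id)
        return {
            "vm_id": event.vm_id,
            "reason": "low CPU load and low network traffic",
            "avg_cpu_load": avg_cpu,
            "avg_net_kbps": avg_net,
            "timestamp": event.timestamp,
        }


if __name__ == "__main__":
    detector = IdleVmDetector()
    samples = [
        MetricEvent("vm-1", 0.0, 0.02, 1.0),
        MetricEvent("vm-1", 60.0, 0.03, 2.0),
        MetricEvent("vm-1", 120.0, 0.01, 1.5),
    ]
    for sample in samples:
        alarm = detector.process(sample)
        if alarm is not None:
            print(alarm)
```

The detector only reads the events it is given, which mirrors the abstract's point that the analysis runs without touching the monitored resources or the data sources. In a setup like the one described, the returned alarm dictionaries would then be forwarded to whatever store or API serves them to operators.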

Primary Keyword (Mandatory): Artificial intelligence/Machine learning
Secondary Keyword (Optional): Monitoring

Author: Cristovao Cordeiro (CERN)

Co-authors: Domenico Giordano (CERN), Laurence Field (CERN), Luca Magnoni (CERN), Martin Adam (Acad. of Sciences of the Czech Rep. (CZ))

Presentation materials: Highlights-27.pdf, Poster-27.pdf
