21–25 May 2012
New York City, NY, USA
US/Eastern timezone

Artificial Intelligence in the service of system administrators

22 May 2012, 15:10
25m
Room 802 (Kimmel Center)

Parallel

Software Engineering, Data Stores and Databases (track 5)

Speaker

Christophe Haen (Univ. Blaise Pascal Clermont-Fe. II (FR))

Description

The LHCb online system relies on a large and heterogeneous IT infrastructure made up of thousands of servers on which many different applications are running. These applications perform a wide variety of tasks: critical ones such as data taking, and secondary ones such as web servers. Administering such a system and making sure it works properly represents a heavy workload for the small expert-operator team. Research on automating (some) system administration tasks has been carried out since 2001, when IBM defined the so-called “self objectives” intended to lead to “autonomic computing”. In this context, we present a framework that uses artificial intelligence and machine learning to monitor and diagnose Linux-based systems and their interaction with software, at a low level and in a non-intrusive way. Moreover, the multi-agent approach we use, coupled with an “object-oriented paradigm” architecture, should considerably increase our learning speed and highlight relations between problems.

Primary author

Christophe Haen (Univ. Blaise Pascal Clermont-Fe. II (FR))

Co-authors

Enrico Bonaccorsi (CERN)
Niko Neufeld (CERN)
Prof. Vincent BARRA (LIMOS, UMR 6158 CNRS, Univ. Blaise Pascal)
