21-25 May 2012
New York City, NY, USA
US/Eastern timezone

Artificial Intelligence in the service of system administrators

22 May 2012, 15:10
Room 802 (Kimmel Center)

Parallel: Software Engineering, Data Stores and Databases (track 5)


Christophe Haen (Univ. Blaise Pascal Clermont-Fe. II (FR))


The LHCb online system relies on a large and heterogeneous IT infrastructure made up of thousands of servers running many different applications. These applications perform a wide variety of tasks, from critical ones such as data taking to secondary ones such as web servers. Administering such a system and making sure it works properly represents a very significant workload for the small expert-operator team. Research on automating (some) system administration tasks has been under way since 2001, when IBM defined the so-called “self objectives” intended to lead to “autonomic computing”. In this context, we present a framework that uses artificial intelligence and machine learning to monitor and diagnose Linux-based systems and their interaction with software, at a low level and in a non-intrusive way. Moreover, the multi-agent approach we use, coupled with an "object-oriented paradigm" architecture, should considerably increase our learning speed and highlight relations between problems.
Primary author

Christophe Haen (Univ. Blaise Pascal Clermont-Fe. II (FR))


Enrico Bonaccorsi (CERN); Niko Neufeld (CERN); Prof. Vincent BARRA (LIMOS, UMR 6158 CNRS, Univ. Blaise Pascal)
