10–14 Oct 2016
San Francisco Marriott Marquis
America/Los_Angeles timezone

ALICE HLT Cluster operation during ALICE Run 2

10 Oct 2016, 12:00
15m
Sierra B (San Francisco Marriott Marquis)

Oral Track 6: Infrastructures

Speaker

Johannes Lehrbach (Johann-Wolfgang-Goethe Univ. (DE))

Description

Johannes Lehrbach, for the ALICE collaboration

ALICE (A Large Ion Collider Experiment) is one of the four major detectors located at the LHC at CERN, focusing on the study of heavy-ion collisions. The ALICE High Level Trigger (HLT) is a compute cluster that reconstructs events and compresses the data in real time. Data compression by the HLT is a vital part of data taking, especially during heavy-ion runs, because it is what makes storing the data possible. The reliability of the whole cluster is therefore an important matter.

To guarantee a consistent state among all compute nodes of the HLT cluster, we have automated its operation as much as possible. For automatic deployment of the nodes we use Foreman with locally mirrored repositories, and for configuration management of the nodes we use Puppet. Important parameters, such as node temperatures, are monitored with Zabbix.

During periods without beam, the HLT cluster is used for tests and as one of the WLCG Grid sites, running offline jobs in order to maximize the usage of the cluster. To prevent interference with normal HLT operations, we introduced a separation via virtual LANs between normal HLT operation and the grid jobs, which run inside virtual machines.

Primary Keyword: Computing facilities
Secondary Keyword: High performance computing

Primary author

Johannes Lehrbach (Johann-Wolfgang-Goethe Univ. (DE))