CHEP 07

Name: CHEP 07
Start: 2007-09-02T08:00:00+02:00
End: 2007-09-09T12:00:00+02:00
Location: Victoria, Canada

Europe/Zurich timezone

Please book accommodation as soon as possible.

Support

chep07-support@triumf.ca

Remote Management of nodes in the ATLAS Online Processing Farms

3 Sept 2007, 08:00

10h 10m

Board: 21

poster Online Computing Poster 1

Dr Marc Dobson (CERN)

The ATLAS experiment will use on the order of three thousand nodes for its online processing farms. Administering such a large cluster is a challenge, especially because of the high impact of any downtime. Two of the major issues faced by the ATLAS SysAdmin Team were the ability to turn machines on and off quickly and remotely, especially following a power cut, and the ability to monitor hardware health whether a machine is on or off. To solve these problems, ATLAS has decided to use the Intelligent Platform Management Interface (IPMI) for its nodes wherever possible. This paper will present the mechanisms developed to distribute management and monitoring commands to the cluster machines in parallel. These commands were run simultaneously on the prototype farm and on the small-scale final farm already purchased. The commands and their distribution take into account the specifics of the different IPMI versions and implementations, and the network topology of the ATLAS Online system. Results from timing measurements of the distribution of commands to many nodes will be shown. These measurements will cover the times for booting and shutting down the nodes and will be extrapolated to the final cluster size.
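The abstract does not describe the actual distribution mechanism, so the following is only a minimal sketch of the general idea: issuing IPMI power commands to many nodes in parallel and timing each one. It assumes the widely used `ipmitool` command-line client is installed and that each node's management controller is reachable by hostname. The function names, the `admin` user, the version-to-interface mapping, and the worker count are all hypothetical choices made for illustration, not details from the paper.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor

# Assumed mapping from IPMI version to the ipmitool interface name.
INTERFACE_BY_VERSION = {"1.5": "lan", "2.0": "lanplus"}


def build_ipmi_command(host, action, ipmi_version="2.0", user="admin"):
    """Return the ipmitool argv for a chassis power action on one node."""
    if action not in ("on", "off", "status"):
        raise ValueError(f"unsupported action: {action}")
    interface = INTERFACE_BY_VERSION[ipmi_version]
    # -E makes ipmitool read the password from the IPMI_PASSWORD
    # environment variable instead of the command line.
    return ["ipmitool", "-I", interface, "-H", host, "-U", user, "-E",
            "chassis", "power", action]


def run_command(argv, timeout=30):
    """Run one command and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    try:
        code = subprocess.run(argv, capture_output=True,
                              timeout=timeout).returncode
    except (OSError, subprocess.TimeoutExpired):
        code = -1  # tool missing, or the node did not answer in time
    return code, time.monotonic() - start


def run_on_nodes(nodes, action, runner=run_command, max_workers=64):
    """Send `action` to every node in parallel.

    nodes: list of (bmc_hostname, ipmi_version) pairs (hypothetical layout).
    Returns {bmc_hostname: (exit_code, elapsed_seconds)}.
    """
    commands = {host: build_ipmi_command(host, action, version)
                for host, version in nodes}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {host: pool.submit(runner, argv)
                   for host, argv in commands.items()}
        return {host: future.result() for host, future in futures.items()}
```

A call such as `run_on_nodes([("node001-bmc", "2.0"), ("node002-bmc", "1.5")], "status")` would query both controllers concurrently; the hostnames here are placeholders. The `runner` parameter is injectable so the distribution logic can be exercised without real hardware.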

Dr Marc Dobson (CERN); Dr Usman Ahmad MALIK (NCP, Quaid-E-Azam University)

Poster

poster_437.pdf.pdf
