21-25 May 2012
New York City, NY, USA
US/Eastern timezone

Hunting for hardware changes in data centers.

22 May 2012, 13:30
4h 45m
Rosenthal Pavilion (10th floor) (Kimmel Center)


Poster Computer Facilities, Production Grids and Networking (track 4) Poster Session

Speaker

Miguel Coelho Dos Santos (CERN)

Description

Warehouse-sized data centers contain many servers and server parts, which makes them increasingly complex environments. Server life-cycle management and hardware failures cause frequent changes that need to be managed. To manage these changes better, a project codenamed "hardware hound", focusing on hardware failure trending and hardware inventory, has been started at CERN. The project creates and uses a hardware-oriented data set, the inventory, which holds detailed information on servers and their parts, firmware levels, and other server-related data such as rack location, benchmarked processing performance and power consumption, warranty coverage, purchase order, and deployment state (production, maintenance). By also tracking changes to this inventory, the project aims, for example, to discover trends in hardware failure rates, such as a lower mean time to failure of a given component in a given batch of servers. This contribution will describe the architecture of the project, the inventory data, and real-life use cases.
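The failure-trend use case mentioned in the description can be illustrated with a minimal sketch. It is not the project's implementation: the record layout (`FailureEvent`), the function names, the sample data, and the 0.5 threshold are all assumptions made for illustration. It groups recorded part failures by component and batch, computes the mean time to failure in days, and flags batches whose value falls well below the average across batches.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FailureEvent:
    """One recorded part failure (hypothetical schema, not the project's)."""
    server: str
    batch: str        # purchase order or delivery batch of the server
    component: str    # e.g. "disk", "dimm"
    installed: date
    failed: date


def mean_time_to_failure(events):
    """Return {(component, batch): mean days from installation to failure}."""
    lifetimes = defaultdict(list)
    for e in events:
        lifetimes[(e.component, e.batch)].append((e.failed - e.installed).days)
    return {key: sum(days) / len(days) for key, days in lifetimes.items()}


def flag_short_lived(mttf, component, ratio=0.5):
    """List batches whose MTTF for `component` is below `ratio` times
    the unweighted mean MTTF across all batches (assumed threshold)."""
    per_batch = {b: v for (c, b), v in mttf.items() if c == component}
    if not per_batch:
        return []
    baseline = sum(per_batch.values()) / len(per_batch)
    return sorted(b for b, v in per_batch.items() if v < ratio * baseline)


# Made-up sample data, for illustration only.
events = [
    FailureEvent("srv001", "batch-A", "disk", date(2010, 1, 1), date(2011, 1, 1)),
    FailureEvent("srv002", "batch-A", "disk", date(2010, 1, 1), date(2011, 7, 1)),
    FailureEvent("srv101", "batch-B", "disk", date(2011, 1, 1), date(2011, 3, 1)),
    FailureEvent("srv102", "batch-B", "disk", date(2011, 1, 1), date(2011, 4, 1)),
]

mttf = mean_time_to_failure(events)
print(flag_short_lived(mttf, "disk"))  # ['batch-B']
```

Note that this sketch averages only over parts that have already failed. A real trend analysis would also need to account for parts still in service, which is one reason an inventory of all installed parts, and not just a failure log, is useful.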
