5–9 May 2008
CERN
Europe/Zurich timezone

Session

Data centre management, availability and reliability

6 May 2008, 14:00
503/1-001 - Council Chamber (CERN)

Route de Meyrin CH-1211 Genève 23 Switzerland

  1. Gary Stiehr (The Genome Center at Washington University)
    06/05/2008, 14:00
    Data centre management, availability, and reliability
    Over the last couple of years, The Genome Center at Washington University in St. Louis has been involved with the planning and construction of a new data center. We will provide updates since our data center presentation at HEPiX Fall 2007 in St. Louis. In addition, we will share our experiences and lessons learned as we prepare to move into the new data center in May 2008.
  2. Stefan Haller (GSI)
    06/05/2008, 14:30
    Data centre management, availability, and reliability
  3. Wim Heubers (NIKHEF)
    06/05/2008, 15:00
    Data centre management, availability, and reliability
    Extension of the NIKHEF/SARA data centre
  4. Arne Wiebalck (CERN)
    06/05/2008, 16:00
    Data centre management, availability, and reliability
    CERN's AFS installation serves between 1 and 2 billion accesses per day for its roughly 20,000 users. Keeping track of the system's overall status and finding problems before the users do is not a trivial task, especially as the installation is growing in almost all aspects. This talk will present CERN's AFS Console, a Lemon- and web-based monitoring tool used by the AFS...
  5. Tony Chan (Brookhaven National Laboratory)
    06/05/2008, 16:30
    Data centre management, availability, and reliability
    This presentation provides an update on the status of the new Data Center to support the ATLAS Tier 1 Center and RHIC Computing at Brookhaven. A brief discussion provides details of the new facility at Brookhaven, as well as timelines for its availability to both the ATLAS and RHIC programs. Some of our experiences described in this presentation will also be beneficial to other sites that are...
  6. Tony Chan (Brookhaven National Laboratory)
    06/05/2008, 16:50
    Data centre management, availability, and reliability
    The RACF provides computing support to a broad spectrum of programs at Brookhaven. The growth of the facility, the varying needs of the scientific programs, and the necessity for distributed computing require the RACF to change from a system-based to a service-based SLA with our end users. This presentation describes the adjustments made by the RACF to transition to a service-based SLA,...
  7. Sebastian Lopienski (CERN)
    06/05/2008, 17:10
    Data centre management, availability, and reliability
    Managing large clusters that host complex services poses particular challenges. Operations such as checking configuration consistency, running actions on one or more nodes, and moving nodes between clusters are very frequent. When scaling up to thousands of CPU and storage nodes in order to meet LHC requirements, some of these challenges become more evident. These scaling challenges...
  8. Eric Grancher (CERN)
    06/05/2008, 17:40
    Data centre management, availability, and reliability
    Problem tracking at CERN