-
Gary Stiehr (The Genome Center at Washington University)
06/05/2008, 14:00
Data centre management, availability, and reliability
Over the last couple of years, The Genome Center at Washington University in St. Louis has been involved with the planning and construction of a new data center. We will provide updates since our data center presentation at HEPiX Fall 2007 in St. Louis. In addition, we will share our experiences and lessons learned as we prepare to move into the new data center in May 2008.
-
Stefan Haller (GSI)
06/05/2008, 14:30
Data centre management, availability, and reliability
-
Wim Heubers (NIKHEF)
06/05/2008, 15:00
Data centre management, availability, and reliability
Extension of the NIKHEF/SARA data centre
-
Arne Wiebalck (CERN)
06/05/2008, 16:00
Data centre management, availability, and reliability
CERN's AFS installation serves between 1 and 2 billion accesses per day to its roughly 20,000 users. Keeping track of the system's overall status and finding problems before the users do is not a trivial task, especially as the installation is growing in almost all aspects. This talk will present CERN's AFS Console, a Lemon- and web-based monitoring tool used by the AFS...
-
Tony Chan (Brookhaven National Laboratory)
06/05/2008, 16:30
Data centre management, availability, and reliability
This presentation provides an update on the status of the new Data Center to support the ATLAS Tier 1 Center and RHIC Computing at Brookhaven. A brief discussion provides details of the new facility at Brookhaven, as well as timelines for availability to both the ATLAS and RHIC programs. Some of our experiences described in this presentation will also be beneficial to other sites who are...
-
Tony Chan (Brookhaven National Laboratory)
06/05/2008, 16:50
Data centre management, availability, and reliability
The RACF provides computing support to a broad spectrum of programs at Brookhaven. The growth of the facility, the varying needs of the scientific programs, and the necessity for distributed computing require the RACF to change from a system-based to a service-based SLA with our end users. This presentation describes the adjustments made by the RACF to transition to a service-based SLA,...
-
Sebastian Lopienski (CERN)
06/05/2008, 17:10
Data centre management, availability, and reliability
Managing large clusters that host complex services poses particular challenges. Operations such as checking configuration consistency, running actions on one or more nodes, and moving nodes between clusters are very frequent. When scaling up to thousands of CPU and storage nodes in order to meet LHC requirements, some of these challenges become more evident. These scaling challenges...
-
Eric Grancher (CERN)
06/05/2008, 17:40
Data centre management, availability, and reliability
Problem tracking at CERN