HEPiX Spring 2008
Europe/Zurich
503/1-001 - Council Chamber (CERN)
Helge Meinhard (CERN)

Description
The HEPiX meetings bring together IT system support engineers from the High Energy Physics (HEP) laboratories and institutes, such as BNL, CERN, DESY, FNAL, IN2P3, INFN, JLAB, NIKHEF, RAL, SLAC, TRIUMF and others. They have been held regularly since 1991 and are an excellent source of information for IT specialists. As a result, they also attract substantial participation from organisations outside HEP.
- 1
Status report on LHC and the Experiments, 503/1-001 - Council Chamber
Speaker: Jos Engelen (CERN)
- 2
Organisational matters, 503/1-001 - Council Chamber
Speaker: Helge Meinhard (CERN)
-
Site reports 503/1-001 - Council Chamber
- 3
- 4
-
10:40
Coffee break
- 5
- 6
- 7
Status Report Finnish CMS-T2 HIP/CSC
A report about the implementation of dCache at CSC as a storage element for the CMS-T2 centre. Technical details about the network, disks, monitoring and so on will be given. CSC provides HIP with some 100 TB of disk storage for the CMS-T2 and about 70 TB of disk with a tape back-end for the NDGF-ALICE T1 centre. Internal and external network traffic is split at the network level. The CE and the middleware situation are presented in less detail.
Speaker: Christof Hanke (CSC ltd.)
- 8
-
12:30
Lunch break, CERN restaurants (CERN, Route de Meyrin, CH-1211 Genève 23, Switzerland)
-
Applications and operating systems 503/1-001 - Council Chamber
- 9
- 10
- 11
Lifecycle management of Windows desktop applications
Speaker: Sebastien Dellabella (CERN)
-
15:30
Coffee break
- 12
- 13
-
Site reports 503/1-001 - Council Chamber
- 14
-
CPU technology 503/1-001 - Council Chamber
- 15
- 16
LCG Update, 503/1-001 - Council Chamber
Speaker: Ian Bird (CERN)
-
Site reports 503/1-001 - Council Chamber
- 17
- 18
- 19
-
10:30
Coffee break
- 20
- 21
-
HEPiX "bazaar and thinktank" 503/1-001 - Council Chamber
- 22
Inspire
Inspire is the project name of a new High Energy Physics information system which will integrate present databases and repositories to host the entire corpus of the HEP literature, becoming the reference HEP scientific information platform worldwide. It is a common project of CERN, DESY, FERMILAB and SLAC. It will empower scientists with new tools to discover and access the results most relevant to their research, enable novel text- and data-mining applications, and deploy new metrics to assess the impact of articles and authors. In addition, it will introduce the Web 2.0 paradigm of user-enriched content to the sciences, with community-based approaches to the peer-review process.
Speaker: Tibor Simko (CERN)
- 23
Data management and offline computing for LCLS
With BaBar now switched off, SLAC is shifting its focus to the Linac Coherent Light Source (LCLS). I will present the plans for LCLS with its detectors and DAQ systems, and then talk about the envisioned data management and the ideas for the necessary offline computing.
Speaker: Alf Wachsmann (SLAC)
-
12:40
Lunch break, CERN restaurants
-
Data centre management, availability and reliability 503/1-001 - Council Chamber
- 24
Updates and Experiences from the Genome Center's Data Center Construction Project
Over the last couple of years, The Genome Center at Washington University in St. Louis has been involved with the planning and construction of a new data center. We will provide updates since our data center presentation at HEPiX Fall 2007 in St. Louis. In addition, we will share our experiences and lessons learned as we prepare to move into the new data center in May 2008.
Speaker: Gary Stiehr (The Genome Center at Washington University)
- 25
- 26
-
15:30
Coffee break
- 27
AFS Monitoring: The CERN AFS Console
CERN's AFS installation serves between 1 and 2 billion accesses per day to its around 20'000 users. Keeping track of the system's overall status, and trying to find problems before the users do, is not a trivial task, especially as the installation is growing in almost all respects. This talk will present CERN's AFS Console, a Lemon- and web-based monitoring tool used by the AFS administrators at CERN to quickly identify problematic entities (servers, partitions, volumes etc.) and to assist them in solving the issues found.
Speaker: Arne Wiebalck (CERN)
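[Editor's note: as an illustration of the kind of check such a console automates, here is a minimal sketch (not the actual AFS Console code) that flags nearly full AFS server partitions by parsing the output of the standard OpenAFS `vos partinfo` command. The server name and threshold are placeholders.]

```python
# Minimal sketch, NOT the CERN AFS Console: flag nearly full AFS partitions
# by parsing "vos partinfo <server>" output, whose lines look like
#   Free space on partition /vicepa: 1234567 K blocks out of total 9876543
# The server name and threshold below are illustrative placeholders.
import re
import subprocess

def full_partitions(server, threshold=0.9):
    """Return (partition, usage) pairs whose usage exceeds `threshold`."""
    out = subprocess.run(["vos", "partinfo", server],
                         capture_output=True, text=True, check=True).stdout
    pattern = re.compile(
        r"Free space on partition (/vicep\w+): (\d+) K blocks out of total (\d+)")
    hits = []
    for part, free, total in pattern.findall(out):
        usage = 1.0 - int(free) / int(total)
        if usage > threshold:
            hits.append((part, usage))
    return hits

if __name__ == "__main__":
    for part, usage in full_partitions("afsserver1.example.org"):
        print(f"WARNING: {part} is {usage:.0%} full")
```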
- 28
New Data Center at BNL -- Status Update
This presentation provides an update on the status of the new data center to support the ATLAS Tier 1 Center and RHIC computing at Brookhaven. A brief discussion provides details of the new facility at Brookhaven, as well as timelines for availability to both the ATLAS and RHIC programs. Some of the experiences described in this presentation will also be beneficial to other sites that are considering expansion of their own facilities.
Speaker: Tony Chan (Brookhaven National Laboratory)
- 29
A Service-Based SLA Model
The RACF provides computing support to a broad spectrum of programs at Brookhaven. The growth of the facility, the varying needs of the scientific programs and the necessity for distributed computing require the RACF to change from a system-based to a service-based SLA with our end users. This presentation describes the adjustments made by the RACF to transition to a service-based SLA, including changes to its monitoring, alarm notification and problem resolution policies at the facility.
Speaker: Tony Chan (Brookhaven National Laboratory)
- 30
CluMan
Managing large clusters that host complex services poses particular challenges. Operations like checking configuration consistency, running actions on one or more nodes, or moving nodes between clusters are very frequent. When scaling up to thousands of CPU and storage nodes in order to meet LHC requirements, some of these challenges become more evident. These scaling challenges are the basis for CluMan, a new cluster management tool being designed and developed at CERN.
Speaker: Sebastian Lopienski (CERN)
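[Editor's note: the abstract does not describe CluMan's implementation; purely as an illustration of one operation named above (running an action on many nodes at once), here is a minimal Python sketch that fans a command out over SSH in parallel. Host names and the command are placeholders.]

```python
# Illustrative only -- not CluMan. Run one command on many cluster nodes in
# parallel over SSH and collect the results; hosts and command are placeholders.
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_on(host, command):
    """Run `command` on `host` via ssh; return (host, exit code, output)."""
    proc = subprocess.run(["ssh", "-o", "BatchMode=yes", host, command],
                          capture_output=True, text=True, timeout=30)
    return host, proc.returncode, proc.stdout.strip()

def fan_out(hosts, command, workers=32):
    """Run `command` on every host concurrently, up to `workers` at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda h: run_on(h, command), hosts))

if __name__ == "__main__":
    nodes = [f"node{i:04d}.example.org" for i in range(1, 5)]
    for host, rc, out in fan_out(nodes, "uptime"):
        print(f"{host}: rc={rc} {out}")
```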
- 31
- 32
A review of the current technical activities in the CERN openlab, 503/1-001 - Council Chamber
Speaker: Sverre Jarp (CERN)
-
Storage technology 503/1-001 - Council Chamber
- 33
- 34
-
10:45
Coffee break
- 35
- 36
CASTOR Status and Plans
This presentation will cover the current status of the CASTOR mass storage system, in terms of software status as well as deployment and performance. Near-future development plans will also be presented, while the longer-term view will be addressed in a separate presentation by Dirk Duellmann.
Speaker: Sebastien Ponce (CERN)
- 37
-
12:45
Lunch break, CERN restaurants
-
Storage technology 503/1-001 - Council Chamber
- 38
- 39
- 40
The unbearable slowness of tapes
It is still common for the lowest layer of data storage to be 'tape'. While individual tape devices apparently offer very good performance, this is only achieved in a limited set of circumstances, and the overall throughput can easily become very poor very quickly. How can this be avoided, if at all? Is it sufficient to just add more and more equipment? These difficulties will be discussed in the context of CERN's current CASTOR HSM version, and the need to 'carry LHC data forward' for many years.
Speaker: Charles Curran (CERN)
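[Editor's note: to make "good performance only in limited circumstances" concrete, a drive's streaming rate only dominates when files are large, because the per-recall overhead (robot mount, load, positioning) is paid regardless of how much data is read. The figures in this back-of-the-envelope sketch are illustrative assumptions, not CASTOR measurements.]

```python
# Back-of-the-envelope: effective tape throughput vs. file size.
# All numbers are illustrative assumptions, not CASTOR measurements.
STREAM_MB_S = 120.0   # nominal streaming rate of the drive (MB/s)
OVERHEAD_S = 90.0     # mount + load + positioning overhead per recall (s)

def effective_rate(file_mb):
    """Effective MB/s for one recall of a file of `file_mb` megabytes."""
    return file_mb / (OVERHEAD_S + file_mb / STREAM_MB_S)

for size in (100, 1_000, 10_000, 100_000):  # 100 MB .. 100 GB
    print(f"{size:>7} MB file -> {effective_rate(size):6.1f} MB/s effective")
# A 100 MB file achieves about 1 MB/s; only multi-GB files approach the
# streaming rate, which is why small-file recalls ruin overall throughput.
```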
-
15:30
Coffee break
- 41
- 42
Experience and lessons learnt from running high-availability databases on Network Attached Storage
The Database and Engineering Services Group of CERN's Information Technology Department provides the Oracle-based central database services used in many activities at CERN. In order to provide high availability and ease of management for those services, a NAS (Network Attached Storage) based infrastructure has been set up. It runs several instances of Oracle RAC (Real Application Clusters), using NFS as shared disk space for RAC purposes and data hosting. It is composed of two private LANs, providing access to the NAS file servers and the Oracle RAC interconnect, both using network bonding. NAS nodes are configured in partnership to prevent single points of failure and to provide automatic NAS fail-over. This presentation describes that infrastructure and gives some advice on how to automate its management and setup using a fabric management framework such as Quattor. It also covers aspects related to NAS performance and monitoring, as well as data backup and archiving of the facility using existing infrastructure at CERN.
Speaker: Nilo Segura Chinchilla (CERN)
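[Editor's note: running Oracle data files over NFS depends critically on the client mount options (hard mounts and disabled attribute caching are commonly required). As a hedged illustration only — the option set below is a typical recommendation, not CERN's actual configuration — this sketch scans /proc/mounts on a Linux client and flags NFS mounts missing the expected options.]

```python
# Illustrative sketch: verify NFS mounts carry options commonly recommended
# for Oracle data files. The REQUIRED set is a typical recommendation and a
# stated assumption, not CERN's actual configuration.
REQUIRED = {"hard", "actimeo=0", "rsize=32768", "wsize=32768", "tcp"}

def check_nfs_mounts(mounts_path="/proc/mounts"):
    """Return {mountpoint: [missing options]} for NFS mounts on this host."""
    problems = {}
    with open(mounts_path) as f:
        for line in f:
            device, mountpoint, fstype, options = line.split()[:4]
            if fstype.startswith("nfs"):
                missing = REQUIRED - set(options.split(","))
                if missing:
                    problems[mountpoint] = sorted(missing)
    return problems

if __name__ == "__main__":
    for mountpoint, missing in check_nfs_mounts().items():
        print(f"{mountpoint}: missing options {', '.join(missing)}")
```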
- 43
- 44
The (WLCG) Common Computing Readiness Challenge (CCRC'08), 503/1-001 - Council Chamber
Speaker: Jamie Shiers (CERN)
-
HEPiX "bazaar and thinktank" 503/1-001 - Council Chamber
- 45
- 46
Implementation of the National Analysis Facility at DESY
A common analysis facility for German physicists working on LHC and ILC topics is currently being built, starting at DESY's two sites in Hamburg and Zeuthen. We present the current technical implementation of this distributed facility.
Speaker: Stephan Wiesand (DESY)
-
10:45
Coffee break
- 47
- 48
A redundant server cluster using SL5.1 and Xen
How to build a redundant (failover) server cluster using Scientific Linux 5.1 with Xen. The services (filer, DNS, DHCP, printing etc.) are divided into individual virtual machines to isolate them from one another. On failure of a server, the services are relocated automatically. The system uses redundant Fibre Channel storage as the shared storage, but the same would be possible using other techniques. The talk shows the architecture of the system, as well as some problems we found in both the SL cluster software and SL's Xen implementation, and how we work around them.
Speaker: Klaus Steinberger (LMU München)
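[Editor's note: the failover itself is handled by the SL cluster software; purely as a sketch of the underlying relocation step, the fragment below uses the classic Xen "xm" command-line tools of the SL5 era to restart the guests of a failed node on a survivor. Host and domain names are placeholders, and a real cluster manager adds fencing and quorum handling that this sketch omits.]

```python
# Illustrative sketch only -- not the SL cluster suite. Restart Xen guests
# from a failed node on a surviving one using "xm create". A real failover
# cluster also fences the dead node and arbitrates quorum, omitted here.
import subprocess

def node_alive(host):
    """Crude liveness probe: one ICMP ping (placeholder for a real heartbeat)."""
    return subprocess.run(["ping", "-c", "1", "-W", "2", host],
                          capture_output=True).returncode == 0

def start_guests_on(survivor, guest_configs):
    """Start each guest from its config file on the surviving node."""
    for cfg in guest_configs:
        # Guest configs live on the shared (Fibre Channel) storage, so any
        # node can boot them; "xm create <cfg>" starts the guest locally.
        subprocess.run(["ssh", survivor, "xm", "create", cfg], check=True)

if __name__ == "__main__":
    FAILED, SURVIVOR = "node1.example.org", "node2.example.org"  # placeholders
    GUESTS = ["/cluster/xen/dns.cfg", "/cluster/xen/dhcp.cfg"]   # placeholders
    if not node_alive(FAILED):
        start_guests_on(SURVIVOR, GUESTS)
```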
- 49
Indico at DESY
The Indico installation at DESY consists of load-balanced application servers, a central database server and a media archive on high-availability storage. Currently, more than 700 events and users are organised in Indico at DESY. This talk will give an overview of the setup, the experience we have gained, and how this service is organised and supported at DESY.
Speaker: Alexander Krieg (DESY)
-
12:45
Lunch break, CERN restaurants
-
HEPiX "bazaar and thinktank" 503/1-001 - Council Chamber
- 50
-
CPU technology 503/1-001 - Council Chamber
- 51
Status report from the Benchmarking working group
Speaker: Helge Meinhard (CERN)
- 52
-
15:30
Coffee break
- 53
- 54
- 55
- 56
- 57
Benchmarking Multi-core Processors in the NERSC Production Environment
Results of benchmarking based on HENP production codes on single-, dual- and quad-core processors will be presented. We will look at the ratios of CPU to wall-clock time, the memory consumption and the I/O performance of those codes on the latest additions to the PDSF production systems.
Speaker: Iwona Sakrejda (LBNL/NERSC)
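[Editor's note: the CPU-to-wall-clock ratio mentioned above is a simple but telling metric: values well below 1 per process indicate time lost to I/O or to contention for shared resources on multi-core chips. A minimal sketch of how one might compute it for a child process on a Unix system; the command is a placeholder.]

```python
# Minimal sketch: CPU-to-wall-clock ratio of a child process. Values well
# below 1.0 suggest time spent waiting on I/O or on shared resources
# (memory bandwidth, caches) rather than computing. Command is a placeholder.
import resource
import subprocess
import time

def cpu_wall_ratio(argv):
    """Run `argv` to completion and return (user+system CPU time) / wall time."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(argv, check=True)
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return cpu / wall

if __name__ == "__main__":
    print(f"CPU/wall = {cpu_wall_ratio(['gzip', '-9', 'bigfile.dat']):.2f}")
```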
- 58
LHC Networking to the Tier-1s, Present and Future, 503/1-001 - Council Chamber
Speaker: David Foster (CERN)
-
Networking infrastructure and computer security 503/1-001 - Council Chamber
- 59
IPv6 experience in a mixed Unix environment
We have been using the Academic Computer Club at Umeå University as our testbed for IPv6 deployment; this includes some popular public services, like a rather big free software mirror, as well as workstations and multi-user machines in a semi-production environment. This talk gives an overview of the various pitfalls as well as what "just works", with some pointers on what kinds of systems and software need attention.
Speaker: Mattias Wadenstein (NDGF / HPC2N / ACC)
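[Editor's note: one recurring dual-stack pitfall is application code that resolves names to IPv4 only. The portable pattern is to iterate over getaddrinfo() results so IPv6 is tried where available and IPv4 remains a fallback; a minimal sketch, with placeholder host and port.]

```python
# Minimal dual-stack client sketch: let getaddrinfo() supply both IPv6 and
# IPv4 addresses and try them in order, instead of hard-coding AF_INET.
# Host and port below are placeholders.
import socket

def connect_dual_stack(host, port):
    """Connect to the first address family that works; prefer resolver order."""
    last_error = None
    for family, socktype, proto, _, addr in socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM):
        try:
            sock = socket.socket(family, socktype, proto)
            sock.connect(addr)
            return sock
        except OSError as exc:
            last_error = exc
    raise last_error

if __name__ == "__main__":
    s = connect_dual_stack("mirror.example.org", 80)
    print("connected via", "IPv6" if s.family == socket.AF_INET6 else "IPv4")
```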
- 60
Advanced Monitoring Techniques for the Atlas TDAQ Data Network
We describe the methods used to monitor and measure the performance of the Atlas TDAQ data network. The network consists of four distinct Ethernet networks interconnecting over 4000 ports using up to 200 edge switches and five multi-blade chassis. The edge networks run at 1 Gb/s, while 10 Gb/s links are used for the detectors' raw data flow as well as at the cores of the data-flow networks. The networks feed event data to farms of up to 3000 processors. Trigger applications running on these processors examine each event for acceptability and assemble the accepted events, ready for storage and further processing in Grid-linked data centres. We report in detail on the monitoring and measurement techniques deployed and developed.
Speaker: Matei Ciobotaru (CERN, UC Irvine, Politehnica Bucharest)
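[Editor's note: a standard building block for this kind of switch monitoring is computing link utilisation from two samples of an interface's octet counter (e.g. SNMP ifHCInOctets), taking counter wrap-around into account. A hedged sketch of just that arithmetic, with the sampling mechanism deliberately left out and made-up numbers.]

```python
# Sketch: link utilisation from two samples of a 64-bit octet counter such
# as SNMP ifHCInOctets. How the samples are obtained (SNMP, sFlow, ...) is
# left out; the counter values in __main__ are made up.
COUNTER_MAX = 2 ** 64                    # ifHCInOctets is a 64-bit counter

def utilisation(octets_t0, octets_t1, interval_s, link_bps):
    """Fraction of `link_bps` used between the two counter samples."""
    delta = (octets_t1 - octets_t0) % COUNTER_MAX   # handles wrap-around
    return (delta * 8) / (interval_s * link_bps)

if __name__ == "__main__":
    # 30 s sample on a 1 Gb/s edge port, made-up counter values:
    print(f"{utilisation(10_000_000, 2_510_000_000, 30, 10**9):.1%}")  # 66.7%
```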
-
10:45
Coffee break
- 61
Operational security in a grid environment
This talk presents the main goals of computer security in a grid environment, using a FAQ approach. It details the evolution of the risks in recent years, the likely objectives of attackers, and the progress made by malware toolkits and frameworks. Finally, recommendations for dealing with these threats are proposed.
Speaker: Romain Wartel (CERN)
- 62
Cybersecurity Update
An update on recent security issues and vulnerabilities affecting Windows, Linux and Mac platforms. This talk is based on contributions and input from a range of colleagues both within and outside CERN. It covers clients, servers and control systems.
Speaker: Lionel Cons (CERN)
- 63