Conveners
Computer facilities, production grids and networking: CF 1–CF 7
- Kors Bos (NIKHEF)
Dr
Jamie Shiers
(CERN)
9/3/07, 2:00 PM
Computer facilities, production grids and networking
oral presentation
This talk summarises the main lessons learnt from deploying WLCG production services,
with a focus on reliability, scalability and accountability, which together lead to
both manageability and usability.
Each topic is analysed in turn. Techniques for zero-user-visible downtime for the
main service interventions are described, together with pathological cases that need
special treatment. The...
Dr
Markus Schulz
(CERN)
9/3/07, 2:20 PM
Computer facilities, production grids and networking
oral presentation
Today's production Grids connect large numbers of distributed hosts using high
throughput networks and hence are valuable targets for attackers. In the same way
users transparently access any Grid service independently of its location, an
attacker may attempt to propagate an attack to different sites that are part of a
Grid. In order to contain and resolve the incident, and since such an...
Mrs
Ruth Pordes
(FERMILAB)
9/3/07, 2:40 PM
Computer facilities, production grids and networking
oral presentation
The Open Science Grid (OSG) is receiving five years of funding across six program offices of the Department of
Energy Office of Science and the National Science Foundation. OSG is responsible for operating a secure
production-quality distributed infrastructure, a reference software stack including the Virtual Data Toolkit (VDT),
extending the capabilities of the high throughput virtual...
Dr
Jeremy Coles
(RAL)
9/3/07, 3:00 PM
Computer facilities, production grids and networking
oral presentation
Over the last few years, UK research centres have provided significant computing
resources for many high-energy physics collaborations under the guidance of the
GridPP project. This paper reviews recent progress in the Grid deployment and
operations area including findings from recent experiment and infrastructure service
challenges. These results are discussed in the context of how GridPP...
Dr
Pavel Murat
(Fermilab)
9/3/07, 3:20 PM
Computer facilities, production grids and networking
oral presentation
The CDF II detector at Fermilab has been taking physics data since 2002.
The architecture of the CDF computing system has evolved substantially
over the years of data taking and has now reached a stable
configuration that will allow the experiment to process and analyse the data
until the end of Run II.
We describe the major architectural components of the CDF offline
computing - dedicated...
Mr
Lars Fischer
(Nordic Data Grid Facility)
9/3/07, 3:40 PM
Computer facilities, production grids and networking
oral presentation
The Tier-1 facility operated by the Nordic DataGrid Facility (NDGF) differs
significantly from other Tier-1s in several aspects: it is not located at one or a few
sites but is instead distributed throughout the Nordic countries; it is not under the
governance of a single organization but is instead a "virtual" Tier-1 built out of
resources under the control of a number of different national...
Dr
Richard Mount
(SLAC)
9/3/07, 4:30 PM
Computer facilities, production grids and networking
oral presentation
The PetaCache project started at SLAC in 2004 with support from DOE
Computer Science and the SLAC HEP program. PetaCache focuses on using
cost-effective solid state storage for the hottest data under analysis. We chart
the evolution of metrics such as accesses per second per dollar for different
storage technologies and deduce the near inevitability of a massive use of solid-
state...
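As a rough illustration (not from the talk itself) of the accesses-per-second-per-dollar metric the abstract mentions, the sketch below compares hypothetical storage devices; all figures are placeholders, not measured values.

```python
# Illustrative sketch: comparing storage technologies by the
# "accesses per second per dollar" metric named in the abstract.
# All device figures below are hypothetical placeholders.

def accesses_per_second_per_dollar(iops: float, unit_cost: float) -> float:
    """Random-read operations per second delivered per dollar spent."""
    return iops / unit_cost

# Hypothetical example devices; hot data under analysis is access-bound,
# so capacity is deliberately ignored here.
devices = {
    "disk drive":   {"iops": 150.0,    "unit_cost": 100.0},
    "flash device": {"iops": 10_000.0, "unit_cost": 500.0},
}

for name, d in devices.items():
    metric = accesses_per_second_per_dollar(d["iops"], d["unit_cost"])
    print(f"{name}: {metric:.1f} accesses/s per dollar")
```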
Dr
Giuseppe Lo Presti
(CERN/INFN)
9/3/07, 4:50 PM
Computer facilities, production grids and networking
oral presentation
In this paper we present the architecture design of the CERN Advanced Storage system
(CASTOR) and its new disk cache management layer (CASTOR2).
Mass storage systems at CERN have evolved over time to meet growing requirements,
both in terms of scalability and fault resiliency. CASTOR2 has been designed as a
Grid-capable storage resource sharing facility, with a database-centric...
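As a generic illustration of the database-centric pattern the abstract refers to (not CASTOR's actual schema or code), the sketch below shows components coordinating through a shared request table rather than talking to each other directly:

```python
# Generic sketch of a database-centric request-handling pattern;
# this is NOT CASTOR's actual schema or implementation.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE requests (id INTEGER PRIMARY KEY, file TEXT, state TEXT)")

# A client-facing component records a stage-in request in the database...
db.execute("INSERT INTO requests (file, state) VALUES (?, ?)",
           ("/castor/cern.ch/user/data.root", "PENDING"))
db.commit()

# ...and a stager daemon later claims and processes it from the same table,
# so all components coordinate through the database.
row = db.execute("SELECT id, file FROM requests WHERE state = 'PENDING'").fetchone()
if row:
    req_id, path = row
    db.execute("UPDATE requests SET state = 'STAGED' WHERE id = ?", (req_id,))
    db.commit()
    print(f"staged {path}")
```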
Dr
Horst Goeringer
(GSI)
9/3/07, 5:10 PM
Computer facilities, production grids and networking
oral presentation
GSI in Darmstadt (Germany) is a center for heavy ion research
and hosts an ALICE Tier-2 center.
For the future FAIR experiments at GSI,
CBM and PANDA, the planned data rates
will reach those of the current LHC experiments at CERN.
gStore, the GSI Mass Storage System, has been
successfully in operation for more than ten years.
It is a hierarchical storage system with a unique name...
Paul Avery
(University of Florida)
9/3/07, 5:30 PM
Computer facilities, production grids and networking
oral presentation
UltraLight is a collaboration of experimental physicists and network engineers whose
purpose is to provide the network advances required to enable and facilitate
petabyte-scale analysis of globally distributed data. Existing Grid-based
infrastructures provide massive computing and storage resources, but are currently
limited by their treatment of the network as an external, passive, and...
Ms
Alessandra Forti
(University of Manchester)
9/3/07, 5:50 PM
Computer facilities, production grids and networking
oral presentation
The HEP department of the University of Manchester has purchased a 1000-node
cluster. The cluster is dedicated to running EGEE and LCG software and currently
supports 12 active VOs. Each node is equipped with
2 x 250 GB disks, for a total of 500 GB; there is no tape storage behind it, nor
are RAID arrays used. Three different storage solutions are
currently being deployed to...
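The per-node and aggregate capacities implied by these figures work out as follows:

```python
# Aggregate raw capacity implied by the abstract's figures:
# 1000 nodes, each with 2 x 250 GB local disks, no RAID, no tape behind.
nodes = 1000
disks_per_node = 2
disk_size_gb = 250

per_node_gb = disks_per_node * disk_size_gb   # 500 GB per node
total_tb = nodes * per_node_gb / 1000         # ~500 TB across the cluster
print(f"{per_node_gb} GB per node, ~{total_tb:.0f} TB aggregate")
```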
Dr
Ian Fisk
(FNAL)
9/4/07, 11:00 AM
Computer facilities, production grids and networking
oral presentation
In preparation for the start of the experiment, CMS has conducted computing, software, and analysis challenges to
demonstrate the functionality, scalability, and usability of the computing and software components. These
challenges are designed to validate the CMS distributed computing model by demonstrating the functionality of
many components simultaneously. In the challenges CMS...
Mr
Michel Jouvin
(LAL / IN2P3)
9/4/07, 11:20 AM
Computer facilities, production grids and networking
oral presentation
Quattor is a tool aimed at the efficient management of fabrics with hundreds or
thousands of Linux machines, while remaining easy enough to use for smaller
clusters. It was originally developed inside the European DataGrid (EDG)
project. It is now in use at more than 30 grid sites running gLite middleware,
ranging from small LCG T3 sites to very large ones like CERN.
Main goals and specific...
Torsten Antoni
(Forschungszentrum Karlsruhe)
9/4/07, 11:40 AM
Computer facilities, production grids and networking
oral presentation
The organization and management of the user support in a global e-science computing
infrastructure such as EGEE is one of the challenges of the grid. Given the widely
distributed nature of the organisation, and the spread of expertise for installing,
configuring, managing and troubleshooting the grid middleware services, a standard
centralized model could not be deployed in EGEE. This...
Mr
Antonio Retico
(CERN)
9/4/07, 12:00 PM
Computer facilities, production grids and networking
oral presentation
Grids have the potential to revolutionise computing by providing ubiquitous, on
demand access to computational services and resources. They promise to allow for on
demand access and composition of computational services provided by multiple
independent sources. Grids can also provide unprecedented levels of parallelism for
high-performance applications. On the other hand, grid...
Dirk Duellmann
(CERN)
9/5/07, 2:00 PM
Computer facilities, production grids and networking
oral presentation
Relational database services are a key component of the computing models for the Large Hadron Collider (LHC). A
large proportion of non-event data including detector conditions, calibration, geometry and production
bookkeeping metadata require reliable storage and query services in the LHC Computing Grid (LCG). Also core grid
services to catalogue and distribute data cannot operate...
Dr
Xavier Espinal
(PIC/IFAE)
9/5/07, 2:20 PM
Computer facilities, production grids and networking
oral presentation
In preparation for first data at the LHC, a series of Data Challenges, of
increasing scale and complexity, have been performed. Large quantities of
simulated data have been produced on three different Grids, integrated into
the ATLAS production system. During 2006, the emphasis moved towards providing
stable continuous production, as is required in the immediate run-up to first
data, and...
Mr
Jose Hernandez Calama
(CIEMAT)
9/5/07, 2:40 PM
Computer facilities, production grids and networking
oral presentation
Monte Carlo production in CMS has received a major boost in performance and
scale since the last CHEP conference. The production system has been re-engineered
in order to incorporate the experience gained in running the previous system
and to integrate production with the new CMS event data model, data management
system and data processing framework. The system is interfaced to the two...
Yuri Smirnov
(Brookhaven National Laboratory)
9/5/07, 3:20 PM
Computer facilities, production grids and networking
oral presentation
The Open Science Grid infrastructure provides one of the largest distributed
computing systems deployed in the ATLAS experiment at the LHC. During the CSC
exercise in 2006-2007, OSG resources provided about one third of the worldwide
distributed computing resources available in ATLAS. About half a petabyte of ATLAS MC
data is stored on OSG sites, and about 2000k SpecInt2000 of CPU capacity is available....
Mr
Dave Evans
(Fermi National Accelerator Laboratory)
9/5/07, 3:40 PM
Computer facilities, production grids and networking
oral presentation
The CMS production system has undergone a major architectural upgrade from its
predecessor, with the goals of reducing the operations manpower requirement and
preparing for the large scale production required by the CMS physics plan.
This paper discusses the CMS Monte Carlo Workload Management architecture. The
system consists of three major components: ProdRequest, ProdAgent, and ProdMgr...
Mr
Philip DeMar
(FERMILAB)
9/5/07, 4:30 PM
Computer facilities, production grids and networking
oral presentation
Fermilab hosts the American Tier-1 Center for the LHC/CMS experiment. In preparation
for the startup of CMS, and building upon extensive experience supporting TeVatron
experiments and other science collaborations, the Laboratory has established high
bandwidth, end-to-end (E2E) circuits with a number of US-CMS Tier2 sites, as well as
other research facilities in the collaboration. These...
Mr
Maxim Grigoriev
(FERMILAB)
9/5/07, 4:50 PM
Computer facilities, production grids and networking
oral presentation
The LHC experiments will start very soon, creating immense data volumes that can
demand the allocation of an entire network circuit for task-driven applications.
Circuit-based alternate network paths are one solution to meeting the LHC high
bandwidth network requirements. The Lambda Station project is aimed at addressing
growing requirements for dynamic allocation of alternate network...
Dr
Matt Crawford
(FERMILAB)
9/5/07, 5:10 PM
Computer facilities, production grids and networking
oral presentation
Due to shortages of IPv4 address space - real or artificial - many HEP
computing installations have turned to NAT and application gateways.
These workarounds carry a high cost in application complexity and
performance. Recently a few HEP facilities have begun to deploy IPv6
and it is expected that many more must follow within several years.
While IPv6 removes the problem of address...
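As a minimal illustration of the kind of transition step such deployments involve (standard-library code, not taken from the paper), a single IPv6 listening socket can also accept IPv4 clients via v4-mapped addresses, avoiding per-protocol gateways:

```python
# Minimal dual-stack listener sketch: one IPv6 socket that also accepts
# IPv4 clients via v4-mapped addresses. Illustrative only; the port is
# an arbitrary example.
import socket

srv = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
# Clear IPV6_V6ONLY so IPv4 connections arrive as ::ffff:a.b.c.d
srv.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
srv.bind(("::", 8443))   # listen on all IPv6 (and mapped IPv4) addresses
srv.listen(5)
print("listening on [::]:8443 for both IPv4 and IPv6 clients")
```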
Mr
Maxim Grigoriev
(FERMILAB)
9/5/07, 5:30 PM
Computer facilities, production grids and networking
oral presentation
End-to-end (E2E) circuits are used to carry high impact data movement into and out of
the US CMS Tier-1 Center at Fermilab. E2E circuits have been implemented to
facilitate the movement of raw experiment data from Tier-0, as well as processed data
to and from a number of the US Tier-2 sites. Troubleshooting and monitoring those
circuits presents a challenge, since the circuits typically...
Dr
Luc Goossens
(CERN)
9/5/07, 5:50 PM
Computer facilities, production grids and networking
oral presentation
ATLAS is a multi-purpose experiment at the LHC at CERN,
which will start taking data in November 2007.
Handling and processing the unprecedented data rates expected
at the LHC (at nominal operation, ATLAS will record about
10 PB of raw data per year) poses a huge challenge for the
computing infrastructure.
The ATLAS Computing Model foresees a multi-tier hierarchical
model to perform this...
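For scale, the quoted 10 PB per year corresponds to a sustained average rate of roughly 300 MB/s:

```python
# Back-of-the-envelope rate implied by the abstract's figure of
# ~10 PB of raw data per year at nominal ATLAS operation.
raw_pb_per_year = 10
seconds_per_year = 365 * 24 * 3600                      # ~3.15e7 s
rate_mb_s = raw_pb_per_year * 1e9 / seconds_per_year    # PB -> MB, decimal units
print(f"average raw-data rate: ~{rate_mb_s:.0f} MB/s sustained")
```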
Dr
Lukas Nellen
(I. de Ciencias Nucleares, UNAM)
9/6/07, 2:00 PM
Computer facilities, production grids and networking
oral presentation
The EELA project aims at building a grid infrastructure in Latin
America and at attracting users to this infrastructure. The EELA
infrastructure is based on the gLite middleware, developed by the EGEE
project. A test-bed, including several European and Latin American
countries, was set up in the first months of the project. Several
applications from different areas, especially...
Dr
Alexei Klimentov
(BNL)
9/6/07, 2:20 PM
Computer facilities, production grids and networking
oral presentation
The ATLAS Distributed Data Management (DDM) Operations Team unites experts from
Tier-1 and Tier-2 computing centers. The group is responsible for all day-to-day
ATLAS data distribution between different sites and centers.
In our paper we describe the ATLAS DDM operations model and address
data management and operations issues. A series of Functional Tests has
been conducted in the past and is in...
Dr
Daniele Bonacorsi
(INFN-CNAF, Bologna, Italy)
9/6/07, 2:40 PM
Computer facilities, production grids and networking
oral presentation
The CMS experiment is gaining experience towards data taking through several computing preparation activities, and a
roadmap towards a mature computing operations model stands as a primary target. The responsibility of the
Computing Operations project in the complex CMS computing environment spans a wide area and aims at
integrating the management of the CMS Facilities Infrastructure,...
Luca dell'Agnello
(INFN-CNAF)
9/6/07, 3:00 PM
Computer facilities, production grids and networking
oral presentation
Performance, reliability and scalability in data access are key issues when
considered in the context of HEP data processing and analysis applications.
The importance of these topics is even larger when considering the quantity of data
and the request load that an LHC data center has to support.
In this paper we give the results and the technical details of a large scale
validation,...
Jan van Eldik
(CERN)
9/6/07, 3:20 PM
Computer facilities, production grids and networking
oral presentation
This paper presents work, both completed and planned, for streamlining the
deployment, operation and re-tasking of Castor2 instances. We present a summary of
what has recently been done to reduce the human intervention necessary for bringing
systems into operation; including the automation of Grid host certificate requests
and deployment in conjunction with the CERN Trusted CA and...
Mr
Timur Perelmutov
(FERMILAB)
9/6/07, 3:40 PM
Computer facilities, production grids and networking
oral presentation
The Storage Resource Manager (SRM) and WLCG collaborations recently
defined version 2.2 of the SRM protocol, with the goal of satisfying
the requirements of the LHC experiments. The dCache team has now
finished the implementation of all SRM v2.2 elements required by the
WLCG. The new functions include space reservation, more advanced data
transfer, and new namespace and permission...
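As a conceptual sketch of one of these new functions, a space reservation might look as follows from a client's point of view. The wrapper function and endpoint below are hypothetical stand-ins, while the srmReserveSpace operation and its retention-policy and access-latency attributes come from the SRM v2.2 specification:

```python
# Conceptual sketch of an SRM v2.2 space reservation. The wrapper and
# endpoint are hypothetical; a real client would issue a SOAP
# srmReserveSpace call to the storage element.
from dataclasses import dataclass

@dataclass
class SpaceRequest:
    retention_policy: str     # e.g. "REPLICA" or "CUSTODIAL" (SRM v2.2 values)
    access_latency: str       # e.g. "ONLINE" or "NEARLINE"
    desired_size_bytes: int
    lifetime_seconds: int

def srm_reserve_space(endpoint: str, req: SpaceRequest) -> str:
    """Hypothetical wrapper: pretend to call srmReserveSpace and
    return the space token the storage element would hand back."""
    print(f"srmReserveSpace -> {endpoint}: {req}")
    return "SPACE-TOKEN-0001"   # placeholder token

token = srm_reserve_space(
    "httpg://dcache.example.org:8443/srm/managerv2",   # hypothetical endpoint
    SpaceRequest("REPLICA", "ONLINE",
                 desired_size_bytes=10 * 1024**4,      # 10 TiB
                 lifetime_seconds=30 * 86400),         # 30 days
)
print("space token:", token)
```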
Dr
Maxim Potekhin
(BROOKHAVEN NATIONAL LABORATORY)
9/6/07, 4:30 PM
Computer facilities, production grids and networking
oral presentation
The simulation program for the STAR experiment at the Relativistic Heavy Ion Collider at
Brookhaven National Laboratory is growing in scope and responsiveness to the needs of
the research conducted by the Physics
Working Groups. In addition, there is a significant ongoing R&D activity aimed at
future upgrades of the STAR detector, which also requires extensive simulations
support. The...
Igor Sfiligoi
(Fermilab)
9/6/07, 4:50 PM
Computer facilities, production grids and networking
oral presentation
Pilot jobs are becoming increasingly popular in the Grid world. Experiments like
ATLAS and CDF are
using them in production, while others, like CMS, are actively evaluating them.
Pilot jobs enter Grid sites using a generic pilot credential, and once on a worker
node, call home to fetch the job of an actual user.
However, this operation mode poses several new security problems when...
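A minimal sketch of the call-home step described above; the factory URL and payload format are hypothetical illustrations, not the actual ATLAS or CDF implementations:

```python
# Minimal sketch of the pilot-job pattern: a generic pilot lands on a
# worker node and "calls home" to fetch a real user's job.
import json
import subprocess
import urllib.request

FACTORY_URL = "https://pilot-factory.example.org/fetch-job"  # hypothetical

def call_home() -> dict | None:
    """Ask the submitting factory whether a user job is waiting."""
    with urllib.request.urlopen(FACTORY_URL, timeout=30) as resp:
        payload = json.load(resp)
    return payload or None

job = call_home()
if job:
    # The security issue the abstract raises: the payload now runs under
    # the generic pilot credential, not the actual user's identity.
    subprocess.run(job["command"], check=False)  # expects an argv list
```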
Dr
Simone Campana
(CERN/IT/PSS)
9/6/07, 5:10 PM
Computer facilities, production grids and networking
oral presentation
The ATLAS experiment has been running continuous simulated event production for
more than two years. A considerable fraction of the jobs is submitted and
handled daily via the gLite Workload Management System, which overcomes several
limitations of the previous LCG Resource Broker. The gLite WMS has been tested very
intensively for the LHC experiments' use cases for more than six months,...
Mr
Sergey Chechelnitskiy
(Simon Fraser University)
9/6/07, 5:30 PM
Computer facilities, production grids and networking
oral presentation
SFU is responsible for running two different clusters: one is designed for WestGrid
internal jobs with its specific software, and the other should run ATLAS jobs only.
In addition to a different software configuration, the ATLAS cluster requires a
different networking configuration. We would also like to have the flexibility of
running jobs on different hardware. That is why it has been...
Ms
Zhenping Liu
(BROOKHAVEN NATIONAL LABORATORY)
9/6/07, 5:50 PM
Computer facilities, production grids and networking
oral presentation
The BNL ATLAS Computing Facility needs to provide a Grid-based storage system with these
requirements: a total of one gigabyte per second of incoming and outgoing data rate
between BNL and ATLAS T0, T1 and T2 sites, thousands of reconstruction/analysis jobs
accessing locally stored data objects, three petabytes of disk/tape storage in 2007
scaling up to 25 petabytes by 2011, and a...
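For context, a sustained 1 GB/s aggregate rate corresponds to moving roughly 86 TB per day:

```python
# Daily volume implied by the abstract's 1 GB/s aggregate WAN requirement.
rate_gb_s = 1.0
daily_tb = rate_gb_s * 86400 / 1000   # seconds per day, GB -> TB (decimal)
print(f"~{daily_tb:.0f} TB moved per day at a sustained 1 GB/s")
```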