CHEP 06

Europe/Zurich
Tata Institute of Fundamental Research

Homi Bhabha Road Mumbai 400005 India
Description
Computing in High Energy and Nuclear Physics
    • Registration
    • Plenary: Plenary 1 Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India

      Plenary Session

      • 1
        Opening of the Conference
      • 2
        Status of the LHC Machine
        Speaker: Dr Jos Engelen (CERN)
        Slides
    • 10:30
      Tea Break
    • Plenary: Plenary 2 Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India

      Plenary Session

      Convener: Wolfgang von Rueden (CERN)
      • 3
        State of Readiness of LHC Computing Infrastructure
        Speaker: Dr Jamie Shiers (CERN)
      • 4
        State of readiness of LHC Experiment Software
        Speaker: Dr Paris Sphicas (CERN)
        Slides
      • 5
        Low Cost Connectivity Initiative in India
        Speaker: Dr Ashok Jhunjhunwala (IIT, Chennai)
        Slides
    • Poster: Poster 1
      • 6
        A case for application-aware grid services.
        In 2005, the DZero Data Reconstruction project processed 250 terabytes of data on the Grid, using 1,600 CPU-years of computing cycles in 6 months. This large computational task required a high level of refinement of the SAM-Grid system, the integrated data, job, and information management infrastructure of the Run II experiments at Fermilab. The success of the project was in part due to the ability of the SAM-Grid to adapt to the local configuration of the resources and services at the participating sites. A key aspect of such adaptation was coordinating the resource usage in order to optimize for the typical access patterns of the DZero reprocessing application. Examples of such optimizations include database access, data storage access, and worker node allocation and utilization. A popular approach to implementing resource coordination on the grid is developing services that understand application requirements and preferences in terms of abstract quantities, e.g. required CPU cycles or data access pattern characteristics. On the other hand, as of today, it is still difficult to implement real-life resource optimizations at such a level of abstraction. First, this approach assumes maximum knowledge of the resource/service interfaces on the part of the users and the applications. Second, it requires a high level of maturity of the grid interfaces. To overcome these difficulties, the SAM-Grid provides resource optimization by implementing application-aware grid services. For a known application, such services can act in concert to maximize the efficiency of resource usage. This paper describes the optimizations that the SAM-Grid framework had to provide to serve the DZero reconstruction and Monte Carlo production, and shows how application-aware grid services fulfill this task.
        Speaker: Garzoglio Gabriele (FERMI NATIONAL ACCELERATOR LABORATORY)
        Paper
        Poster
      • 7
        A Computational and Data Scheduling Architecture for HEP Applications
        This paper discusses an architectural approach to enhance job scheduling in data-intensive HEP computing applications. First, a brief introduction to the current grid system based on LCG/gLite is given, current bottlenecks are identified and possible extensions to the system are described. We propose an extended scheduling architecture, which adds a scheduling framework on top of existing compute and storage elements. The goal is improved coordination between data management and workload management. This includes more precise planning and prediction of file availability prior to job allocation to compute elements, as well as better integration of local job and data scheduling to improve response times and throughput (see the sketch following this entry). Subsequently, the underlying components are presented; the computing element is designed from standard grid components, while the storage element is based on the dCache software package, which supports a scalable storage and data access solution and is enhanced so that it can interact with scheduling services. For broader acceptance of the scheduling solution in Grid communities beyond High Energy Physics, an outlook is given on how the scheduling framework can be adapted to other application scenarios, such as the climate community. The project is funded by the German Ministry of Education and Science as part of the national e-science initiative D-Grid and is jointly carried out by IRF-IT of the University of Dortmund and DESY.
        Speaker: Mr Lars Schley (University Dortmund, IRF-IT, Germany)
        Paper
        Poster
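        The following is a minimal, illustrative sketch (not part of the contribution) of the data-aware scheduling idea described above: a broker dispatches each job to the compute element whose associated storage element is predicted to make the input files available earliest. All class and method names are hypothetical.

```python
"""Minimal sketch of data-aware job scheduling: a broker only dispatches a job
to the compute element whose local storage element is predicted to make the
job's input files available earliest.  All names here are hypothetical."""


class StorageElement:
    """Toy storage element that can predict when a file becomes readable locally."""

    def __init__(self, staged, stage_latency=300.0):
        self.staged = set(staged)            # files already on disk
        self.stage_latency = stage_latency   # assumed seconds to stage one file from tape

    def predicted_availability(self, lfn, now):
        return now if lfn in self.staged else now + self.stage_latency


class ComputeElement:
    def __init__(self, name, storage):
        self.name = name
        self.storage = storage


def schedule(jobs, compute_elements, now=0.0):
    """Assign each job to the CE whose storage makes all its inputs available earliest."""
    plan = []
    for job_id, input_files in jobs:
        candidates = [
            (max(ce.storage.predicted_availability(f, now) for f in input_files), ce)
            for ce in compute_elements
        ]
        start, best = min(candidates, key=lambda c: c[0])
        plan.append((job_id, best.name, start))
    return plan


if __name__ == "__main__":
    ce_a = ComputeElement("CE-A", StorageElement(staged={"run1.root"}))
    ce_b = ComputeElement("CE-B", StorageElement(staged={"run1.root", "run2.root"}))
    jobs = [("job-1", ["run1.root"]), ("job-2", ["run1.root", "run2.root"])]
    for job_id, ce, start in schedule(jobs, [ce_a, ce_b]):
        print(f"{job_id} -> {ce} (inputs ready at t={start:.0f}s)")
```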
      • 8
        A Diskless Solution for LCG Middleware
        The INFN-GRID project allows us to experiment with and test many different and innovative solutions in the GRID environment. In this research and development activity it is important to find the most useful solutions for simplifying the management of, and access to, the resources. In the VIRGO laboratory in Napoli we have tested a non-standard implementation based on LCG 2.6.0 that uses a diskless solution in order to simplify site administration. We have also used a shared file system, thus simplifying the submission of MPI jobs to the grid. Finally we tested the use of a custom kernel with the OpenMosix patch in order to enable dynamic load balancing.
        Speaker: Dr Silvio Pardi (DIPARTIMENTO DI MATEMATICA ED APPLICAZIONI "R.CACCIOPPOLI")
        Paper
      • 9
        A Gauge Model of Data Acquisition
        Traditionally, in the pre-LHC multi-purpose high-energy experiments the diversification of the physics program has been largely decoupled from the process of data taking: physics groups could only influence the selection criteria of recorded events according to predefined trigger menus. In particular, the physics-oriented choice of subdetector data and the implementation of refined event selection methods have been made in the offline analysis. The departure point of the Gauge Model of Data Acquisition is that such a scheme cannot be continuously extended to the LHC environment without significant sacrifices in the scope and in the quality of the experimental program. The model is constructed in close analogy to the construction of gauge models in particle physics and is based upon dynamic, event-by-event steering of the format and content of the detector raw data, dynamic configuration of the High-Level-Trigger selection algorithms, and physics-goal-oriented slices of processor farms. In this talk I shall present the advantages and drawbacks of such a data-taking architecture for the physics program of the LHC collider.
        Speaker: Mieczyslaw Krasny (LPNHE, University Paris)
      • 10
        A General Jet Tagging Environment for the ATLAS Detector Reconstruction Software
        The design of a general jet tagging algorithm for the ATLAS detector reconstruction software is presented. For many physics analyses, reliable and efficient flavour identification ('tagging') of jets is vital in the process of reconstructing the physics content of the event. To allow for a broad range of identification methods, emphasis is put on the flexibility of the framework. A guiding design principle of the jet tagging software is a strong focus on modularity and defined interfaces, using the advantages of the new ATLAS Event Data Model and object-oriented C++. The benefit for the developer is a modular design that allows the tagging software to be extended with additional and modified algorithms. The user profits from common interfaces to all algorithms and from a simple jet tagging configuration, which allows the use of different methods, re-doing the tagging procedure with a modified setup, and combining the results from various methods during the analysis (see the sketch following this entry). The ATLAS b-tagging algorithms have been migrated into this new jet tagging environment and this concrete implementation is used to demonstrate the benefits of the proposed design.
        Speaker: Mr Andreas Wildauer (UNIVERSITY OF INNSBRUCK)
        Paper
        Poster
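        A minimal illustration (in Python, not the ATLAS C++ Event Data Model) of the modular design described above: every tagger implements one common interface, so algorithms can be exchanged, re-run with a modified setup, and combined. All names are invented for the sketch.

```python
"""Illustrative sketch of a modular jet-tagging design: every tagger implements
the same interface, so taggers can be swapped and their results combined."""

from abc import ABC, abstractmethod


class JetTagTool(ABC):
    """Common interface shared by all tagging algorithms."""

    @abstractmethod
    def weight(self, jet: dict) -> float:
        """Return a flavour-likeness weight for the jet."""


class ImpactParameterTagger(JetTagTool):
    def weight(self, jet):
        # crude stand-in: larger mean track impact parameter -> more b-like
        return sum(jet["track_ip"]) / max(len(jet["track_ip"]), 1)


class SecondaryVertexTagger(JetTagTool):
    def weight(self, jet):
        return jet["sv_mass"] if jet["has_sv"] else 0.0


class CombinedTagger(JetTagTool):
    """Combines the weights of several taggers; the combination rule is configurable."""

    def __init__(self, taggers, combine=sum):
        self.taggers = list(taggers)
        self.combine = combine

    def weight(self, jet):
        return self.combine(t.weight(jet) for t in self.taggers)


if __name__ == "__main__":
    jet = {"track_ip": [0.8, 1.2, 0.5], "has_sv": True, "sv_mass": 1.9}
    tagger = CombinedTagger([ImpactParameterTagger(), SecondaryVertexTagger()])
    print("combined weight:", round(tagger.weight(jet), 3))
```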
      • 11
        A Monitoring Subscription Language in the Framework of Distributed System
        Grid technology is attracting a lot of interest, involving hundreds of researchers and software engineers around the world. The characteristics of the Grid demand the development of suitable monitoring systems able to obtain the significant information needed to make management decisions and control system behaviour. In this paper we analyse a formal, declarative, interpreted language for the description of monitoring events. A user expresses interest in the verification of certain events; for example, a user interested in the occurrence of the composite event e := e1 after e2, with e1 := 'workload for ten minutes > K' and e2 := 'number of active machines < y', subscribes to the event e (see the sketch following this entry). This language, inspired by the Generalised Event Monitoring language (GEM) [1], allows high-level subscriptions to be specified as compositions of atomic subscriptions and integrates the concept of real time. The language allows many temporal constraints to be expressed which would otherwise be very difficult to specify in a distributed system. The goal of our research project consists of three steps: 1) the description of subscriptions through the use of a formal language; 2) the translation of the problem into an XML frame, using XML metalanguage tools; 3) the integration of this new, ad-hoc language into a monitoring service. [1] A Generalised Event Monitoring Language for Distributed Systems, Masoud Mansouri-Samani, Morris Sloman, IEE/IOP/BCS Distributed Systems Engineering Journal, Vol. 4, No. 2, June 1997.
        Speaker: Dr Rosa Palmiero (INFN and University of Naples)
        Paper
        Poster
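        A toy evaluator for the composite subscription 'e1 after e2' quoted in the abstract, assuming illustrative thresholds K and y; GEM-style languages provide many more combinators and real-time constraints than this sketch.

```python
"""Tiny sketch of evaluating the composite subscription 'e1 after e2' over a
stream of timestamped monitoring samples.  Predicates and thresholds are
illustrative only."""


def make_after(e1, e2):
    """Return a detector that fires when e1 holds at some time after e2 has held."""
    state = {"e2_seen_at": None}

    def detector(sample):
        if e2(sample):
            state["e2_seen_at"] = sample["t"]
        return (state["e2_seen_at"] is not None
                and sample["t"] > state["e2_seen_at"]
                and e1(sample))

    return detector


# atomic subscriptions (illustrative thresholds K and y)
K, Y = 0.8, 5
e1 = lambda s: s["workload_10min"] > K       # 'workload for ten minutes > K'
e2 = lambda s: s["active_machines"] < Y      # 'number of active machines < y'

composite = make_after(e1, e2)

# toy monitoring stream: e2 holds at t=2, e1 holds later at t=3 -> fire at t=3
stream = [
    {"t": 1, "workload_10min": 0.9, "active_machines": 7},
    {"t": 2, "workload_10min": 0.4, "active_machines": 3},
    {"t": 3, "workload_10min": 0.95, "active_machines": 6},
]

for sample in stream:
    if composite(sample):
        print(f"composite event 'e1 after e2' detected at t={sample['t']}")
```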
      • 12
        Advances in Fabric Management by the CERN-BARC collaboration
        The collaboration between BARC and CERN is driving a series of enhancements to ELFms [1], the fabric management tool-suite developed with support from the HEP community under CERN's coordination. ELFms components are used in production at CERN and a large number of other HEP sites for automatically installing, configuring and monitoring hundreds of clusters comprising thousands of nodes. Developers at BARC and CERN are working together to improve security, functionality and scalability in the light of feedback from site administrators. In a distributed Grid computing environment with thousands of users accessing thousands of nodes, reliable status and exception information is critical at each site and across the grid. It is therefore important to ensure the integrity, authenticity and privacy of information collected by the fabric monitoring system. A new layer has been added to Lemon, the ELFms monitoring system, to enable the secure transport of monitoring data between monitoring agents and servers by using a modular plug-in architecture that supports RSA/DSA keys and X509 certificates. In addition, the flexibility and robustness of Lemon has been further enhanced by the introduction of a modular configuration structure, the integration of exceptions with the alarm system and the development of fault-tolerant components that enable automatic recovery from exceptions. To address operational scalability issues, CCTracker, a web-based visualization tool, is being developed. It provides both physical and logical views of a large Computer Centre and enables authorized users to locate objects and perform high-level operations across sets of objects. Operations staff will be able to view and plan elements of the physical infrastructure and initiate hardware management workflows such as mass machine migrations or installations. Service Managers will be able to easily manipulate clusters or sets of nodes, modifying settings, rolling out software updates and initiating high-level state changes. [1] http://cern.ch/elfms
        Speaker: Mr William Tomlin (CERN)
        Paper
        Poster
      • 13
        An Identity Server Portal for Global Accelerator and Detector Networks
        The next generations of large colliders and their experiments will have the advantage that groups from all over the world will participate with their competence to meet the challenges of the future. It is therefore necessary to become even more global than in the past, giving members the option of remote access to most of the control components of these facilities. Experience has shown that a number of problems result from the existing variety of computer systems and their graphical user interfaces, which are incompatible with other systems, and from the possible options to reach them from outside the experimental area. A group at Trieste and DESY is working inside the GANMVL (Global Accelerator Network Multipurpose Virtual Laboratory) project to solve this problem, finding a simple way for the user to have remote access with single sign-on personalisation and admission to several systems. We identify problems arising in the implementation of user-friendly interfaces, in achieving a look and feel close to the real hardware, and in handling software. In the future it should be possible to have access simply via any internet browser, without any knowledge of the computer operating systems inside the large facilities. Only one login procedure should be necessary to have access to every integrated system. The current project status is outlined.
        Speaker: Mr Sven Karstensen (DESY Hamburg)
        Paper
        Poster
      • 14
        An XML-based configuration and management system for the gLite middleware
        gLite is the next generation middleware for grid computing. Born from the collaborative efforts of more than 80 people in 12 different academic and industrial research centers as part of the EGEE Project, gLite provides a bleeding-edge, best-of-breed framework for building grid applications tapping into the power of distributed computing and storage resources across the Internet. Currently, gLite is composed of more than 25 different services, implemented in different languages, using different technologies and all coming with individual configuration needs. In addition, gLite can be run in multiple operational scenarios and presently supports hundreds of configuration options. Past experience has shown that configuration and management are among the biggest challenges of such a system. In order to ease configuration and deployment of such a complex system, we have developed a configuration model that offers the users and administrators of gLite a homogeneous, easy to use, extensible and flexible system across all services, yet fitting the needs of the different services and providing the necessary security. It includes an XML-encoded common configuration storage format across all services, with pre-configured configuration templates guiding the user in the proper selection of the configuration scenarios (see the sketch following this entry). This includes validation of the stored configuration information and tools to manipulate the information and automatically produce documentation, or transform the information so that it can be used by high-level system management tools. The services can obtain the configuration information either from local storage or from a central configuration service using standard grid security based on certificates and VO memberships and roles. After a discussion of the environment and challenges, the paper presents a detailed description of the developed solutions together with a discussion of the problems and future developments.
        Speaker: Dr Joachim Flammer (CERN)
        Poster
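        A minimal sketch of the template-driven XML configuration idea described above, using only the Python standard library; the element and parameter names are invented and do not reflect the actual gLite configuration schema or tools.

```python
"""Minimal sketch of an XML-encoded, template-driven service configuration:
a pre-filled template is loaded, selectively overridden and validated before
being handed to a service.  Names are invented for illustration."""

import xml.etree.ElementTree as ET

TEMPLATE = """
<service name="workload-manager">
  <param name="log.level">INFO</param>
  <param name="queue.max-jobs">5000</param>
  <param name="security.vo"/>
</service>
"""

REQUIRED = {"log.level", "queue.max-jobs", "security.vo"}


def load_config(template_xml, overrides):
    root = ET.fromstring(template_xml)
    params = {p.get("name"): (p.text or "").strip() for p in root.findall("param")}
    params.update(overrides)

    # validation: every required parameter must be present and non-empty
    missing = [k for k in REQUIRED if not params.get(k)]
    if missing:
        raise ValueError(f"missing/empty configuration parameters: {missing}")
    return params


if __name__ == "__main__":
    cfg = load_config(TEMPLATE, {"security.vo": "atlas", "queue.max-jobs": "8000"})
    print(cfg)
```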
      • 15
        Application of Maximum Likelihood Method for the computation of Spin Density Matrix Elements of Vector Meson Production at HERMES.
        The HERMES experiment at DESY has performed extensive measurements of diffractive production of light vector mesons (rho^0, omega, phi) in the intermediate energy region. Spin density matrix elements (SDMEs) were determined for exclusive diffractive rho^0 and phi mesons and compared with results of high energy experiments. Several methods for the extraction of SDMEs have been applied to the same data sample, and a comparison between those methods is given. The Maximum Likelihood method was chosen for the final computation of two sets of 23 SDMEs on hydrogen and deuterium HERMES data (a toy illustration of the method follows this entry). These results yield more insight into the vector meson production mechanisms at HERMES kinematics and into the helicity transfer in diffractive vector meson production, giving new results on s-channel helicity violation and an indication of the contribution of unnatural-parity-exchange amplitudes.
        Speaker: Dr Alexander Borissov (University of Glasgow, Scotland, UK)
        Paper
        Slides
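        A toy illustration of the unbinned maximum-likelihood method, reduced to a single matrix element r04_00 in the one-dimensional decay angular distribution W(cos theta) = 3/4 [(1 - r) + (3r - 1) cos^2 theta]; the actual HERMES analysis fits 23 SDMEs to a multi-dimensional distribution.

```python
"""Toy unbinned maximum-likelihood fit of one spin density matrix element,
r04_00, from the 1-D decay angular distribution.  Purely illustrative."""

import numpy as np
from scipy.optimize import minimize_scalar


def pdf(cos_theta, r):
    # W(cos theta) = 3/4 [(1 - r) + (3r - 1) cos^2 theta], normalised on [-1, 1]
    return 0.75 * ((1.0 - r) + (3.0 * r - 1.0) * cos_theta**2)


def sample(r_true, n, rng):
    """Accept-reject generation of cos(theta) values."""
    out = []
    w_max = pdf(np.array([0.0, 1.0]), r_true).max()
    while len(out) < n:
        c = rng.uniform(-1.0, 1.0, size=2 * n)
        u = rng.uniform(0.0, w_max, size=2 * n)
        out.extend(c[u < pdf(c, r_true)].tolist())
    return np.array(out[:n])


def fit(data):
    """Minimise the negative log-likelihood over r."""
    nll = lambda r: -np.sum(np.log(pdf(data, r)))
    res = minimize_scalar(nll, bounds=(1e-3, 1.0 - 1e-3), method="bounded")
    return res.x


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = sample(r_true=0.4, n=20000, rng=rng)
    print(f"fitted r04_00 = {fit(data):.3f} (true 0.4)")
```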
      • 16
        ARGO-YBJ experimental data transfer and processing: experience with the Computer Farm and evolution of the computing model to a GRID approach.
        The data taking of the ARGO-YBJ experiment in Tibet is operational with 54 RPC clusters installed and is moving rapidly to a configuration of more than 100 clusters. The paper describes the processing of the experimental data of this phase, based on a local computer farm. The software developed for data management, job submission and information retrieval is described together with the performance aspects. The evolution of the ARGO computing model, using the possibility to transfer the experimental data via the network and to integrate it into the GRID environment with the definition of an ARGO Virtual Organization and world-wide resources, is also presented.
        Speaker: Dr Cristian Stanescu (Istituto Nazionale Fisica Nucleare - Sezione Roma III)
      • 17
        BaBar SPGrid - Putting BaBar's Simulation Production on The Grid
        On behalf of the BaBar Computing Group, we describe enhancements to the BaBar experiment's distributed Monte Carlo generation system to make use of European and North American GRID resources, and present the results with regard to BaBar's latest cycle of Monte Carlo production. We compare job success rates and manageability issues between GRID and non-GRID production and present an investigation into the efficiency costs of different methods of making input data, in the form of files and database information, available to the job in a distributed environment.
        Speakers: Dr Alessandra Forti (Univ.of Milano Faculty of Art), Dr Chris Brew (CCLRC - RAL)
        Paper
        Poster
      • 18
        Building a Federated Tier2 Center to support Physics Analysis and MC Production for multiple LHC Experiments
        The IT Group at DESY is involved in a variety of projects ranging from analysis of high energy physics data at the HERA collider and synchrotron radiation facilities to cutting-edge computer science experiments focused on grid computing. In support of these activities, members of the IT group have developed and deployed a local computational facility which comprises many service nodes, computational clusters and large-scale disk and tape storage services. The resources contribute collectively or individually to a variety of production and development activities, such as the main analysis center for the presently running HERA experiments, a German Tier2 center for the ATLAS and CMS experiments at the Large Hadron Collider (LHC) and research on grid computing for the EGEE and D-Grid projects. Installing and operating the great variety of services required to run a sizable Tier2 center as a federated facility is a major challenge. The anticipated computing and storage capacity for the start of LHC data taking is O(2000) kSI2K and O(700) TB of disk. Given local constraints and particular expertise at the sites, the DESY IT Group and the Physics Group at RWTH Aachen, whose facilities are at two distinct locations about 500 km apart, are in the process of building such a center for CMS. The anticipated conceptual design is based on a network-centric architecture allowing the installation and operation of selected services where they best fit the boundary conditions that exist at either institution. While the group at RWTH focuses on LHC physics, interfacing to data processing at a fairly high level, a considerable amount of expertise on processing of petabyte-scale physics data in a highly distributed environment exists at DESY. In this paper we describe the architecture, the distribution of the services, the anticipated operational model and finally the advantages and disadvantages of using such a scheme to manage a large-scale federated facility.
        Speaker: Dr Michael Ernst (DESY)
      • 19
        CLAS Experimental Control System
        A software-agent-based control system has been implemented to control experiments running on the CLAS detector at Jefferson Lab. Within the CLAS experiments, the DAQ, trigger, detector and beam line control systems are both logically and physically separated, and are implemented independently using a common software infrastructure. The CLAS experimental control system (ECS) was designed, using an earlier-developed FIPA agent-based control framework, to glue the data production subsystems into a uniform control environment. The object models of the experiments are described using the Control Oriented Ontology Language (COOL) developed for this purpose, allowing detailed specification of the experimental objects such as their states, actions and associated conditions. Experiment objects are assigned Java agents, grouped into control domains, operating in an open, distributed, multi-agent environment. Every agent domain, representing an experiment subsystem, can operate independently, yet can summarize information for the higher-level domains or expand actions to the lower levels.
        Speaker: Vardan Gyurjyan (JEFFERSON LAB)
      • 20
        CMD-3 Detector Computing Environment Overview
        CMD-3 is the general-purpose cryogenic magnetic detector for the VEPP-2000 electron-positron collider, which is being commissioned at the Budker Institute of Nuclear Physics (BINP, Novosibirsk, Russia). The main aspects of the physics program of the experiment are the study of known and the search for new vector mesons, the study of the ppbar and nnbar production cross sections in the vicinity of the threshold, and the search for exotic hadrons in the region of center-of-mass energy below 2 GeV. An essential upgrade of the CMD-2 detector (designed for the VEPP-2M collider at BINP) farm and of the distributed data storage management software is required to satisfy the needs of the new detector and is scheduled for the near future. The contribution gives a general overview of the computing environment to be used for RAW data staging and processing, Monte Carlo generation and handling of the various user analysis jobs. It includes a description of the CMD-3 Offline Farm with its dedicated Quattor-package-based deployment facilities, the high-level detector-specific job submission interface on top of the TORQUE batch system and the adaptive Journaling Virtual File System (JVFS) dealing with the distributed data storage shared among the farm nodes. JVFS functionality involves sophisticated replica management mechanisms and virtual file system optimization services for a single dedicated batch processing cluster. Though the listed products were initially proposed to be used within the CMD-3 project only, they can be easily adapted to the computing environment of any small or medium scale HEP experiment.
        Speaker: Mr Alexei Sibidanov (Budker Institute of Nuclear Physics)
        Poster
      • 21
        CMD-3 Detector Offline Software Development
        CMD-3 is the general-purpose cryogenic magnetic detector for the VEPP-2000 electron-positron collider, which is being commissioned at the Budker Institute of Nuclear Physics (BINP, Novosibirsk, Russia). The main aspects of the physics program of the experiment are the study of known and the search for new vector mesons, the study of the ppbar and nnbar production cross sections in the vicinity of the threshold, and the search for exotic hadrons in the region of center-of-mass energy below 2 GeV. An essential upgrade of the CMD-2 detector (designed for the VEPP-2M collider at BINP) farm and of the distributed data storage management software is required to satisfy the needs of the new detector and is scheduled for the near future. In this talk I will present the general design overview and status of implementation of the CMD-3 offline software for reconstruction, simulation and visualization. Software design standards for this project are object-oriented programming techniques, C++ as the main language, Geant4 as the only simulation tool, Geant4-based detector geometry description, WIRED- and HepRep-based visualization, CLHEP-library-based primary generators and Linux as the main platform. The dedicated software development framework (Cmd3Fwk) was implemented in order to be the basic software integration solution and the persistency manager. The key features of the framework are modularity, dynamic data processing chain generation according to the XML modules configuration and on-demand data request mechanisms.
        Speaker: Mr Alexander Zaytsev (Budker Institute of Nuclear Physics (BINP))
        Paper
        Poster
      • 22
        Cmd3Fwk Software Development Framework
        CMD-3 is the general-purpose cryogenic magnetic detector for the VEPP-2000 electron-positron collider, which is being commissioned at the Budker Institute of Nuclear Physics (BINP, Novosibirsk, Russia). The main aspects of the physics program of the experiment are the study of known and the search for new vector mesons, the study of the ppbar and nnbar production cross sections in the vicinity of the threshold, and the search for exotic hadrons in the region of center-of-mass energy below 2 GeV. The dedicated CMD-3 Software Development Framework (Cmd3Fwk) was implemented in order to be the basic software integration solution and the persistency manager for the detector reconstruction, MC simulation and the third-level trigger subsystem. Software design standards for the project are object-oriented programming techniques, C++ as the main language, GRID environment compatibility and Linux as the main platform. Recently, the core components of the Cmd3Fwk were moved to a separate, detector-independent software package (MetaFramework) with the aim of enabling its usage outside the scope of the CMD-3 project. The key features of the MetaFramework are modularity, dynamic data processing chain generation according to the XML modules configuration and on-demand data request mechanisms (see the sketch following this entry). It also provides command line and graphical user interfaces for building XML configurations and running the data processing jobs. The MetaFramework is a powerful tool which can be used to develop specialized adaptive data processing tools for various applications, for instance for building data processing frameworks specific to small and medium scale HEP experiments. The contribution gives an overview of the design features of both the Cmd3Fwk and the MetaFramework projects.
        Speaker: Mr Sergey Pirogov (Budker Institute of Nuclear Physics)
        Paper
        Poster
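        A minimal sketch, not the actual Cmd3Fwk/MetaFramework code, of building a data processing chain dynamically from an XML module configuration; module names and parameters are invented.

```python
"""Minimal sketch of dynamic processing-chain construction from an XML module
configuration, in the spirit described for Cmd3Fwk/MetaFramework."""

import xml.etree.ElementTree as ET

CONFIG = """
<chain>
  <module type="Calibrate" gain="1.02"/>
  <module type="Threshold" cut="5.0"/>
</chain>
"""


class Calibrate:
    def __init__(self, gain="1.0"):
        self.gain = float(gain)

    def process(self, event):
        event["hits"] = [h * self.gain for h in event["hits"]]
        return event


class Threshold:
    def __init__(self, cut="0.0"):
        self.cut = float(cut)

    def process(self, event):
        event["hits"] = [h for h in event["hits"] if h > self.cut]
        return event


REGISTRY = {"Calibrate": Calibrate, "Threshold": Threshold}


def build_chain(xml_text):
    """Instantiate modules in the order and with the attributes given in the XML."""
    chain = []
    for node in ET.fromstring(xml_text).findall("module"):
        attrs = {k: v for k, v in node.attrib.items() if k != "type"}
        chain.append(REGISTRY[node.get("type")](**attrs))
    return chain


if __name__ == "__main__":
    chain = build_chain(CONFIG)
    event = {"hits": [3.0, 6.0, 9.0]}
    for module in chain:
        event = module.process(event)
    print(event)   # hits scaled by 1.02, then those below 5.0 removed
```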
      • 23
        cMsg - A General Framework for Deploying Publish/Subscribe Interprocess Communication Systems
        cMsg is a highly extensible open-source framework within which one can deploy multiple underlying interprocess communication systems. It is powerful enough to support asynchronous publish/subscribe communications as well as synchronous peer-to-peer communications. It further includes a proxy system whereby client requests are transported to a remote server that actually connects to the underlying communication package, thus allowing e.g. vxWorks clients to interact with a communication system that does not support vxWorks directly. Finally, cMsg includes a complete, full-featured publish/subscribe communication package. cMsg runs on Linux, various flavors of Unix, and vxWorks, and includes C, C++, and Java APIs.
        Speaker: Mr Elliott Wolin (Jefferson Lab)
      • 24
        Computation of Nearly Exact 3D Electrostatic Field in Multiwire Chambers
        The three-dimensional electrostatic field configuration in a multiwire proportional chamber (MWPC) has been simulated using an efficient boundary element method (BEM) solver set up to solve an integral equation of the first kind. To compute the charge densities over the bounding surfaces representing the system for known potentials, the nearly exact formulation of the BEM has been implemented such that the discretisation of the integral equation leads to a set of linear algebraic equations. Since the solver uses the exact analytic integral of the Green function [1,2] to compute the electrostatic potential for a general charge distribution satisfying Poisson's equation, extremely precise results have been obtained despite the use of relatively coarse discretisation. The surfaces of anode wires and cathode planes in the MWPC have been segmented into small cylindrical and rectangular elements, carrying uniform unknown surface charges distributed over the elements. The capacity coefficient matrix for such a charge distribution has been set up using the exact expressions of the new formulation. Finally, the surface charge densities have been computed by satisfying the boundary conditions, i.e., the potentials at the centroids of the elements known from the given potential configuration (a much simplified sketch of this collocation-and-solve structure follows this entry). We have used a lower-upper (LU) decomposition routine incorporating Crout's method of partial pivoting to solve the set of algebraic equations. From the computed surface charge densities, the potential or electric field at any point in the computational domain can be obtained by superposition of the contributions of the charge densities on the boundary elements. Using the solver, we have performed a detailed study of the three-dimensional field configuration throughout the volume of the device. The solutions have been validated by successfully comparing the computed field with analytical results available for two-dimensional MWPCs. Significant deviations from this ideal mid-plane field have been observed towards the edges of the detector. We have also studied the influence of the edge configuration of the detector on these deviations. Utilizing the high precision and three-dimensional capability of this solver, a study has been carried out on the nature of the electrostatic forces acting on the anode wires and their variation with changes in the wire position. Significant positional variations have been observed which can have an impact on the future design and construction of MWPCs. References: [1] N. Majumdar, S. Mukhopadhyay, Computation of Electrostatic Field near Three-Dimensional Corners and Edges, accepted for presentation at the International Conference on Computational and Experimental Engineering and Sciences (ICCES05), Indian Institute of Technology, Madras (Chennai), Dec 01-06, 2005. [2] S. Mukhopadhyay, N. Majumdar, Development of a BEM Solver using Exact Expressions for Computing the Influence of Singularity Distributions, accepted for presentation at ICCES05.
        Speaker: Dr Nayana Majumdar (Saha Institute of Nuclear Physics)
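        A much simplified two-dimensional stand-in for the collocation-and-solve structure described above (the real solver uses exact analytic integrals over three-dimensional elements): an influence matrix is built for boundary elements on a conducting circle, the prescribed potential is imposed at the element centroids, and the resulting system is solved by LU decomposition. The interior potential of the closed conductor provides a self-check.

```python
"""Simplified 2-D boundary-element collocation sketch: elements on a conducting
circle, unit potential imposed at centroids, linear system solved by LU."""

import numpy as np
from scipy.linalg import lu_factor, lu_solve

# Units chosen so that the 2-D Green function is G(r) = -ln(r).
N = 200          # number of boundary elements on the circle
a = 0.5          # conductor radius (a = 1 is degenerate for the 2-D log kernel)
V = 1.0          # prescribed surface potential

theta = (np.arange(N) + 0.5) * 2 * np.pi / N
centroids = a * np.column_stack([np.cos(theta), np.sin(theta)])
L = 2 * a * np.sin(np.pi / N)            # chord length of each element

# Influence matrix: potential at centroid i per unit charge density on element j.
r = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
np.fill_diagonal(r, 1.0)                 # placeholder, diagonal overwritten below
A = -L * np.log(r)                       # distant elements treated as line charges at their centroid
np.fill_diagonal(A, -L * (np.log(L / 2) - 1.0))   # analytic self-term of a straight segment

# Impose phi = V at every collocation point and solve by LU decomposition.
sigma = lu_solve(lu_factor(A), np.full(N, V))


def potential(point):
    dist = np.linalg.norm(centroids - point, axis=1)
    return np.sum(-sigma * L * np.log(dist))


print("phi at centre    :", round(potential(np.array([0.0, 0.0])), 4))   # should be close to 1.0
print("phi at (0.2, 0.1):", round(potential(np.array([0.2, 0.1])), 4))   # constant inside the conductor
```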
      • 25
        Conditions database and calibration software framework for ATLAS Monitored Drift Tube chambers
        The size and complexity of LHC experiments raise unprecedented challenges not only in terms of detector design, construction and operation, but also in terms of software models and data persistency. One of the more challenging tasks is the calibration of the 375000 Monitored Drift Tubes, that will be used as precision tracking detectors in the Muon Spectrometer of the ATLAS experiment. An accurate knowledge of the space-time relation is needed to reach the design average resolution of 80 microns. The MDT calibration software has been designed to extract the space-time relation from the data themselves, through the so-called auto-calibration procedure, to store and retrieve the relevant information from the conditions database, and to properly apply it to calibrate the hits to be used by the reconstruction algorithms, taking into account corrections for known effects like temperature and magnetic field. We review the design of the MDT calibration software for ATLAS and present performance results obtained with detailed GEANT4-based simulation and real data from the recent combined test beam. We discuss the implementation of the conditions database for MDT calibration data in the framework proposed by the LHC Computing Grid (LCG). Finally, we present early results from detector commissioning with cosmic ray events and plans for the ATLAS Computing System Commissioning test in 2006.
        Speaker: Dr Monica Verducci (European Organization for Nuclear Research (CERN))
        Paper
      • 26
        Connecting WLCG Tier-2 centers to GridKa
        GridKa, the German Tier-1 center in the Worldwide LHC Computing Grid (WLCG), supports all four LHC experiments, ALICE, ATLAS, CMS and LHCb, as well as, currently, some non-LHC high energy physics experiments. Several German and European Tier-2 sites will be connected to GridKa as their Tier-1. We present technical and organizational aspects pertaining to the connection and support of the Tier-2 sites. The experiments' computing models and use cases are analyzed in order to design and dimension the storage systems and network infrastructure at GridKa in such a way that the expected data streams can be written and delivered at all times during Service Challenge 4 and the LHC service. The current and planned layout of these systems is described. First results of data transfer tests between GridKa and the Tier-2 sites are shown.
        Speaker: Dr Andreas Heiss (FORSCHUNGSZENTRUM KARLSRUHE)
        Paper
        Paper sources
        Poster
      • 27
        CREAM: A simple, Grid-accessible, Job Management System for local Computational Resources
        An efficient and robust system for accessing computational resources and managing job operations is a key component of any Grid framework designed to support a large distributed computing environment. CREAM (Computing Resource Execution And Management) is a simple, minimal system designed to provide efficient processing of a large number of requests for computation on managed resources. Requests are accepted from distributed clients via a web-service-based interface. The CREAM architecture is designed to be a robust, scalable and fault-tolerant service of a Grid middleware. In this paper we describe the CREAM architecture and the provided functionality. We also discuss how CREAM is integrated within the EGEE gLite middleware in general, and with the gLite Workload Management System in particular.
        Speaker: Moreno Marzolla (INFN Padova)
        Paper
        Poster
      • 28
        Distributed Analysis Experiences within D-GRID
        The German LHC computing resources are built on the Tier-1 center GridKa in Karlsruhe and several planned Tier-2 centers. These facilities provide us with a testbed on which we can evaluate current distributed analysis tools. Various aspects of the analysis of simulated data using LCG middleware and local batch systems have been tested and evaluated. Here we present our experiences with the deployment, maintenance and operation of the tools.
        Speaker: Dr Johannes Elmsheuser (Ludwig-Maximilians-Universitat München)
        Paper
        Poster
      • 29
        DOSAR: A Distributed Organization for Scientific and Academic Research
        Hadron collider experiments in progress at Fermilab's Tevatron and under construction at the Large Hadron Collider (LHC) at CERN will record many petabytes of data in pursuing the goals of understanding nature and searching for the origin of mass. Computing resources required to analyze these data far exceed the capabilities of any one institution. The computing grid has long been recognized as a solution to this and other problems. The success of the grid solution will crucially depend on having high-speed network connections, the ability to use general-purpose computer facilities, and the existence of robust software tools. A consortium of universities in the US, Brazil, Mexico and India is developing a fully realized grid that will test this technology. These institutions are members of the DØ experiment at the Tevatron and the ATLAS or CMS experiments at the LHC, and form the Distributed Organization for Scientific and Academic Research (DOSAR). DOSAR is a federated grid organization encompassing numerous institutional grids. While founded for HEP research, DOSAR forms the nucleus of grid infrastructure organization on the constituent campuses. DOSAR's strategy is to promote multi-disciplinary use of grids on campus and among the institutions involved in the consortium. DOSAR enables researchers and educators at the federated institutions to access grid resources outside the HEP context and is a catalyst in establishing state-wide grid structures. DOSAR is an operational grid which is a Virtual Organization (VO) in the Open Science Grid (OSG). In this talk, we will describe the architecture of the DOSAR VO, the use and functionality of the grid, and the experience of operating the grid for simulation, reprocessing and analysis of data from the DØ experiment. A software system for large-scale grid processing will be described. Our experience with high-speed intercontinental network connections will also be discussed.
        Speaker: Prof. Patrick Skubic (University of Oklahoma)
        Paper
      • 30
        Evaluation of Virtual Machines for HEP Grids
        The heterogeneity of resources in computational grids, such as the Canadian GridX1, makes application deployment a difficult task. Virtual machine environments promise to simplify this task by homogenizing the execution environment across the grid. One such environment, Xen, has been demonstrated to be a highly performing virtual machine monitor. In this work, we evaluate the applicability of Xen to scientific computational grids. We verify the functionality and performance of Xen, focusing on the execution of software relevant to the LHC community. A variety of production deployment strategies are developed and tested. In particular, we compare the execution of job-specific and generic VM images on grids of conventional Linux clusters as well as virtual clusters.
        Speaker: Dr Ashok Agarwal (Univeristy of Victoria)
        Paper
      • 31
        Experiences with operating SamGrid at the GermanGrid centre "GridKa" for CDF
        The German Grid computing centre "GridKa" offers large computing and storage facilities to the Tevatron and LHC experiments, as well as to BaBar and Compass. It has been the first large-scale CDF cluster to adopt and use the FermiGrid software "SAM" to enable users to perform data-intensive analyses. The system has been operated at production level for about 2 years. We review the challenges and gains of a cluster shared by many experiments and the operation of the SamGrid software in this context. Special focus will be given to the integration of the university-based cluster (EKPPlus) at the University of Karlsruhe, as well as the needs and use cases of users who wish to integrate the software into their existing analyses.
        Speakers: Dr Thomas Kuhr (UNIVERSITY OF KARLSRUHE, GERMANY), Mr Ulrich Kerzel (UNIVERSITY OF KARLSRUHE, GERMANY)
        Paper
        Poster
        Proceedings-TGZ
        Proceedings
      • 32
        Feasibility of Data Acquisition Middleware based on Robot technology
        Information Technology (IT) evolves quickly, yet it is not easy to adopt software from IT into data acquisition (DAQ), because such software often depends on operating systems, languages and communication protocols. These dependencies make it inconvenient to construct data acquisition software, so experimental groups tend to build their own DAQ software according to their own requirements. The robot technology field, which needs real-time and embedded systems, has faced the same problems and has developed the Robot Technology Middleware (RTM), because each group developed its own software and there was no common framework for building robots from a software point of view. Robot technology has functionality similar to data acquisition for physics experiments. We have studied data acquisition based on robot technology and discuss the resulting data acquisition middleware.
        Speaker: Dr Yoshiji Yasu (KEK)
        Paper
        Poster
      • 33
        First step in C# and mono
        This short communication presents our first experience with C# and Mono within an OpenScientist context, mainly an attempt to integrate Inventor within a C# context and then within the native GUI API that comes with C#. We also want to point out the perspectives, for example within AIDA.
        Speaker: Mr Laurent GARNIER (LAL-IN2P3-CNRS)
      • 34
        Geant4 Muon Digitization in the ATHENA Framework
        The Muon Digitization is the simulation of the Raw Data Objects (RDO), or the electronic output, of the Muon Spectrometer. It has recently been completely re-written to run within the Athena framework and to interface with the Geant4 Muon Spectrometer detector simulation. The digitization process consists of two steps: in the first step, the output of the detector simulation, henceforth referred to as Muon Hits, is converted to muon digits, i.e., intermediate objects that can be fed into the reconstruction. In the second step, the muon digits are converted into RDO, the transient representation of the raw data byte stream. We will describe the detailed implementation of the first step of the muon digitization, where the detector simulation output is "digitized" into muon digits. We will describe the fundamentals of the Muon Digitization algorithms, outlining the global structure of the Muon Digitization, with some emphasis on the simulation of piled-up events. We will also describe the details of the digitization validation against the Monte Carlo information.
        Speaker: Daniela Rebuzzi (Istituto Nazionale de Fisica Nucleare (INFN))
      • 35
        Grid Deployment Experiences: The Evolution of the LCG information system.
        Since CHEP2005, the LHC Computing Grid (LCG) has grown from 30 sites to over 160 sites, and this has increased the load on the information system. This paper describes the recent changes to the information system that were necessary to keep pace with the expanding grid. The performance of a key component, the Berkeley Database Information Index (BDII), is given special attention. During deployment it was found that the idea of Virtual Organization (VO) BDIIs was not sufficient; we describe how this led to the development of the Freedom of Choice for Resources (FCR) mechanism, which enables VOs to have more control over their production. Other improvements are also mentioned, including the work to upgrade the Glue schema to version 1.2 and the plan for Glue version 2.
        Speaker: Mr Laurence Field (CERN)
        Paper
        Poster
      • 36
        Grid Deployment Experiences: The evaluation and initial deployment of R-GMA for production quality monitoring.
        This paper describes the introduction of the Relational Grid Monitoring Architecture (R-GMA) into the LHC Computing Grid (LCG) as a production-quality monitoring system and how, after an initial period of production hardening, it performed during the LCG Service Challenges. The results from the initial evaluation and performance tests are presented, as well as the process of integrating R-GMA into the Site Functional Tests (SFT). The first real end-to-end application using R-GMA for monitoring file transfers is described in detail, along with how this was used for the LHC Service Challenge. The job monitoring application, which handles approximately 24K state messages per day, is also described along with the initial feedback from the users. Metrics were used to record the performance of R-GMA in these applications. These metrics are presented along with a detailed analysis. The paper finally summarizes the experiences from this period and suggests some directions for the future.
        Speaker: Mr Laurence Field (CERN)
        Paper
        Poster
      • 37
        Integration of an AFS-based Sun Grid Engine site in a LCG grid
        The LHC Computing Grid (LCG) middleware interfaces at each site with local computing resources provided by a batch system. However, currently only the PBS/Torque, LSF and Condor resource management systems are supported out of the box in the middleware distribution. Therefore many computing centers serving scientific needs other than HEP, which in many cases use other batch systems like Sun's Grid Engine (SGE), are not integrated into the Grid. Binding a site running on the SGE batch system is possible thanks to the London e-Science Centre's Globus JobManager and Information Reporter components. However, when using AFS instead of plain NFS as the shared filesystem, some other issues arise. In collaboration with the Max Planck Institute for Physics (Munich), and as part of Forschungszentrum Karlsruhe's involvement in the MAGIC Grid project, we set up an LCG interface to Munich's Sun Grid Engine batch system. This SGE-based cluster is currently at a remote location (Garching's Computing Center) and secure job submission is achieved by relying on the AFS filesystem. This allows an established "non-grid" computing center to offer its resources via the Grid without any changes to its running infrastructure.
        Speaker: Dr Ariel Garcia (Forschungszentrum Karlsruhe, Karlsruhe, Germany)
        Poster
      • 38
        Integration of graphviz within OnX
        This short communication describes work done at LAL on integrating the graphviz library within the OnX environment. graphviz is a well-known library good at visualizing a scene containing boxes connected by lines; the strength of this library lies in the routing algorithms used to connect the boxes. For example, graphviz is used by Doxygen to produce class diagrams. We present an attempt at integrating the graphviz graphics directly in our context (OnX and OpenScientist). By "direct" we mean not going through intermediate image files but using the native library directly.
        Speaker: Mr Laurent GARNIER (LAL-IN2P3-CNRS)
        Paper
        Poster
      • 39
        Interface between data handling system (SAM) and CDF experiment software
        CDF has recently changed its data handling system from the DFC (Data File Catalogue) system to the SAM (Sequential Access to Metadata) system. This change was made as a preparation for distributed computing, because SAM can handle distributed computing and provides mechanisms which enable it to work together with GRID systems. Experience shows that the usage of a new data handling system increases rapidly if it incorporates as many use cases from the old system as possible and has an interface which is similar to the old system. This is also true for the transition from the DFC system to the SAM system. The CDF Analysis Input and Output Modules are the preferred interfaces to access input data and direct output data in a standard analysis job. The Input Module invokes the data handling system in order to manage the data delivery to the job and to handle exceptions if problems with the data handling are experienced. The change of the data handling system has been done in such a way that it is nearly transparent to the user but still gives the user the choice of which system to use. During the transition to the SAM data handling system, methods have been tested to give reliable access to the data in case the underlying storage mechanism (dCache) fails. Another big issue for the interface is the recovery of jobs which have failed. At FNAL most of the analysis jobs run at the CAF (Central Analysis Farm). The CAF is basically a batch system, but has been extended, for example with authentication features. Therefore it is possible to distribute the computing for CDF with so-called DCAF systems (Decentralized CAF systems). The CAF system provides interfaces to SAM at the beginning and the end of the job, so it can be used to automatically recover jobs which failed. The Input Module is able to provide the necessary interfaces to the CAF to make this recovery possible. Implementing this mechanism is a good step in the direction of reliable jobs on the GRID.
        Speaker: Valeria Bartsch (FERMILAB / University College London)
        Paper
      • 40
        L-STORE: A FLEXIBLE LOGISTICAL STORAGE FRAMEWORK FOR DISTRIBUTED, SCALABLE AND SECURE ACCESS TO LARGE-SCALE DATA
        Storing and accessing large volumes of data across geographically separated locations, cutting across labs and universities, in a transparent, reliable fashion is a difficult problem. There is urgency to this problem with the commissioning of the LHC around the corner (2007). The primary difficulties that need to be overcome in order to address this problem are policy-driven secure access, mirroring and striping of data for reliable storage, scalability, and interoperability between diverse storage elements. This paper presents a flexible storage framework called L-Store (logistical storage) to address these issues. L-Store is conceptually designed using software agent technology and the Internet Backplane Protocol. The software agents provide scalability, as the L-Store components can be distributed over multiple machines. L-Store provides rich functionality in the form of certificate-based authentication, mirroring and striping of data (fault tolerance), policy-based data management, and transparent peer-to-peer interoperability of backend storage media. Keeping in mind the scenario where different Tiers and virtual organizations can have different types of storage elements (SE), L-Store is designed to have a common Storage Resource Manager (SRM) compliant interface, such that any SRM-compliant SE can share data with an L-Store system. L-Store is agnostic to the underlying hardware and can be installed on anything from simple personal computers with a disk cache to a full-fledged hierarchical storage system (with tapes and disk backups).
        Speaker: Dr Surya Pathak (Vanderbilt University)
      • 41
        L-TEST: A FRAMEWORK FOR SIMPLIFIED TESTING OF DISTRIBUTED HIGH-PERFORMANCE COMPUTER SUB-SYSTEMS
        Introducing changes to a working high-performance computing environment is typically both necessary and risky, and testing these changes can be highly manpower intensive. L-TEST supplies a framework that allows the testing of complex distributed systems with reduced configuration: it reduces setting up a test to implementing the specific tasks for that test. L-TEST handles three jobs that must be performed for any distributed test: task communication to move tasks to execution nodes, generation of reproducible stochastic distributions of tasks, and collection of test results. Tasks are communicated via a dynamic and configurable set of storage systems; these storage systems can be reused for result collection, or a parallel set of systems may be set up for the results. The task generation framework supplies a basic set of stochastic generators along with framework code for calling these generators. The full workload of tasks is generated by aggregating multiple generator instances, in order to allow complex configuration of tasks (see the sketch following this entry). Although L-TEST does not restrict the tester to the following cases, this paper identifies several use cases that are of particular interest. The development of the L-STORE distributed file system required testing for both correctness and performance, and this paper describes how L-TEST was used to test both: read and write performance data, and integrity data, were reported to separate communicators and analyzed separately. The performance configuration of L-TEST was also utilized, almost unchanged, to test a parallel file system introduced to the ACCRE parallel cluster. In addition to testing the performance and integrity of file systems, we describe how L-TEST can test the effect of planned changes on several characteristics of a cluster supercomputer; these include network bandwidth and latency and the task scheduling system for submission of jobs to the cluster.
        Speaker: Mr Laurence Dawson (Vanderbilt University)
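        An illustrative sketch, with invented generator and task names, of the reproducible stochastic workload generation described above: several seeded generator instances are aggregated into one task stream, and results are gathered in a stand-in collection channel.

```python
"""Sketch of reproducible stochastic task generation by aggregating several
seeded generator instances, in the spirit of the L-TEST workload generation."""

import random


class ReadGenerator:
    """Emits read tasks with log-normally distributed sizes (in bytes)."""

    def __init__(self, seed, n):
        self.rng, self.n = random.Random(seed), n

    def tasks(self):
        for i in range(self.n):
            yield ("read", f"file-{i:04d}", int(self.rng.lognormvariate(15, 1)))


class WriteGenerator:
    """Emits write tasks with exponentially distributed sizes (in bytes)."""

    def __init__(self, seed, n):
        self.rng, self.n = random.Random(seed), n

    def tasks(self):
        for i in range(self.n):
            yield ("write", f"file-{i:04d}", int(self.rng.expovariate(1 / 2e6)))


def aggregate(generators, seed):
    """Interleave the generators' tasks in a reproducible random order."""
    rng = random.Random(seed)
    workload = [t for g in generators for t in g.tasks()]
    rng.shuffle(workload)
    return workload


if __name__ == "__main__":
    workload = aggregate([ReadGenerator(seed=1, n=3), WriteGenerator(seed=2, n=3)], seed=42)
    results = []                                 # stand-in for a result-collection channel
    for op, name, size in workload:
        results.append((op, name, size, "ok"))   # a real test would execute the task here
    for row in results:
        print(row)
```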
      • 42
        Large scale data movement on the GRID
        During the last few years ATLAS has run a series of Data Challenges producing simulated data used to understand the detector performance. Altogether more than 100 terabytes of useful data are now spread over a few dozen storage elements on the GRID. With the emergence of Tier-1 centers and the constant restructuring of storage elements there is a need to consolidate the data placement in a more optimal way. We have organised and exercised an ATLAS-wide data consolidation using the ATLAS distributed data management system (DQ2). Experience with this massive data movement on the GRID will be reported.
        Speaker: Dr Pavel Nevski (BROOKHAVEN NATIONAL LABORATORY)
      • 43
        Large scale, grid-enabled, distributed disk storage systems at the Brookhaven National Lab RHIC/ATLAS Computing Facility
        The Brookhaven RHIC/ATLAS Computing Facility serves as both the Tier-0 computing center for RHIC and the Tier-1 computing center for ATLAS in the United States. The increasing challenge of providing local and grid-based access to very large datasets in a reliable, cost-efficient and high-performance manner is being addressed by a large-scale deployment of dCache, the distributed disk caching system developed by DESY/FNAL. Currently in production for the PHENIX and ATLAS experiments, dCache is employing the same worker nodes utilized by the RHIC and ATLAS analysis clusters, making use of the large amount of low-cost, locally-mounted disk space available on the computing farm. Within this hybrid storage/computing model, the worker nodes function simultaneously as file servers and compute elements, providing a cost-effective, high-throughput data storage system. dCache also serves as a caching front-end to the HPSS Mass Storage System, where access to the data on tape is provided through an integrated optimizing layer that was developed at BNL. BNL's dCache functions as an SRM-based Storage Element in the context of OSG and LCG. It has been serving on a production scale at BNL since November 2004, exhibiting quality performance through a number of Service Challenges and US ATLAS production runs. This presentation will cover the design and usage of this system, including performance metrics and scalability considerations as the facility expands toward an expected petabyte-scale deployment in 2007.
        Speakers: Dr Ofer Rind (Brookhaven National Laboratory), Ms Zhenping Liu (Brookhaven National Laboratory)
        Paper
        Poster
      • 44
        Lattice QCD Clusters at Fermilab
        As part of the DOE SciDAC "National Infrastructure for Lattice Gauge Computing" and DOE LQCD Projects, Fermilab builds and operates production clusters for lattice QCD simulations for the US community. We currently operate two clusters: a 128-node Pentium 4E Myrinet cluster, and a 520-node Pentium 640 Infiniband cluster. We discuss the operation of these systems and examine their performance in detail. We will also discuss the 1000-processor Infiniband cluster planned for Summer 2006.
        Speaker: Dr Donald Holmgren (FERMILAB)
        Paper
      • 45
        Lavoisier: A Data Aggregation and Unification Service
        It is broadly accepted that grid technologies have to deal with heterogeneity in both computational and storage resources. In the context of grid operations, heterogeneity is also a major concern, especially for worldwide grid projects such as LCG and EGEE. Indeed, the usage of various technologies, protocols and data formats induces complexity. As learned from our experience of participating in the collaborative development of a tool set for LCG/EGEE operations, this complexity increases the risk of unreliable code, lack of reactivity and an extremely low level of reusability. To reduce these risks, we need an extensible tool for reliably aggregating heterogeneous data sources, for executing cross-data-source queries and for exposing the collected information for several usages. In this paper we present "Lavoisier", an extensible service for providing a unified view of data collected from multiple heterogeneous data sources (see the sketch following this entry). This service transforms the collected data and represents it as XML documents: this allows XSL queries to be executed on them transparently and efficiently. The service also exposes the data through standard protocols and interfaces (WSRF). In addition, its efficiency can be optimized by tuning the provided cache mechanisms according to both the characteristics and the usage profile of the data coming out of each source (access frequency, amount of data, latency, ...). The consistency of the exposed data can be ensured by specifying dependencies between the cached data. We also present some cases where Lavoisier has proven effective, such as its use by the "LCG/EGEE CIC portal", a central tool for the operations of the LCG/EGEE grid infrastructure. We show how, in this particular case, Lavoisier allowed the portal developers to reduce the code complexity while increasing the number of offered features, and provided the possibility of serving the collected data to other operations tools.
        Speaker: Mr Sylvain Reynaud (IN2P3/CNRS)
        Paper
        Poster
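        The core pattern above (pluggable adapters feeding a cache of XML views that can then be queried uniformly) can be illustrated in a few lines. The snippet below is only a sketch in standard-library Python, not Lavoisier's actual interfaces or configuration; the adapter and view names are invented.
```python
# Minimal sketch (not Lavoisier's actual API): aggregate heterogeneous
# sources into cached XML views and query them uniformly.
import time
import xml.etree.ElementTree as ET

class CachedView:
    """Caches the XML produced by an adapter for a configurable time."""
    def __init__(self, name, adapter, ttl_seconds=300):
        self.name = name
        self.adapter = adapter          # callable returning an ET.Element
        self.ttl = ttl_seconds
        self._tree = None
        self._fetched_at = 0.0

    def xml(self):
        if self._tree is None or time.time() - self._fetched_at > self.ttl:
            self._tree = self.adapter()  # pull from LDAP, DB, HTTP, ...
            self._fetched_at = time.time()
        return self._tree

def sites_adapter():
    # Stand-in adapter: a real one would parse an actual data source.
    root = ET.Element("sites")
    for name, status in [("SITE-A", "OK"), ("SITE-B", "DOWN")]:
        ET.SubElement(root, "site", name=name, status=status)
    return root

view = CachedView("sites", sites_adapter)
# Cross-source queries then reduce to XPath over the cached XML views.
down = view.xml().findall(".//site[@status='DOWN']")
print([s.get("name") for s in down])
```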
      • 46
        LcgCAF - The CDF portal to the gLite Middleware
        The increasing instantaneous luminosity of the Tevatron collider will soon cause the computing requirements for data analysis and MC production to grow larger than the dedicated CPU resources that will be available. In order to meet future demands, CDF is investing in shared Grid resources. A significant fraction of opportunistic Grid resources will be available to CDF before the LHC era starts, and CDF could greatly benefit from using them. CDF is therefore reorganizing its computing model to integrate with the new Grid model. LcgCAF builds upon the gLite middleware in order to provide a standard CDF environment that is transparent to end users. LcgCAF is a suite of software components that handle authentication/security, job submission and monitoring, and data handling. CDF authentication and security are entirely based on the Kerberos 5 system, so a Kerberos credential renewal service based on GSI certificates was developed to guarantee job output transfer to CDF disk servers. Enqueuing and status-monitoring functionality was introduced to make the CAF submission latencies independent of those of the gLite Workload Management System. CDF batch monitoring is presently based on information from the GridIce monitoring system and the LCG Logging and Bookkeeping service. Interactive monitoring is based on the Clarens Web Services Framework. The beta version of LcgCAF is already deployed and is able to run CDF jobs on most INFN-Grid sites.
        Speaker: Dr Armando Fella (INFN, Pisa)
      • 47
        LHCb Data Replication during SC3
        LHCb's participation in LCG's Service Challenge 3 involves testing the bulk data transfer infrastructure developed to allow high-bandwidth distribution of data across the grid in accordance with the computing model. To enable reliable bulk replication of data, LHCb's DIRAC system has been integrated with gLite's File Transfer Service middleware component to make use of dedicated network links between LHCb computing centres. DIRAC's Data Management tools previously allowed the replication, registration and deletion of files on the grid. For SC3, supplementary functionality has been added to allow bulk replication of data (using FTS) and efficient mass registration in the LFC replica catalog. Provisional performance results have shown that the system developed can meet the data replication rate required by the computing model in 2007. This paper details the experience and results of the integration and utilisation of DIRAC with the SC3 transfer machinery.
        Speaker: Mr Andrew Cameron Smith (CERN, University of Edinburgh)
        Minutes
        Paper
        Poster
        tex file
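        As an illustration of the bulk-replication idea described above, the sketch below groups files into batches, hands each batch to a transfer service, and registers successful copies in a replica catalogue. It is not DIRAC code: submit_transfer() and register_replica() are hypothetical stubs standing in for the gLite FTS and LFC client calls.
```python
# Sketch only: the real system drives gLite FTS and the LFC catalogue.
# The two stubs below stand in for those clients and are purely hypothetical.
def submit_transfer(pairs):
    print(f"submitted FTS-like job with {len(pairs)} files")
    return [lfn for lfn, _ in pairs]          # pretend the whole batch succeeds

def register_replica(lfn, se):
    print(f"registered replica of {lfn} at {se}")

def chunk(items, size):
    """FTS-style services accept many source/destination pairs per job,
    so files are grouped into bulk batches."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def replicate_bulk(lfns, source_se, dest_se, batch_size=2):
    for batch in chunk(lfns, batch_size):
        pairs = [(lfn, (f"srm://{source_se}/{lfn}", f"srm://{dest_se}/{lfn}"))
                 for lfn in batch]
        done = submit_transfer(pairs)         # would normally be polled asynchronously
        for lfn in done:
            register_replica(lfn, dest_se)    # mass registration in reality

replicate_bulk(["/lhcb/sc3/file1", "/lhcb/sc3/file2", "/lhcb/sc3/file3"],
               source_se="cern.ch", dest_se="gridka.de")
```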
      • 48
        Lightweight deployment of the SAM grid data handling system to new experiments.
        The SAM data handling system has been deployed successfully by the Fermilab D0 and CDF experiments, managing petabytes of data and millions of files in a Grid working environment. D0 and CDF have large computing support staffs, have always managed their data using file catalog systems, and have participated strongly in the development of the SAM product. But we think that SAM's long-term viability requires a much wider deployment to a variety of future customers, with minimal support and training cost and without customization of the SAM software. The recent production deployment of SAM to the Minos experiment has been a good first step in this direction. Minos is a smaller experiment, with under 30 terabytes of data in about 600,000 files, and no history of using a file catalog. We will discuss the Minos deployment and its short time scale, how it has provided useful new capabilities to Minos, and where we have room for improvement. The acceptance of SAM by Minos has depended critically on several new capabilities of SAM, including the C++ API, the frozen client software, and SAM Web Services. We will discuss lessons learned, speculate on future deployments, and invite feedback from the audience in this regard.
        Speaker: Arthur Kreymer (FERMILAB)
        Paper
      • 49
        Managing gLite with Quattor: Implementation and Deployment Experience
        gLite is the next generation middleware for grid computing. Born from the collaborative efforts of more than 80 people in 12 different academic and industrial research centers as part of the EGEE Project, gLite provides a bleeding-edge, best-of-breed framework for building grid applications tapping into the power of distributed computing and storage resources across the Internet. Currently, gLite is composed of more than 25 different services, implemented in different languages, using different technologies and all coming with individual configuration needs. In addition, gLite can be run in multiple operational scenarios and presently supports hundreds of configuration options. Past experience has shown that configuration and management are among the biggest challenges of such a system. As part of the investigation of ways of configuring, deploying and managing gLite, the Quattor system was tested as a candidate for large-scale installations. This presentation will discuss how the functionality provided by Quattor has been experimentally applied in the context of the internal gLite testbed management. Different aspects are described, ranging from the Quattor server installation itself to the population of the Software Repository and the definition of the Configuration Database. The paper also shows how the required information can be generated automatically from the gLite configuration files and build dependency lists, in order to allow seamless integration of gLite within the Quattor system. The most challenging part, the service lifecycle management, has also been addressed. A Quattor NCM component has been developed to transform the Quattor data structures into gLite configuration files and to act on the gLite configuration scripts to reconfigure the service as the information changes. The future steps and possible areas of improvement are described.
        Speaker: Mr Marian ZUREK (CERN, ETICS)
        Paper
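        The central idea, turning a declarative profile from the Configuration Database into concrete gLite configuration files before prodding the service to reconfigure, can be illustrated as below. This is only a sketch of the pattern, not the NCM component API; the profile keys and output path are invented.
```python
# Sketch of the pattern only (not the real NCM component API): flatten a
# declarative profile into key = value lines and write a config file.
profile = {                      # would come from the Configuration Database
    "glite": {
        "wms": {"host": "wms01.example.org", "port": 7772},
        "log_level": "INFO",
    }
}

def flatten(tree, prefix=""):
    """Walk the nested profile and yield dotted-key / value pairs."""
    for key, value in tree.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, name)
        else:
            yield name, value

def write_config(tree, path):
    with open(path, "w") as cfg:
        for name, value in sorted(flatten(tree)):
            cfg.write(f"{name} = {value}\n")

write_config(profile, "/tmp/glite-demo.cfg")
# A real component would now run the service's own (re)configuration script
# so that the running daemon picks up the changed values.
```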
      • 50
        Monitoring and Accounting within the ATLAS Production System
        The presented monitoring framework builds on the experience gained during the ATLAS Data Challenge 2 and Rome physics workshop productions. During these previous productions several independent monitoring tools were created. Although these tools were created largely in isolation, they provided a good degree of complementary functionality and are taken as a basis for the current framework. One of the main design goals of the current framework is to abstract the monitoring away from the central database of jobs, thus reducing the impact which the monitoring has on the production itself. Furthermore, the framework aims to provide a common monitoring environment which may be seen as a high-level source of information covering the three grid flavours used for ATLAS productions. The functionality of the framework is described, with attention paid to design considerations and implementation. The experience gained during the project is presented along with an outlook towards future developments.
        Speaker: Dr John Kennedy (ATLAS)
        Paper
      • 51
        Multiple Virtual Databases to support multiple VOs in R-GMA
        R-GMA is a relational implementation of the GGF's Grid Monitoring Architecture (GMA). In some respects it can be seen as a virtual database (VDB), supporting the publishing and retrieval of time-stamped tuples. The scope of an R-GMA installation is defined by its schema and registry. The schema holds the table definitions and, in future, the authorization rules. The registry holds a list of the available producers and consumers. At present, while it is possible to have multiple installations, a user can only use one at a time and hence cannot access tables in another installation. We plan to introduce multiple VDBs, where each VDB is defined by its own registry and schema. In this paper we explain the basic idea of R-GMA, why we need multiple VDBs to support multiple VOs, and how we will implement them. We also discuss the possible need to create some VDBs not related to end-user VOs. We also explain why we do not plan to provide a catalogue of VDBs as a part of R-GMA.
        Speaker: Mr A.J. Wilson (Rutherford Appleton Laboratory)
        Paper
      • 52
        Non-Java Web Service hosting for Grids
        GridSite provides a Web Service hosting framework for services written as native executables (e.g. in C/C++) or in scripting languages (such as Perl and Python). These languages are of particular relevance to HEP applications, which typically have large investments of code and expertise in C++ and scripting languages. We describe the Grid-based authentication and authorization environment that GridSite provides, removing the need for services to manipulate Grid credentials (such as X.509, GSI and VOMS) themselves. We explain how the GRACE model (GridSite - Apache - CGI - Executables) allows Unix-account sandboxing of services, and allows sites to host multiple services provided by third parties (such as HEP experimental collaborations) on the same server. Finally, we propose scenarios which combine GridSite's authorization model with service sandboxing to allow remote deployment of services.
        Speaker: Dr Andrew McNab (UNIVERSITY OF MANCHESTER)
        Paper
      • 53
        Numerical simulation of the beam dynamics in storage rings with electron cooling
        The BETACOOL program, developed by the JINR electron cooling group, is a kit of algorithms based on a common format of input and output files. The program is oriented towards simulation of ion beam dynamics in a storage ring in the presence of cooling and heating effects. The version presented in this report includes three basic algorithms: simulation of the time evolution of the r.m.s. parameters of the ion distribution function, simulation of the evolution of the distribution function using a Monte Carlo method, and a tracking algorithm based on the molecular dynamics technique. The general processes to be investigated with the program are intrabeam scattering in the ion beam, electron cooling, interaction with residual gas and interaction with an internal target.
        Speaker: Dr Grigory Trubnikov (Joint Institute for Nuclear Research, Dubna)
        Poster
      • 54
        Operating a Tier1 centre as part of a grid environment
        Forschungszentrum Karlsruhe is one of the largest science and engineering research institutions in Europe. The resource centre GridKa, part of this science centre, is building up a Tier 1 centre for the LHC project. Embedded in the European grid initiative EGEE, GridKa also manages the ROC (regional operation centre) for the German-Swiss region. The management structure of the ROC and its integration into the regional operation are explained. By discussing existing and future tools for operating and monitoring the grid, the development of a robust grid infrastructure in the German-Swiss region will be shown. Experience in operating the grid, from the point of view of both a Tier 1 centre and a regional operation centre, is summarized with respect to the integration of grid tools in the German-Swiss federation. In addition, the progress made in building a stable grid infrastructure in our region of about 13 resource centres is described, taking into account the new support structures. The support workflow can start from a user-specific problem as well as from a problem detected by regularly performed general site functional tests. Different views of the regional grid structure will be highlighted.
        Speaker: Dr Sven Hermann (Forschungszentrum Karlsruhe)
        Paper
        Poster
      • 55
        Operation and Management of a Heterogeneous Large-Scale, Multi-Purpose Computer Cluster at Brookhaven National Lab
        The operation and management of a heterogeneous large-scale, multi-purpose computer cluster is a complex task given the competing nature of requests for resources by a large, world-wide user base. Besides providing the bulk of the computational resources to experiments at the Relativistic Heavy-Ion Collider (RHIC), this large cluster is part of the U.S. Tier 1 Computing Center for the ATLAS experiment at the LHC, and it provides support to the Large Synoptic Survey Telescope (LSST) project. A description of the existing and planned upgrades in infrastructure, hardware and software architecture that allow efficient usage of computing and distributed storage resources by a geographically diverse user base will be given, followed by a description of near and medium-term computing trends that will play a role in the future growth and direction of this computer cluster.
        Speaker: Dr Tony Chan (BROOKHAVEN NATIONAL LAB)
      • 56
        Overview of STAR Online Control Systems and Experiment Status Information
        For any large experiment with multiple sub-systems and their respective experts spread throughout the world, real-time and near-real-time information accessible to a wide audience is critical to efficiency and success. Large and varied amounts of information about the current and past state of facilities and detector systems are necessary, both for current running, and for eventual data analysis. As an example, the STAR Control Room's internal interactions and presentation of information to the external “offline” world will be described in brief. Conceptual network layout, types of information exchanged and methods of information dissemination will be presented. Focus will not be on the flow of physics data per se, but on the information about the status and control of the experimental systems in the course of acquiring the physics data.
        Speaker: Mr Wayne BETTS (BROOKHAVEN NATIONAL LABORATORY)
        Paper
        Poster
      • 57
        Prototype of the Swiss ATLAS Computing Infrastructure
        The Swiss ATLAS Computing prototype consists of clusters of PCs located at the universities of Bern and Geneva (Tier 3) and at the Swiss National Supercomputing Centre (CSCS) in Manno (Tier 2). In terms of software, the prototype includes ATLAS off-line releases as well as middleware for running the ATLAS off-line software in a distributed way. Both batch and interactive use cases are supported: the batch use case is covered by a country-wide batch system, while the interactive use case is covered by a parallel execution system running on single clusters. The prototype serves the dual purpose of providing resources to the ATLAS production system and providing Swiss researchers with resources for individual studies of both simulated data and data from the ATLAS test beam. In this article the solutions used for achieving this are presented. Initial experience with the system is also described.
        Speaker: Dr Szymon Gadomski (UNIVERSITY OF BERN, LABORATORY FOR HIGH ENERGY PHYSICS)
        Paper
      • 58
        Public Resource Computing and Geant4
        Projects like SETI@home use computing resources donated by the general public for scientific purposes. Many of these projects are based on the BOINC (Berkeley Open Infrastructure for Network Computing) software framework, which makes it easier to set up new public resource computing projects. BOINC is used at CERN for the LHC@home project, where more than 10000 home users donate time on their CPUs to run the SixTrack application. The LHC@home project has recently delivered a computing power of about three teraflops, which makes it interesting also for other applications that can accept the constraints imposed by the BOINC model: simple, relatively small, CPU-bound programs that can run in a sandbox. Once these constraints are met, BOINC allows thousands of instances of the programs to run in parallel. The use of Geant4 in a public resource computing project has also been studied at CERN. After contacts with the developers we found that BOINC could be used to run the Geant4 release testing process, which proved to be a good case study for exploring what could be done for more complex HEP simulations. The release test is a simple test-beam set-up that compares physics results produced by different program versions, allowing new versions to be validated. We therefore ported the Geant4 release testing software to the BOINC environment on both Windows and Linux, and set up a BOINC server to demonstrate a production environment. The benefits and limitations of BOINC-based projects for running Geant4 are presented.
        Speaker: Dr Jukka Klem (Helsinki Institute of Physics HIP)
        Paper
      • 59
        Recent User Interface Developments in ROOT
        Providing all components and designing good user interfaces requires developers to know and apply some basic principles. The different parts of the ROOT GUIs should fit and complement each other. They must form a window through which users see the capabilities of the software system and understand how to use them. If well designed, the user interface adds quality and inspires the confidence and trust of the users. Its main goal is to help them get their jobs done more easily. In this poster, we present the relationship between two main user interface projects in this direction: the ROOT object editors and the style manager.
        Speaker: Mr Fons Rademakers (CERN)
        Poster
      • 60
        Reconstruction and calibration strategies for the LHCb RICH detector
        The LHCb experiment will make high precision studies of CP violation and other rare phenomena in B meson decays. Particle identification, in the momentum range from ~2-100 GeV/c, is essential for this physics programme, and will be provided by two Ring Imaging Cherenkov (RICH) detectors. The experiment will use several levels of trigger to reduce the 10 MHz rate of visible interactions to the 2 kHz that will be stored. The final level of the trigger has access to information from all sub-detectors. The standard offline RICH reconstruction involves solving a quartic equation that describes the RICH optics for each hit in the RICH detector, then using a global likelihood minimization, combining the information from both RICH detectors along with tracking information, to determine the best particle hypotheses. This approach performs well but is vulnerable to background from rings without associated tracks. In addition, the time needed to run the algorithm is of the order of 100 ms per event, which is to be compared with the time of order 10 ms available to run the entire final level trigger. Alternative RICH reconstruction algorithms are being investigated that complement the standard procedure. First, algorithms of greater robustness, less reliant on the tracking information, are being developed, using techniques such as Hough transforms and Metropolis-Hastings Markov chains. Secondly, simplified algorithms with execution times of order 3 ms, suitable for use in the online trigger, are being evaluated. Finally, optimal performance requires a calibration procedure that will enable the performance of the pattern recognition to be measured from the experimental data. This paper describes the performance of the different RICH reconstruction algorithms studied and reports on the strategy for RICH calibration in LHCb.
        Speakers: Cristina Lazzeroni (University of Cambridge), Dr Raluca-Anca Muresan (Oxford University)
        paper PDF file
        tar of TEX and EPS files
      • 61
        Resilient dCache: Replicating Files for Integrity and Availability.
        dCache is a distributed storage system currently used to store and deliver data on a petabyte scale in several large HEP experiments. Initially dCache was designed as a disk front-end for robotic tape storage systems. Lately, dCache systems have grown in scale by several orders of magnitude and are being considered for deployment in US-CMS T2 centers lacking expensive tape robots. This necessitates storing data for extended periods of time on disk-only storage systems, in many cases using very inexpensive commodity (non-RAID) disk devices purchased specifically for storage or opportunistically exploiting spare disk space in computing farms. Hundreds of terabytes of storage may be added for little additional cost. The large number of nodes in computing clusters and the lower reliability of commodity disks and computers lead to a higher likelihood of individual files becoming lost or unavailable in normal operations. Resilient dCache is a new top-level dCache service created to address these reliability and file availability issues by keeping several replicas of each logical file on different pieces of dCache disk hardware. The Resilience Manager automatically keeps the number of copies in the system within a specified range when files are stored in or removed from dCache, or when disk pool nodes are found to have crashed, been removed from, or added to the system. The Resilience Manager maintains a local file replica catalog and the disk pool configuration in a PostgreSQL database. The paper describes the design of the dCache Resilience Manager and the experience gained in production deployment and operations in the US-CMS T1 and T2 centers. We use the "all pools are resilient" configuration in US-CMS T2 centers to store generated data before they are transferred to the T1 center. The US-CMS T1 center has some pools in a single dCache system configured as resilient, while the other pools are tape-backed or volatile. Such a configuration simplifies the administration of the system and data exchange. We attribute the increase in the amount of data delivered to compute nodes from the US-CMS T1 dCache (0.2 PB/day in October 2005) to the data stored in resilient pools.
        Speaker: Mr Timur Perelmutov (FNAL)
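        The replica-count bookkeeping described above can be pictured with a small sketch. This is not the Resilience Manager code (which keeps its catalogue in PostgreSQL and talks to real pools); it only shows the decision step of bringing every file back into the allowed [min, max] replica range.
```python
# Sketch of the replication idea (not the actual Resilience Manager code):
# keep the number of replicas of every logical file within [min, max] as
# pools appear, disappear, or files are added and removed.
import random

def adjust_replicas(catalog, pools_online, n_min=2, n_max=3):
    """catalog maps file id -> set of pool names holding a replica."""
    actions = []
    for pnfsid, pools in catalog.items():
        alive = pools & pools_online                 # ignore crashed pools
        if len(alive) < n_min:
            candidates = list(pools_online - alive)
            needed = min(n_min - len(alive), len(candidates))
            for target in random.sample(candidates, needed):
                actions.append(("copy", pnfsid, target))
        elif len(alive) > n_max:
            for victim in list(alive)[: len(alive) - n_max]:
                actions.append(("remove", pnfsid, victim))
    return actions

catalog = {"0001": {"pool_a", "pool_b"}, "0002": {"pool_c"}}
print(adjust_replicas(catalog, pools_online={"pool_a", "pool_c", "pool_d"}))
```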
      • 62
        ROOT 2D graphics visualisation techniques
        ROOT 2D graphics offers a wide set of data representation and visualisation techniques. Over the years, responding to user comments and requests, these have been improved and enriched. The current system is very flexible and can easily be tuned to match users' needs and imagination. We present a patchwork of examples demonstrating the wide variety of output that can be produced.
        Speaker: Rene Brun (CERN)
        Poster
      • 63
        ROOT 3D graphics overview and examples
        Overview and examples of: the common viewer architecture (TVirtualViewer3D interface and TBuffer3D shape hierarchy) used by all 3D viewers; and significant features of the OpenGL viewer, including in-pad embedding, render styles, composite (CSG/Boolean) shapes and clipping.
        Speaker: Rene Brun (CERN)
        Poster
      • 64
        ROOT/CINT/Reflex integration
        Reflex is a package that enhances C++ with reflection capabilities. It was developed in the LCG Applications Area at CERN, and it was recently decided that it will be tightly integrated with the ROOT analysis framework and especially with the CINT interpreter. This strategy will unify the dictionary systems of ROOT/CINT and Reflex into a common one. The advantages of this move for ROOT/CINT will be closer adherence to the C++ standard, lower memory consumption for dictionary information and easier maintenance. This poster will focus on the evolutionary steps to be taken for this integration, such as the unification of the CINT and Reflex data structures while staying backwards compatible with user code. It will also discuss modifications to the generation of reflection information within ROOT, which is done via the rootcint program. Source code examples and class diagrams will give a look and feel of the Reflex package itself.
        Speaker: Dr Stefan Roiser (CERN)
      • 65
        Sailing the petabyte sea: navigational infrastructure in the ATLAS event store
        ATLAS has deployed an inter-object association infrastructure that allows the experiment to track at the object level what data have been written and where, and to assign both object-level and process-level labels to identify data objects for later retrieval. This infrastructure provides the foundation for opportunistic run-time navigation to upstream data, and in principle supports both dynamic determination of what data objects are reachable and controlled-scope retrieval. This infrastructure is complementary to the coarser-grained bookkeeping and provenance management system used to identify which datasets were input to the production of which derived datasets, adding the capability to determine and locate the objects used to produce specific derived objects. It also simplifies the task of populating an event-level metadata system capable of returning references to events at any stage of processing. The tension between what the infrastructure can demonstrably support at site-level scales (it is already extensively used by ATLAS physicists) and what is expected to be constrained by policy, in light of anticipated distributed storage resource limitations, is also discussed.
        Speaker: Dr David Malon (ARGONNE NATIONAL LABORATORY)
        Paper
      • 66
        SAMGrid Peer-to-Peer Information Service
        SAMGrid presently relies on the centralized database for providing several services vital for the system operation. These services are all encapsulated in the SAMGrid Database Server, and include access to file metadata and replica catalogs, dataset and processing bookkeeping, as well as the runtime support for the SAMGrid station services. Access to the centralized database and DB Servers represents a single point of failure in the system and limits its scalability. In order to address this issue, we have created a prototype of a peer-to-peer information service that allows the system to operate during times when access to the central DB is not available for any reason (e.g., network failures, scheduled downtimes, etc.), as well as to improve the system performance during times of extremely high system load when the central DB access is slow and/or has a high failure rate. Our prototype uses Distributed Hash Tables to create a fault tolerant and self-healing service. We believe that this is the first peer-to-peer information service designed to become a part of an in-use grid system. We describe here the prototype architecture and its existing and planned functionality, as well as show how it can be integrated into the SAMGrid system. We also present a study of performance of our new service under different circumstances. Our results strongly demonstrate the feasibility and usefulness of the proposed architecture.
        Speaker: Dr Sinisa Veseli (Fermilab)
        Paper
        Poster
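        The distributed-hash-table idea at the heart of such a service can be sketched with a simple consistent-hashing ring, as below. This is an illustration only, not the SAMGrid prototype; the station host names are placeholders.
```python
# Sketch of the distributed-hash-table idea behind such a service: each key
# is owned by the first node whose hash follows the key's hash on a ring,
# so lookups keep working as nodes join and leave.
import bisect
import hashlib

def _h(value):
    return int(hashlib.sha1(value.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes):
        self._ring = sorted((_h(n), n) for n in nodes)

    def owner(self, key):
        """Return the node responsible for storing this key."""
        hashes = [h for h, _ in self._ring]
        idx = bisect.bisect(hashes, _h(key)) % len(self._ring)
        return self._ring[idx][1]

    def add(self, node):
        bisect.insort(self._ring, (_h(node), node))

ring = HashRing(["station1.example.org", "station2.example.org"])
print(ring.owner("file_metadata:run12345"))
ring.add("station3.example.org")       # only ~1/N of the keys change owner
print(ring.owner("file_metadata:run12345"))
```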
      • 67
        SAMGrid Web Services
        SAMGrid is a distributed (CORBA-based) HEP data handling system presently used by three running experiments at Fermilab: D0, CDF and MINOS. User access to the SAMGrid services is provided via Python and C++ client APIs, which handle the low-level CORBA calls. Although the use of the SAMGrid APIs is fairly straightforward and very well documented, in practice SAMGrid users face numerous installation and configuration issues. SAMGrid Web Services have been designed to allow easy access to the system by using standard web service technologies and protocols (SOAP/XML, HTTP). In addition to hiding the complexity of the system from users, these services eliminate the need for the proprietary CORBA-based clients, and also significantly simplify client installation and configuration. We present here the architecture and design of the SAMGrid Web Services, and describe the functionality that they currently offer. In particular, we discuss various dataset and cataloging services, and cover in more detail the techniques used for delivering data files to end users. We also discuss service testing and performance measurements, deployment plans, and plans for future development.
        Speaker: Dr Sinisa Veseli (Fermilab)
        Paper
        Poster
      • 68
        Schema Evolution and the ATLAS Event Store
        The ATLAS event data model will almost certainly change over time. ATLAS must retain the ability to read both old and new data after such a change, regulate the introduction of such changes, minimize the need to run massive data conversion jobs when such changes are introduced, and maintain the machinery to support such data conversions when they are unavoidable. In the database literature, such changes to the layout of persistent data structures are known as schema evolution. Possible schema changes range from simple alterations (e.g., adding a new data member) through complex redesign of a class to multi-class refactoring of the data model. ATLAS uses POOL/ROOT as its principal event storage technology and benefits from ROOT's automatic schema evolution. For more complex changes, manual schema evolution is implemented. The architecture of the Gaudi/Athena framework separates the transient and persistent worlds by a "conversion" layer. We take advantage of this separation to allow transformations from stored persistent shapes to an evolving transient class. ROOT custom streamers may be used to allow migration to an intermediate state representation model.
        Speaker: Dr Marcin Nowak (BROOKHAVEN NATIONAL LABORATORY)
        Paper
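        The transient/persistent separation described above can be pictured with a toy converter layer. ATLAS implements this in C++ inside the Gaudi/Athena conversion layer and via ROOT streamers; the Python sketch below, with invented class names, only shows the pattern of reading whichever stored shape is found into the current transient class.
```python
# Illustration only of the transient/persistent separation described above:
# old persistent shapes are converted on read into the current transient class.
class TrackState_p1:                 # old persistent shape
    def __init__(self, x, y):
        self.x, self.y = x, y

class TrackState_p2:                 # newer persistent shape adds a z member
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

class TrackState:                    # current transient class
    def __init__(self, x, y, z=0.0):
        self.x, self.y, self.z = x, y, z

CONVERTERS = {
    TrackState_p1: lambda p: TrackState(p.x, p.y),       # default the new member
    TrackState_p2: lambda p: TrackState(p.x, p.y, p.z),
}

def to_transient(persistent):
    """Pick the converter matching whichever persistent version was stored."""
    return CONVERTERS[type(persistent)](persistent)

print(vars(to_transient(TrackState_p1(1.0, 2.0))))
```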
      • 69
        ScotGrid and the LCG.
        ScotGrid is a distributed Tier-2 computing centre formed as a collaboration between the Universities of Durham, Edinburgh and Glasgow, as part of the UK's national particle physics grid, GridPP. This paper describes ScotGrid's current resources by institute and how these were configured to enable participation in the LCG service challenges. In addition, we outline future development plans for ScotGrid hardware resources and plans for the optimisation of the ScotGrid networking infrastructure. Such improvements are necessary to enhance the quality of service that we provide to Grid users. Keywords: ScotGrid, Tier-2, GridPP, LCG.
        Speaker: Dr Philip Clark (University of Edinburgh)
        Paper
        Poster
      • 70
        Software management for the alignment system of the ATLAS Muon Spectrometer
        The muon spectrometer of the ATLAS experiment aims at reconstructing very high energy muon tracks (up to 1 TeV) with a transverse momentum resolution better than 10%. For this purpose a resolution of 50 micrometers on the sagitta of tracks has to be achieved. Each muon track is measured with three wire-chamber stations placed inside an air-core toroid magnet (the chambers sit around the interaction point in three layers and 16 sectors). In particular, the contribution to the sagitta due to the limited knowledge of the chamber positions and deformations should not exceed 30 micrometers. Therefore a network of optical survey monitors, called the alignment system, is being installed. This network is made up of six different alignment types: i) the IN-PLANE alignment measures chamber internal deformations; the PRAXIAL system is composed of two parts: ii) the PROXIMITY part, which gives the position of one chamber with respect to the neighbouring one, and iii) the AXIAL part, which controls the “saloon door” effect of the chambers' relative positions within a layer; iv) the PROJECTIVE system gives the chamber position within a triplet; v) the REFERENCE system is used to link a sector of chambers to the neighbouring sector; vi) the CCC system connects large chambers to small chambers, since the latter are not individually aligned. In this paper we describe the software managing the complete system, from the calibration of individual sensors to the implementation of the whole system, including some test beam results.
        Speaker: Dr Valerie GAUTARD (CEA-SACLAY)
        Paper
        Poster
      • 71
        STAR Vertex reconstruction algorithms in the presence of abundant pileup.
        One of the world's largest time projection chambers (TPC) has been used at STAR for the reconstruction of collisions at luminosities yielding thousands of piled-up background tracks, resulting from a few hundred pp minimum-bias background events or several heavy-ion background events, respectively. The combination of TPC tracks and trigger detector data used for tagging of tracks is sufficient to disentangle the primary vertex and primary tracks from the pileup. In this paper we focus on techniques for vertex reconstruction. A luminosity-driven evolution of vertex finder algorithms at STAR will be sketched. We make a distinction between finding multiple primary vertices from the trigger bunch crossing (bXing) and reconstructing vertices associated with minimum-bias collisions from early/late bXings. A vertex finder algorithm based on likelihood measures will be presented. We will compare it with a chi-square minimization method (Minuit). Fine-tuning criteria for weighting the matching of TPC tracks to fast detector data, taking into account efficiency, purity, trigger dependence, stability of calibrations, and the benefits of external beam-line constraints, will be discussed. The performance of the algorithm for real and simulated STAR pp data (including pileup) will be assessed. The extension of the algorithm to vertex reconstruction in CuCu and heavier-ion collisions will be discussed.
        Speaker: Dr Jan BALEWSKI (Indiana University Cyclotron Facility)
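        A toy version of a likelihood-based vertex find, of the kind compared against the Minuit chi-square approach above, is sketched below. It is not the STAR algorithm: it simply scans z along the beam line and adds a flat outlier term so that pileup tracks do not pull the vertex.
```python
# Toy sketch of a likelihood-based vertex finder (not the STAR code): scan the
# beam line in z and keep the position where the summed track log-likelihood,
# with a flat outlier term absorbing pileup tracks, is largest.
import math

def vertex_likelihood(z0, tracks, outlier_frac=0.2, z_range=200.0):
    logl = 0.0
    for z, sigma in tracks:             # track z at the beam line and its error
        gauss = math.exp(-0.5 * ((z - z0) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        flat = 1.0 / z_range            # pileup tracks look roughly uniform in z
        logl += math.log((1 - outlier_frac) * gauss + outlier_frac * flat)
    return logl

def find_vertex(tracks, z_min=-100.0, z_max=100.0, step=0.1):
    grid = [z_min + i * step for i in range(int((z_max - z_min) / step) + 1)]
    return max(grid, key=lambda z0: vertex_likelihood(z0, tracks))

tracks = [(-2.1, 0.5), (-1.8, 0.4), (-2.3, 0.6), (35.0, 0.5)]  # last one is pileup
print(round(find_vertex(tracks), 1))
```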
      • 72
        The Calorimeter Event Data Model for the ATLAS Experiment at LHC
        The event data model for the ATLAS calorimeters in the reconstruction software is described, starting from the raw data to the analysis domain calorimeter data. The data model includes important features like compression strategies with insignificant loss of signal precision, flexible and configurable data content for high level reconstruction objects, and backward navigation from the analysis data at the highest extraction level to the full event data. The most important underlying strategies will be discussed in this talk.
        Speaker: Walter Lampl (Department of Physics, University of Arizona)
        Paper
      • 73
        The Evolving Role of Monitoring in a Large-Scale Computing Facility
        Monitoring a large-scale computing facility is evolving from a passive to a more active role in the LHC era, from monitoring the health, availability and performance of the facility to taking a more active and automated role in restoring availability, updating software and becoming a meta-scheduler for batch systems. This talk will discuss the experiences of the RHIC and ATLAS U.S. Tier 1 Computing Facility at Brookhaven National Lab in evaluating different monitoring software packages and how monitoring is being used to improve efficiency and to integrate the facility with the Grid environment. A monitoring model to link geographically dispersed, regional computer facilities which can be used to improve efficiency and throughput will be presented as well.
        Speaker: Dr Tony Chan (BROOKHAVEN NATIONAL LAB)
      • 74
        The Readout Crate in BESIII DAQ framework
        By the BESIII “readout” we mean an interface between the DAQ framework and the front-end electronics (FEEs). As a part of the DAQ system, the readout plays a very important role in the process of data acquisition. The principal functionality of the Readout Crate is to receive, repack, buffer and forward the data coming from the FEEs to the Readout PC. The implementation is based on commercial components: a VMEbus PowerPC-based single-board computer and the VxWorks real-time operating system. The design and implementation of the functionality of the Readout Crate will be presented in this paper. Keywords: BESIII, DAQ, Readout, VMEbus, VxWorks
        Speaker: Mr GUANGKUN LEI (IHEP)
        Paper
        Poster
      • 75
        The SAM-Grid / LCG interoperability system: a bridge between two Grids
        The SAM-Grid system is an integrated data, job, and information management infrastructure. The SAM-Grid addresses the distributed computing needs of the RunII experiments at Fermilab. The system typically relies on SAM-Grid services deployed at the remote facilities in order to manage the computing resources. Such deployment requires special agreements with each resource provider and is a labor-intensive process. On the other hand, the DZero VO also has access to computing resources through the LCG infrastructure. In this context, resource sharing agreements and the deployment of standard middleware are negotiated within the framework of the EGEE project. The SAM-Grid / LCG interoperability project was started to let DZero users retain the user-friendliness of the SAM-Grid interface while, at the same time, gaining access to the LCG pool of resources. This "bridging" between grids is beneficial for both the SAM-Grid and LCG, since it minimizes the deployment effort of the SAM-Grid team and exercises the LCG computing infrastructure with the data-intensive production applications of a running experiment. The interoperability system is centered around job "forwarding" nodes, which receive jobs prepared by the SAM-Grid and submit them to LCG. This paper discusses the architecture of the system and how it addresses inherent issues of service accessibility and scalability. The paper also presents the operational and support challenges that arise in running the system in production.
        Speaker: Garzoglio Gabriele (FERMI NATIONAL ACCELERATOR LABORATORY)
        Paper
        Poster
      • 76
        Transparently Distributing CDF Software with Parrot
        The CDF software model was developed with dedicated resources in mind. One of the main assumptions is to have a large set of executables, shared libraries and configuration files on a shared file system. As CDF moves toward a Grid model, this assumption limits general physics analysis to only a small set of CDF-friendly sites with the appropriate file system installed. In order to exploit as many Grid resources as possible, we have looked at ways to lift this limit. Given the number of users and existing applications, it is impractical to force users to change their way of working and stop relying on the CDF software distribution. Instead, we are developing a solution that uses Parrot to transparently access CDF software remotely. Parrot is a user-level tool that allows any executable or script to access remote files as if they were on a local file system. No special privileges are required to install or use Parrot, so it can easily be deployed on a Grid. Parrot supports several I/O protocols including HTTP, FTP, RFIO, and other protocols common in grid computing. Using HTTP and standard caching mechanisms allows applications to access a single copy of the CDF software distribution from anywhere in the world. In the talk we will present our experience with the use of Parrot, including the problems we encountered and how we solved them.
        Speaker: Dr Igor Sfiligoi (INFN Frascati)
        Paper
        Poster
      • 77
        VGM visualization within OnX
        We present a short communication on work done at LAL to visualize, within the OnX interactive environment, HEP geometries accessed through the VGM abstract interfaces. VGM and OnX were presented at CHEP'04 in Interlaken.
        Speaker: Mr Laurent GARNIER (LAL-IN2P3-CNRS)
      • 78
        VLSI Implementation of Greedy based Distributed Routing Schemes for Ad Hoc Networks
        We describe an FPGA-based VLSI implementation of a new greedy algorithm for approximating minimum set cover in ad hoc wireless network applications. The implementation makes the algorithm suitable for embedded and real-time architectures.
        Speaker: Dr paolo branchini (INFN)
        Paper
        Poster
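        For reference, the greedy approximation that the hardware implements can be stated in a few lines of software; the sketch below (with made-up example data) shows only the algorithm itself, not the FPGA design.
```python
# Software sketch of the greedy set-cover approximation mapped to hardware
# in the paper: repeatedly pick the set covering the most still-uncovered
# elements (an H(n) ~ ln(n) approximation of the minimum cover).
def greedy_set_cover(universe, sets):
    uncovered = set(universe)
    cover = []
    while uncovered:
        # choose the candidate covering the largest number of uncovered elements
        best = max(sets, key=lambda s: len(uncovered & s))
        if not uncovered & best:
            raise ValueError("some elements cannot be covered")
        cover.append(best)
        uncovered -= best
    return cover

# e.g. nodes 1..5 to be covered by the broadcast ranges of candidate relays
nodes = {1, 2, 3, 4, 5}
ranges = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
print(greedy_set_cover(nodes, ranges))
```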
    • 12:30
      Lunch Break
    • Computing Facilities and Networking: CFN-1 D405

      D405

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 79
        DNS load balancing and failover mechanism at CERN
        Availability approaching 100% and response times converging to 0 are two things that users expect of any system they interact with. Even if the real importance of these factors is a function of the size and nature of the project, today's users are rarely tolerant of performance issues with systems of any size. Commercial solutions for load balancing and failover are plentiful. Citrix NetScaler, the Foundry ServerIron series, Coyote Point Systems Equalizer and Cisco Catalyst SLB switches, to name just a few, all offer industry-standard approaches to these problems. Their solutions are optimized for standard protocol services such as HTTP, FTP or SSH, but it remains difficult to extend them to other kinds of application. In addition, the granularity of their failover mechanisms is per node and not per application daemon, as is often required. Moreover, the pricing of these devices is uneconomical for small projects. This paper describes the design and implementation of the DNS load balancing and failover mechanism currently used at CERN. Our system is based around SNMP, which is used as the transport layer for state information about the server nodes. A central decision-making service collates this information and selects the best candidate(s) for the service. The IP addresses of the chosen nodes are updated in DNS using the DynDNS mechanism. The load balancing feature of our system is used for a variety of standard protocols (including HTTP, SSH, (Grid)FTP, SRM), while the (easily extendable) failover mechanism adds support for applications like CVS and databases. The scale, in terms of the number of nodes, of the supported services ranges from a couple (2-4) up to around 100. The best known services using this mechanism at CERN are LXPLUS and CASTORGRID. This paper also explains the advantages and disadvantages of our system, and advice is given about when it is appropriate to use it. Last, but not least, given that all components of our system are built around freely available open source products, our solution should be especially interesting for low-resource locations.
        Speaker: Mr Vladimir Bahyl (CERN IT-FIO)
        Paper
        Slides
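        The decision step described above (collect node state, drop unhealthy nodes, publish the least-loaded ones) is easy to sketch. The snippet below is an illustration only, not the CERN implementation: the SNMP polling and the DynDNS update are stubbed out, and the host names are placeholders.
```python
# Illustration only: SNMP polling and the dynamic DNS update are stubbed out.
def collect_metrics():
    # Stand-in for SNMP polling of each server: (daemon healthy?, load metric)
    return {
        "lxplus001.example.org": (True, 0.7),
        "lxplus002.example.org": (True, 0.2),
        "lxplus003.example.org": (False, 0.0),   # application down -> excluded
    }

def best_candidates(metrics, how_many=2):
    """Keep only healthy nodes and return the least-loaded ones."""
    healthy = sorted((load, node) for node, (ok, load) in metrics.items() if ok)
    return [node for _, node in healthy[:how_many]]

winners = best_candidates(collect_metrics())
print(winners)
# The chosen addresses would now be published under the service alias
# through a dynamic DNS update, so clients resolving the alias receive them.
```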
      • 80
        Using TSM to create a high-performance tape connection
        At GridKa an initial capacity of 1.5 PB of online and 2 PB of background storage is needed for the LHC start in 2007. Afterwards the capacity is expected to grow almost exponentially. No computing site will be able to keep this amount of data in online storage, hence a highly available tape connection is needed. This paper describes a high-performance connection of the online storage to an IBM Tivoli Storage Manager (TSM) environment. The performance of a system depends not only on the hardware, but also on the architecture of the application. The scenario we describe distributes its files over a large number of file servers, which store their data to tape with the help of a proxy node. Each file server can restore the data independently of the file server that originally stored it. Furthermore, with the LAN-free connection to the tape drives, the data transfers bypass the TSM server, which would otherwise be a bottleneck. The system is completely transparent to the user.
        Speaker: Dr Doris Ressmann (Forschungszentrum Karlsruhe)
        Paper
        Paper sources
        Slides
      • 81
        Cluster architecture for java web hosting at CERN
        Over the last few years, we have experienced a growing demand for hosting Java web applications. At the same time, it has been difficult to find an off-the-shelf solution that would enable load balancing, easy administration and a high level of isolation between applications hosted within a J2EE server. The architecture developed and used in production at CERN is based on a Linux cluster. A piece of software developed at CERN, JPSManager, enables easy management of the service by following the self-management paradigm. JPSManager also enables quick recovery in case of hardware failure. The isolation between different clients of the service is implemented using multiple instances of Apache Tomcat, but the architecture is open and a different J2EE server can be incorporated if necessary. This paper describes the architecture in detail and analyses its advantages and limitations. Examples of HEP-related applications which make use of this architecture are also given.
        Speaker: Michal Kwiatek (CERN)
        Paper
        Slides
      • 82
        dCache, the Upgrade
        For the last two years, the dCache/SRM Storage Element has been successfully integrated into the LCG framework and is in heavy production at several dozen sites, ranging from single-host installations up to those with some hundreds of terabytes of disk space, delivering more than 50 TBytes per day to clients. Based on the permanent feedback from our users and the detailed reports given by representatives of large dCache sites during our workshop at DESY at the end of August 2005, the dCache team has identified important areas of improvement. With this presentation I would like to discuss some of those changes in more detail. This includes more sophisticated handling of the various supported tape back-ends, the introduction of multiple I/O queues per pool with different properties to account for the diverse behaviours of the different I/O protocols, and the possibility of having one dCache instance spread over more than one physical site. Moreover, I will touch on changes in the name-space management, as short- and long-term perspectives to keep up with future requirements. In terms of dissemination, I will report on our initiative to make dCache a widely scalable storage element by introducing "dCache, the Book", plans for improved packaging and more convenient source code license terms. Finally, I would like to cover the dCache part of the German e-science project, D-Grid, which will allow for improved scheduling of tape-to-disk restore operations as well as advanced job scheduling by providing extended information exchange between storage elements and the job scheduler.
        Speaker: Dr Patrick Fuhrmann (DESY)
        Slides
    • Distributed Data Analysis: DDA-1 D406

      D406

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 83
        DIAL: Distributed Interactive Analysis of Large Datasets
        DIAL is a generic framework for distributed analysis. The heart of the system is a scheduler (also called an analysis service) that receives high-level processing requests expressed in terms of an input dataset and a transformation to act on that dataset. The scheduler splits the dataset, applies the transformation to each sub-dataset to produce a new sub-dataset, and then merges these to produce the overall output dataset, which is made available to the caller. DIAL defines a job interface that makes it possible for schedulers to connect with a wide range of batch and grid workload management systems. It also provides command line, ROOT, Python and web clients for job submission that enable users to submit and monitor jobs in a uniform manner. Scaling to very large jobs can be handled with a scheduler that does partial splitting and submits each sub-job to another scheduler. I will give the current status of DIAL and discuss its use in the context of the ATLAS experiment at the CERN LHC (Large Hadron Collider). There we are looking at submission to local batch systems, Globus gatekeepers, EGEE/LCG workload management, ATLAS production, and PANDA. The latter is a U.S. ATLAS framework for data production and distributed analysis (hence the name) that may also use DIAL for its internal scheduling.
        Speaker: David Adams (BNL)
        Slides
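        The split/apply/merge cycle described above is easy to picture with a small sketch. The code below is not the DIAL API: local threads stand in for the batch or grid back-end, and the dataset is just a list of file names with fake event counts.
```python
# Minimal sketch of the split/apply/merge pattern described above: a scheduler
# splits the input dataset, applies the transformation to each piece, and
# merges the partial results.
from concurrent.futures import ThreadPoolExecutor

def split(dataset, n_parts):
    """Divide a dataset (here just a list of file names) into sub-datasets."""
    return [dataset[i::n_parts] for i in range(n_parts) if dataset[i::n_parts]]

def scheduler(dataset, transform, merge, n_parts=4):
    subdatasets = split(dataset, n_parts)
    # In DIAL each sub-job would go to a batch or grid workload system;
    # local threads stand in for that here.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partial_results = list(pool.map(transform, subdatasets))
    return merge(partial_results)

# Example "transformation": count events per sub-dataset, then sum the counts.
fake_events = {"f1.root": 100, "f2.root": 250, "f3.root": 80, "f4.root": 40}
dataset = list(fake_events)
result = scheduler(dataset,
                   transform=lambda files: sum(fake_events[f] for f in files),
                   merge=sum)
print(result)   # 470
```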
      • 84
        Distributed object monitoring for ROOT analyses with GO4 v3
        The new version 3 of the ROOT based GSI standard analysis framework GO4 (GSI Object Oriented Online Offline) has been released. GO4 provides multithreaded remote communication between analysis process and GUI process, a dynamically configurable analysis framework, and a Qt based GUI with embedded ROOT graphics. In the new version 3 a new internal object manager was developed. Its functionality was separated from the GUI implementation. This improves the GO4 GUI and browser functionality. Browsing and object monitoring from various local and remote data sources is provided by a user transparent proxy architecture. The GO4 communication mechanism between GUI and analysis processes was redesigned. Several distributed viewers may now connect to one analysis. Even a standard CINT session may initiate the GO4 communication environment to control an analysis process with the native ROOT browser. Similarly, standard analysis ROOT macros may be controlled by either a remote GO4 GUI or a ROOT browser. Besides Linux, a lightweight binary GO4 v3 distribution (without Qt GUI) for MS Windows XP is now available. Cross platform connections between GO4 environments are possible.
        Speaker: Dr Jörn Adamczewski (GSI)
        Paper
        Slides
      • 85
        DIRAC Production Manager Tools
        DIRAC is the LHCb Workload and Data Management system used for Monte Carlo production, data processing and distributed user analysis. Such a wide variety of applications requires a general approach to the tasks of job definition, configuration and management. In this paper, we present a suite of tools called the Production Console, which is a general framework for job formulation, configuration, replication and management. It is based on Object Oriented technology and designed for use by a collaboration of people with different roles and computing skills. An application is built from a series of simple building blocks: modules, steps and workflows. The Production Console also provides a GUI which is used to formulate a distributed application in the form of platform-independent Python code. The system is written in C++ and based on Qt, ensuring portability across operating systems (currently Linux and Windows versions exist).
        Speaker: Dr Gennady KUZNETSOV (Rutherford Appleton Laboratory, Didcot)
        Slides
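        The module / step / workflow layering described above can be illustrated as follows. The class and parameter names are invented for this sketch and do not reflect the Production Console API; they only show how small building blocks compose into a full application.
```python
# Illustration of a module / step / workflow layering (invented class names,
# not the Production Console API).
class Module:
    """Smallest building block: one named action with access to shared context."""
    def __init__(self, name, action):
        self.name, self.action = name, action
    def execute(self, context):
        context[self.name] = self.action(context)

class Step:
    """An ordered group of modules, reusable between workflows."""
    def __init__(self, name, modules):
        self.name, self.modules = name, modules
    def execute(self, context):
        for module in self.modules:
            module.execute(context)

class Workflow:
    """A full application: an ordered list of steps plus shared parameters."""
    def __init__(self, steps):
        self.steps = steps
    def run(self, parameters):
        context = dict(parameters)
        for step in self.steps:
            step.execute(context)
        return context

simulate = Step("simulate", [Module("events", lambda c: f"{c['nevents']} evts")])
upload = Step("upload", [Module("replica", lambda c: f"stored {c['events']}")])
print(Workflow([simulate, upload]).run({"nevents": 500}))
```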
      • 86
        PROOF - The Parallel ROOT Facility
        The Parallel ROOT Facility, PROOF, enables the interactive analysis of distributed data sets in a transparent way. It exploits the inherent parallelism in data of uncorrelated events via a multi-tier architecture that optimizes I/O and CPU utilization in heterogeneous clusters with distributed storage. Being part of the ROOT framework, PROOF inherits the benefits of a performant object-oriented storage system and a wealth of statistical and visualization tools. Dedicated PROOF-enabled testbeds are now being deployed at CERN for testing by the LHC experiments. This paper describes the status of PROOF, focusing mainly on the latest developments: an enriched API providing transparent browsing and drawing of data structures stored remotely, and full handling of the results of queries, with retrieve and archive functionality; support for an asynchronous (non-blocking) running mode, giving the possibility of submitting a set of queries to be processed sequentially in the background; support for disconnect/reconnect-from-any-other-place functionality, allowing the user to temporarily leave a session with running queries and reconnect later to monitor the status and eventually retrieve the results; an improved user interface with a powerful GUI allowing full control over the system and handling of the results; support for dynamic cluster configuration using self-discovery techniques to find the available nodes; and optimized response of the system in multi-user environments, like those foreseen in the forthcoming HEP experiments, with an abstract interface to the most common accounting systems. The ongoing developments to increase the robustness and fault tolerance of the system will also be discussed.
        Speaker: Gerardo GANIS (CERN)
        Paper
        Slides
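        A typical PROOF session, as seen from a user, follows the pattern sketched below in PyROOT. The master URL, file locations and selector name are placeholders, and the exact calls can differ between ROOT versions; this is meant as an orientation, not a verbatim recipe.
```python
# Sketch of a typical PROOF session driven from PyROOT; master URL, dataset
# files and selector are placeholders, and details vary with the ROOT version.
import ROOT

proof = ROOT.TProof.Open("proof.example.org")   # connect to the PROOF master

chain = ROOT.TChain("Events")                   # tree name in the input files
chain.Add("root://se.example.org//store/data/run1.root")
chain.Add("root://se.example.org//store/data/run2.root")

chain.SetProof()                                # route Process() through PROOF
chain.Process("MySelector.C+")                  # selector is compiled and
                                                # distributed to the workers
# Results (histograms etc.) come back in the selector's output list and can
# be browsed or archived from the client session.
```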
    • Distributed Event production and Processing: DEPP-1 AG 80

      AG 80

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 87
        The LCG Service Challenges - Results from the Throughput Tests and Service Deployment
        The LCG Service Challenges are aimed at achieving the goal of a production quality world-wide Grid that meets the requirements of the LHC experiments in terms of functionality and scale. This talk highlights the main goals of the Service Challenge programme, significant milestones as well as the key services that have been validated in production by the 4 LHC experiments. The LCG Service Challenge programme currently involves both the experiments as well as many sites, including the Tier0, all Tier1s as well as a number of key Tier2s, allowing all primary data flows to be demonstrated. The functionality so far achieved addresses all primary offline Use Cases of the experiments except for analysis, the latter being addressed in the final challenge - scheduled to run from April until September 2006 - prior to delivery of the full production Worldwide LHC Computing Service.
        Speaker: Dr Jamie Shiers (CERN)
      • 88
        Public Resource Computing at CERN - LHC@home
        Public resource computing uses the computing power of personal computers that belong to the general public. LHC@home is a public-resource computing project based on the BOINC (Berkeley Open Infrastructure for Network Computing) platform. BOINC is an open source software system, developed by the team behind SETI@home, that provides the infrastructure to operate a public-resource computing project and run scientific applications in a distributed way. In LHC@home, the first public-resource computing application has been SixTrack, which simulates particles circulating around the Large Hadron Collider (LHC) ring in order to study the long-term stability of the particle orbits. Other high-energy physics applications are being prepared for LHC@home. Currently the system has about 8000 active users and 12000 active hosts, and provides a sustained processing rate of about 3 TFlops. Motivating users is a very important part of this kind of project, and therefore LHC@home provides an attractive screen saver and a credit-based ranking system for the users. The benefits and limitations of the public resource computing approach are explained and the results obtained with LHC@home are presented.
        Speaker: Dr Jukka Klem (Helsinki Institute of Physics HIP)
        Paper
        Slides
      • 89
        Massive data processing for the ATLAS Combined Test Beam
        In 2004, a full slice of the ATLAS detector was tested for 6 months in the H8 experimental area of the CERN SPS, in the so-called Combined Test Beam, with beams of muons, pions, electrons and photons in the range 1 to 350 GeV. Approximately 90 million events were collected, corresponding to a data volume of 4.5 terabytes. The importance of this exercise was two-fold: for the first time the whole ATLAS software suite was used on fully combined real data, and a novel production infrastructure was employed for the reconstruction of the real data as well as for a massive production of simulated events. The talk will focus on the Combined Test Beam production system. The system comprises two components for two distinct tasks: reconstruction of real data and production of simulated samples. Large-scale real data reconstruction was performed twice in 2005 at CERN. In both cases a sizable sample of about 400 good runs, for a total of about 25 million events, was processed. Also in 2005, the Monte Carlo production was performed for the first time on the grid, with the simulation of about 4 million events. Reprocessing of real data, as well as Monte Carlo production on the grid, is already planned for the year 2006.
        Speaker: Dr Frederik Orellana (Institute of Nuclear and Particle Physics, Université de Genève)
        Paper
        Slides
      • 90
        Database Access Patterns in ATLAS Computing Model
        In the ATLAS Computing Model widely distributed applications require access to terabytes of data stored in relational databases. In preparation for data taking, the ATLAS experiment at the LHC has run a series of large-scale computational exercises to test and validate multi-tier distributed data grid solutions under development. We present operational experience in ATLAS database services acquired in large-scale computations run on a federation of grids harnessing the power of more than twenty thousand processors. Among the lessons learned is the increase in fluctuations in database server workloads due to the chaotic nature of grid computations. The observed fluctuations in database access patterns are of a general nature and must be addressed through services enabling dynamic and flexibly managed provisioning of database resources. ATLAS is collaborating with the LCG 3D project and the OSG Edge Services Framework activity in the development of such services. ATLAS database services experience relevant to local CERN data taking operations is also presented including the conditions data flow of ATLAS Combined Test Beam operations, prototype Tier 0 scalability tests and event tag database operations.
        Speaker: A. Vaniachine (ANL)
        Slides
    • Event Processing Applications: EPA-1 AG 76

      AG 76

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 91
        Geant4 in production: status and developments
        Geant4 has become an established tool, in production for the majority of LHC experiments during the past two years, and in use in many other HEP experiments and for applications in medical, space and other fields. Improvements and extensions to its capabilities continue, while its physics modelling is refined and results accumulate for its validation in a variety of uses. An overview of recent developments in diverse areas of the toolkit is discussed. These include developments that enable coupled propagation of charged particles in multiple geometries, improvements and refinements in EM physics, and an overview of the evolution of the physics modelling. The progress in physics performance continues with a validation effort and physics comparisons with data in collaboration with different experiments and user groups. We also briefly review the progress in the validation suites for Geant4 releases.
        Speaker: Dr John Apostolakis (CERN)
        Slides
      • 92
        Update On the Status of the FLUKA Monte Carlo Transport Code
        The FLUKA Monte Carlo transport code is a well-known simulation tool in High Energy Physics. FLUKA is a dynamic tool in the sense that it is being continually updated and improved by the authors. We review the progress achieved since the last CHEP conference on the physics models and some recent applications. From the point of view of hadronic physics, most of the effort is still in the field of nucleus-nucleus interactions, with special emphasis on energies near threshold below 100 MeV/A. The currently available version of FLUKA already includes the internal capability to simulate inelastic nuclear interactions from lab kinetic energies of 100 MeV/A up to the highest accessible energies, by means of the DPMJET-3 event generator for interactions above 5 GeV/A and RQMD for energies below that down to ~100 MeV/A. The new developments concern, at high energy, the embedding of the DPMJET-III generator, which represents a major change with respect to the DPMJET-II structure. This will also allow the code to achieve better consistency between the nucleus-nucleus treatment and the original FLUKA model for hadron-nucleus collisions. Work is also in progress to implement a third event generator model, based on the Master Boltzmann Equation approach, in order to extend the energy range from 100 MeV/A down to the threshold for these reactions. In addition to these extended physics capabilities, the program's input and scoring capabilities are continually being upgraded. In particular we want to mention the upgrades in the geometry packages, now capable of reaching higher levels of abstraction. Work is also proceeding to provide direct import of the FLUKA output files into ROOT for analysis and to deploy a user-friendly GUI input interface. On the application front, FLUKA has been used extensively to evaluate the potential space radiation effects on astronauts for future deep space missions, as well as being adapted for use in the simulation of events in the ALICE detector at the LHC.
        Speaker: Lawrence S. Pinsky (University of Houston)
        Paper
        Slides
      • 93
        GEANT4E: Error propagation for track reconstruction inside the GEANT4 framework
        GEANT4e is a package of the GEANT4 Toolkit that allows a track to be propagated together with its error parameters. It uses the standard GEANT4 code to propagate the track and, for the error propagation, it makes a helix approximation (with the step controlled by the user) using the same equations as GEANT3/GEANE. We present here a first working prototype of the GEANT4e package and compare its results and performance to the GEANE package.
        Speaker: Mr Pedro Arce (Cent.de Investigac.Energeticas Medioambientales y Tecnol. (CIEMAT))
        Paper
        Slides
      • 94
        Recent developments and upgrades to the Geant4 geometry modeler
        The geometry modeler is a key component of the Geant4 toolkit. It has been designed to make the best use of the features provided by the Geant4 simulation toolkit, allowing a natural description of the geometrical structure of complex detectors, from a few volumes up to the hundreds of thousands of volumes of the LHC experiments, as well as human phantoms for medical applications or devices and spacecraft for simulations in the space environment. The established advanced techniques for optimizing tracking in the geometrical model have recently been enhanced and are evolving further to address additional use cases. New geometrical shapes have extended the rich set of primitives available, and new tools help users debug their geometrical setups. The major concepts of the Geant4 geometry modeler will be reviewed, focussing on features introduced in the most recent releases of the Geant4 toolkit.
        Speaker: Dr Gabriele Cosmo (CERN)
        Paper
        Slides
      • 95
        Recent upgrades of the standard electromagnetic physics package in Geant4
        The current status and recent developments of the Geant4 "Standard" electromagnetic package are presented. The design iteration of the package carried out over the last two years is complete; it provides a model-versus-process structure of the code. An internal database of elements and materials based on the NIST databases has also been introduced inside the Geant4 toolkit. The focus of recent activities is on upgrading the physics models and on validating the simulation results. Significant revisions, presented in this work, were made to the ionisation models, the transition radiation models and the multiple scattering models. The evolution of the acceptance suite is also discussed.
        Speaker: Dr Michel Maire (LAPP)
        Paper
        Slides
    • Grid Middleware and e-Infrastructure Operation: GMEO-1 Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 96
        The Open Science Grid
        We report on the status and plans of the Open Science Grid Consortium, an open, shared national distributed facility in the US which supports a multi-disciplinary suite of science applications. More than fifty university and laboratory groups, including 2 in Brazil and 3 in Asia, now have their resources and services accessible to OSG. 16 Virtual Organizations have registered their users to use the infrastructure. The US LHC experiments are depending on the Open Science Grid as the underlying facility in the US as part of the Worldwide LHC Computing Grid. The LIGO Scientific Collaboration, other astrophysics experiments, and the currently running particle physics experiments are actively engaged in moving their legacy systems to the common infrastructure to support their computing needs and extensions. We present our planned program of work to operate, mature and extend the capacities of the OSG. The proposed activities will sustain effective operation of the facility itself, increase the diversity of the applications supported, help new sites join the infrastructure and expand the scale of the fabric of computing and storage resources.
        Speakers: Frank Wuerthwein (UCSD, for the OSG consortium), Ruth Pordes (Fermi National Accelerator Laboratory (FNAL))
        Slides
      • 97
        The Integration Testbed of the Open Science Grid
        We describe the purpose, architectural definition, deployment and operational processes of the Integration Testbed (ITB) of the Open Science Grid (OSG). The ITB has been successfully used to integrate the set of functional interfaces and services required by the OSG Deployment Activity, leading to two major deployments of the OSG grid infrastructure. We discuss the methods and logical architecture used to couple resources and effort from participating VOs to build a common test infrastructure that could be used for validation by site administrators, Grid service providers, and application framework developers. Details of service performance, validation tests, and application performance using the deployed Grid infrastructure are also given.
        Speaker: Robert Gardner (University of Chicago)
      • 98
        The German HEP-Grid initiative
        The German Ministry for Education and Research announced a 100 million euro German e-science initiative focused on Grid computing, e-learning and knowledge management. In a first phase, started in September 2005, the Ministry has made available 17 million euro for D-Grid, which currently comprises six research consortia: five community grids - HEP-Grid (high-energy physics), Astro-Grid (astronomy and astrophysics), Medi-Grid (medical/bioinformatics), C3-Grid (Collaborative Climate Community), In-Grid (engineering applications) - and an integration project providing the horizontal platform for the vertical community grids. After an overview of the D-Grid initiative, we present the research program and first results of the HEP community grid. Though it started rather late compared to similar national grid initiatives, HEP-Grid enables us to learn from earlier developments and to address the gaps we want to focus on in three work packages: data management, automated user support and interactive analysis, all on the Grid.
        Speaker: Dr Peter Malzacher (Gesellschaft fuer Schwerionenforschung mbH (GSI))
        Slides
      • 99
        FermiGrid - Status and Plans
        FermiGrid is a cooperative project across the Fermilab Computing Division and its stakeholders which includes the following 4 key components: Centrally Managed & Supported Common Grid Services, Stakeholder Bilateral Interoperability, Development of OSG Interfaces for Fermilab and Exposure of the Permanent Storage System. The initial goals, current status and future plans for FermiGrid will be presented.
        Speaker: Dr Chadwick Keith (Fermilab)
        Slides
    • Online Computing: OC-1 B333

      B333

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 100
        The ALICE Data-Acquisition Software Framework DATE V5
        The data-acquisition software framework DATE for the ALICE experiment at the LHC has evolved over a period of several years. The latest version, DATE V5, is geared for deployment during the test and commissioning phase. The DATE software is designed to run on several hundred machines installed with Scientific Linux CERN (SLC), to handle the data streams of approximately 400 optical Detector Data Links (DDLs) from the ALICE sub-detectors, and to write full events onto transient/permanent data storage at a rate of up to 1.25 GB/s. DATE V5 consists of a collection of software packages that are responsible for the data flow and its formatting, carrying out the readout, the event building, and the recording. Additional software packages are in charge of the control, the system configuration, the status and error message reporting, the electronic logbook, the data quality and performance monitoring, and the memory management. The interfaces to the Experiment Control System (ECS) and to the High-Level Trigger (HLT) are implemented, whereas the interfaces to the Detector Control System (DCS) and to the Trigger System (TRG) are at the design stage. This paper will present the software architecture of DATE V5, the practical experience acquired at various detector integration setups, and future extensions.
        Speaker: Klaus SCHOSSMAIER (CERN)
        Paper
        Slides
      • 101
        CMS DAQ Event Builder based on Gigabit Ethernet
        The CMS Data Acquisition system is designed to build and filter events originating from approximately 500 data sources from the detector at a maximum Level 1 trigger rate of 100 kHz and with an aggregate throughput of 100 GByte/sec. For this purpose different architectures and switch technologies have been evaluated. Events will be built in two stages: the first stage, the FED Builder, will be based on Myrinet technology and will pre-assemble groups of about 8 data sources. The next stage, the Readout Builder, will perform the building of full events. In the baseline configuration the FED Builders merge events from 8 data sources and forward them to 8 independent Readout Builder slices, each made up of 64 Readout Units. The Readout Units send data to 64 Builder Units that build complete events and send them to PC farms responsible for the High Level Trigger selection. The finalization of the design of the Readout Builder is currently under study in view of the installation and commissioning of the FED Builder and the first slices of the Readout Builder foreseen in early 2007. In this paper we present the prospects of a Readout Builder based on TCP/IP over Gigabit Ethernet. Other Readout Builder architectures that we are considering are also discussed. The results of throughput measurements and scaling performance are outlined, together with preliminary estimates of the final performance. All these studies have been carried out at our test-bed farms, which are made up of a total of 130 dual Xeon PCs interconnected with Myrinet and Gigabit Ethernet networking and switching technologies.
        Speaker: Marco Pieri (University of California, San Diego, San Diego, California, USA)
        Paper
        Slides
      • 102
        The architecture and administration of the ATLAS online computing system
        The needs of the ATLAS experiment at the upcoming LHC accelerator at CERN, in terms of data transmission rates and processing power, require a large cluster of computers (of the order of thousands) administered and exploited in a coherent and optimal manner. Requirements such as stability, robustness and fast recovery in case of failure impose a server-client system architecture with servers distributed in a tree-like structure and clients booted from the network. For security reasons, the system should be accessible only through an application gateway and, also to ensure the autonomy of the system, the network services should be provided internally by dedicated machines in synchronization with the central services of the CERN IT department. The paper describes a small-scale implementation of the system architecture that fits the given requirements and constraints. Emphasis will be put on the mechanisms and tools used to net-boot the clients via the "Boot With Me" project and to synchronize information within the cluster via the "Nile" tool.
        Speaker: Dr marc dobson (CERN)
        Paper
        Slides
      • 103
        The LHCb Online System
        LHCb is one of the four experiments currently under construction at CERN's LHC accelerator. It is a single-arm spectrometer designed to study CP violation in the B-meson system with high precision. This paper will describe the LHCb online system, which consists of three sub-systems: - The Timing and Fast Control (TFC) system, responsible for distributing the clock and trigger decisions together with beam-synchronous commands. The system is based on the CERN RD-12 TTC system and LHCb-specific infrastructure. - The Data Acquisition (DAQ) system, which performs the transfer of the physics data from the front-end electronics to storage via a large CPU farm housing sophisticated trigger software. The DAQ system is based on Gigabit Ethernet throughout, from the front-end electronics to storage. The scale of the system (56 GB/s bandwidth) is very much comparable with the big LHC experiments, even though the number of detector channels and the event size are significantly smaller, due to the high readout rate of 1 MHz. - An integrated Experiment Control System (ECS), responsible for controlling and monitoring the operational state of the entire experimental setup, i.e. the detector hardware as well as all the electronics of the TFC and the DAQ system. The design of the system is guided by simplicity, i.e. identifying components with simple functionalities and connecting them together via simple protocols. The implementation decisions were based on the ideas of using COTS components and links wherever possible and commodity hardware where appropriate. In this paper we will present the design of the system and the status of its implementation and deployment.
        Speaker: Dr Beat Jost (CERN)
        Slides
    • Software Components and Libraries: SCL-1 AG 69

      AG 69

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 104
        Common Application Software for the LHC Experiments
        The Applications Area of the LCG Project is concerned with developing, deploying and maintaining the part of the physics applications software, and associated supporting infrastructure software, that is common among the LHC experiments. This area is managed as a number of specific projects with well-defined policies for coordination between them and with the direct participation of the primary users of the software, the LHC experiments. It has been organized to focus on real experiment needs, and special attention has been given to maintaining open information flow and decision making. The Applications Area has recently started its second phase, and an overview of the recent changes in scope, objectives and organization is presented, in particular the merger of the ROOT and SEAL projects into a single project. In addition, we describe the steps taken to facilitate the long-term support and maintenance of the software products that are currently being developed and that are essential for the operation of the LHC experiments.
        Speaker: Dr Pere Mato (CERN)
        Slides
      • 105
        ATLAS Distributed Database Services Client Library
        In preparation for data taking, the ATLAS experiment has run a series of large-scale computational exercises to test and validate distributed data grid solutions under development. ATLAS experience with prototypes and production systems in the Data Challenges and the Combined Test Beam produced various database connectivity requirements for applications: connection management, online-offline uniformity, server indirection, etc. For example, the dynamics of ATLAS distributed database services requires a single point of control over server indirection - the logical-to-physical database server mapping, which is similar to the logical-to-physical mapping of file names on the grids. To address these requirements we developed, tested and deployed the ATLAS database Client Library. In a heterogeneous distributed database services environment, the ATLAS database Client Library implements a consistent strategy for database server access and serves as a foundation layer for enforcing policies, following rules, establishing best practices and encoding logic to deliver efficient, secure and reliable database connectivity for applications. To provide scalable and robust application access to databases, the client library supports retries, failover, load balancing, etc. (an illustrative sketch follows this entry). To hide the complexity of heterogeneous database technologies, the library is separated into two layers: the outer layer provides management of database drivers, database connections and connection/server lists, while the extensible inner layer is composed of a number of technology-specific database drivers, currently supporting Oracle and MySQL. We present the architecture of the Client Library services integration in the ATLAS software framework Athena and the use of these services by the major ATLAS database applications - the Geometry HVS DB and the Conditions IOV DB. We report on the Client Library integration through the ConnectionService module in the CORAL layer of the common LCG persistency project POOL.
        Speaker: Dr Alexandre Vaniachine (ANL)
        Slides
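        As a rough illustration of the retry/failover pattern described in the abstract above (not the actual Client Library API), the following Python sketch resolves a logical database name to an ordered list of physical replicas and tries each with bounded retries; all names, URLs and the error handling are hypothetical stand-ins.
          import random
          import time

          # Hypothetical logical-to-physical server mapping; in practice this
          # indirection would be provided by a central service, not a constant.
          REPLICAS = {
              "ATLAS_GEOMETRY": [
                  "oracle://server-a.example.org/geomdb",   # hypothetical primary
                  "mysql://mirror-b.example.org/geomdb",    # hypothetical fallback replica
              ],
          }

          def connect_with_failover(logical_name, connect_fn, retries_per_server=3, backoff=2.0):
              """Resolve a logical name to physical replicas and connect with retries."""
              servers = list(REPLICAS[logical_name])
              random.shuffle(servers)                       # crude client-side load balancing
              last_error = None
              for url in servers:
                  for attempt in range(retries_per_server):
                      try:
                          return connect_fn(url)            # driver-specific connect call
                      except OSError as err:                # stand-in for driver connection errors
                          last_error = err
                          time.sleep(backoff * (attempt + 1))
              raise RuntimeError(f"no replica of {logical_name} reachable") from last_error
        A caller would pass its own driver connect function, e.g. conn = connect_with_failover("ATLAS_GEOMETRY", my_driver_connect).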
      • 106
        A Flexible, Distributed Event Level Metadata System for ATLAS
        The ATLAS experiment will deploy an event-level metadata system as a key component of support for data discovery, identification, selection, and retrieval in its multi-petabyte event store. ATLAS plans to use the LCG POOL collection infrastructure to implement this system, which must satisfy a wide range of use cases and must be usable in a widely distributed environment. The system requires flexibility because it is meant to be used at many processing levels by a broad spectrum of applications, including primary reconstruction, creation of physics-group-specific datasets, and event selection or data mining by ordinary physicists at production and personal scales. We use to our advantage the fact that LCG collections support file-based (specifically, ROOT TTree) and relational database implementations. By several measures, the event-level metadata system is the collaboration's most demanding relational database application. The ROOT trees provide a simple mechanism to encapsulate information during collection creation, and the relational tables provide a system for data mining and event selection over larger data volumes. ATLAS also uses the ROOT collections as local indexes when collocated with associated event data. Significant testing has been undertaken during the last year to validate that ATLAS can indeed support an event-level metadata system with a reasonable expectation of scalability. In this paper we discuss the status of the ATLAS event-level metadata system, and related infrastructure for collection building, extraction, and distributed replication.
        Speakers: Caitriana Nicholson (University of Glasgow), Caitriana Nicholson (Unknown), Dr David Malon (ARGONNE NATIONAL LABORATORY)
        Paper
        Slides
      • 107
        The Evolution of Databases in HEP - A Time-Traveler's Tale
        The past decade has been an era of sometimes tumultuous change in the area of Computing for High Energy Physics. This talk addresses the evolution of databases in HEP, starting from the LEP era and the visions presented during the CHEP 92 panel "Databases for High Energy Physics" (D. Baden, B. Linder, R. Mount, J. Shiers). It then reviews the rise and fall of Object Databases as a "one size fits all solution" in the mid to late 90's and finally summarises the more pragmatic approaches that are being taken in the final stages of preparation for LHC data taking. The various successes and failures (depending on one's viewpoint) regarding database deployment during this period are discussed, culminating in the current status of database deployment for the Worldwide LCG.
        Speaker: Dr Jamie Shiers (CERN)
    • Software Tools and Information Systems: STIS-1 AG 77

      AG 77

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 108
        Prototyping an HEP Ontology Using Protégé
        Protégé is a free, open source ontology editor and knowledge-base framework developed at Stanford University (http://protege.stanford.edu/). The application is based on Java, is extensible, and provides a foundation for customized knowledge-based and Semantic Web applications. Protégé supports Frames, XML Schema, RDF(S), and OWL. It provides a "plug and play environment" that makes it a flexible base for rapid prototyping and application development. This paper will describe initial efforts and experience using Protégé to develop an HEP/Particle Physics ontology capable of supporting Semantic Web applications in the HEP domain.
        Speaker: Bebo White (STANFORD LINEAR ACCELERATOR CENTER (SLAC))
      • 109
        Extending FOAF with HEPNames Information for Use in HEP Semantic Web Applications
        The Semantic Web shows great potential in the HEP community as an aggregation mechanism for weakly structured data and a knowledge management tool for acquiring, accessing, and maintaining knowledge within experimental collaborations. FOAF (Friend-Of-A-Friend) (http://www.foaf-project.org/) is an RDFS/OWL ontology (built on some of the fundamental Semantic Web technologies) for expressing information about persons and their relationships. FOAF has become an active collaborative project that has evolved into a flexible and practically used ontology. The HEPNames database (http://www.slac.stanford.edu/spires/hepnames/about.shtml) is a comprehensive and widely used directory of individuals involved in HEP and related fields. The data in HEPNames is compiled from numerous sources, including laboratory directories and user-supplied information. This paper will describe efforts to extend the FOAF profile information with the data present in HEPNames, thereby providing an expanded, machine-readable data format for HEP collaborator information that is understandable to HEP Semantic Web applications (an illustrative sketch follows this entry).
        Speaker: Bebo White (STANFORD LINEAR ACCELERATOR CENTER (SLAC))
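        A minimal sketch of the idea, assuming the third-party rdflib Python package; the hepnames: namespace, its property names and the person below are hypothetical illustrations, not the actual HEPNames schema or data.
          from rdflib import Graph, Literal, Namespace, URIRef
          from rdflib.namespace import FOAF, RDF

          # Hypothetical extension vocabulary for HEP-specific facts.
          HEPNAMES = Namespace("http://example.org/hepnames#")

          g = Graph()
          g.bind("foaf", FOAF)
          g.bind("hepnames", HEPNAMES)

          person = URIRef("http://example.org/people/jdoe")
          g.add((person, RDF.type, FOAF.Person))
          g.add((person, FOAF.name, Literal("Jane Doe")))
          g.add((person, FOAF.mbox, URIRef("mailto:jdoe@example.org")))
          # FOAF extended with directory-style information, in the spirit of the
          # HEPNames integration described above.
          g.add((person, HEPNAMES.experiment, Literal("ATLAS")))
          g.add((person, HEPNAMES.affiliation, Literal("SLAC")))

          print(g.serialize(format="turtle"))   # machine-readable profile for Semantic Web tools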
      • 110
        Towards an Intelligent Agent Component Model
        The objective of this paper is to advance research in component-based software development by including agent-oriented software engineering techniques. Agent-oriented, component-based software development is the next step after object-oriented programming, promising to overcome problems such as reusability and complexity that have not yet been solved adequately with object-oriented programming. Component-based software development addresses the complexity problem by producing software from smaller, typically black-box components. These advantages have also been recognized by industry, and hence a number of commercial component models have been developed. With the introduction of intelligent agents into components, the model will be able to provide advanced tools and a framework for the development of intelligent systems. Although such component models have been developed in the past, they are not very useful when dealing with large multi-agent system applications, simulation and modeling, or distributed applications. We try to provide a first step in the right direction, going beyond merely scratching the surface of software composition.
        Speaker: Mr Deepak Narasimha (VMRF Deemed University)
      • 111
        "Software kernels" - Can we gauge total application performance by inspecting the efficiency of compiled small (but important) software kernels?
        HEP programs commonly have very flat execution profiles, implying that the execution time is spread over many routines/methods. Consequently, compiler optimization should be applied to the whole program and not just a few inner loops. In this talk I nevertheless discuss the value of extracting some of the most solicited routines (relatively speaking) and using them to gauge overall performance and potentially "critique" the code generated by the compiler(s). An initial set of ten C++ routines has been extracted from three HEP packages (ROOT, GEANT4 and CLHEP). One key advantage of this approach is that the selected routines compile and execute in seconds, allowing extensive testing of different platforms, compilers and compiler options (an illustrative sketch follows this entry). The speaker will review the initial selection and show results with GNU gcc and Intel icc on multiple hardware platforms.
        Speaker: Mr Sverre Jarp (CERN)
        Slides
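        A toy sketch of the measurement pattern in Python; the real suite times compiled C++ routines built with gcc/icc, so the kernels below are invented stand-ins and only the per-call timing idea is the point.
          import math
          import timeit

          def rotate_z(x, y, phi):
              """Toy stand-in for a small, frequently called geometry kernel."""
              c, s = math.cos(phi), math.sin(phi)
              return c * x - s * y, s * x + c * y

          KERNELS = {
              "rotate_z": lambda: rotate_z(1.0, 2.0, 0.3),
              "sqrt_sum": lambda: sum(math.sqrt(i) for i in range(100)),
          }

          for name, fn in KERNELS.items():
              n = 100_000
              seconds = timeit.timeit(fn, number=n)        # isolate the kernel cost
              print(f"{name:10s}: {1e9 * seconds / n:8.1f} ns/call")
        Because each kernel runs in seconds, the same harness can be repeated across platforms and build options to spot regressions in generated code.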
    • 15:30
      Tea Break
    • Computing Facilities and Networking: CFN-2 D405

      D405

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 112
        Chimera - a new, fast, extensible and Grid enabled namespace service
        After successfully deploying dCache over the last few years, the dCache team re-evaluated the potential of using dCache for extremely large and heavily used installations. We identified the filesystem namespace module as one of the components which would very likely need a redesign to cope with the expected requirements in the medium-term future. Having presented the initial design of Chimera at CHEP05, we are now able to provide the first fully-fledged prototype implementation, working with existing dCache systems. In addition to an improved performance profile, Chimera provides a wide set of enhanced functionalities. Being fully database oriented, standard SQL queries may be used for administration and monitoring activities instead of being bound to non-optimized file-system commands (an illustrative sketch follows this entry). Moreover, user-customized metadata can be attached to files and directories and may be used in SQL queries as well. Although Chimera comes with an NFS 2/3 interface and an optimized door to the dCache core, other access protocols, e.g. web access, can easily be added using the standard API. Because it talks strict JDBC, Chimera can run against any database providing standard JDBC drivers; we have successfully tested PostgreSQL, MySQL and Oracle. Chimera has been designed and optimized for dCache interactions. Nevertheless, Chimera is independent of the dCache software and may be used as a filesystem namespace provider for other applications as well. To make this possible, all interactions from Chimera to external applications are realized as event callbacks and are therefore freely customizable.
        Speaker: Mr Tigran Mkrtchyan Mkrtchyan (Deutsches Elektronen-Synchrotron DESY)
        Paper
        Slides
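        A minimal sketch of the kind of SQL-level administration query a database-backed namespace makes possible; the table and column names are hypothetical, not the actual Chimera schema, and an in-memory SQLite database stands in for the JDBC-accessible backends (PostgreSQL, MySQL, Oracle).
          import sqlite3

          db = sqlite3.connect(":memory:")
          db.executescript("""
              CREATE TABLE files (pnfsid TEXT PRIMARY KEY, path TEXT, size INTEGER);
              CREATE TABLE file_metadata (pnfsid TEXT, key TEXT, value TEXT);
          """)
          db.execute("INSERT INTO files VALUES ('0001', '/data/run123/raw.root', 2147480000)")
          db.execute("INSERT INTO file_metadata VALUES ('0001', 'run', '123')")
          db.execute("INSERT INTO file_metadata VALUES ('0001', 'quality', 'good')")

          # Monitoring-style query mixing namespace information with user metadata,
          # something plain file-system commands could not answer efficiently.
          rows = db.execute("""
              SELECT f.path, f.size
              FROM files f JOIN file_metadata m USING (pnfsid)
              WHERE m.key = 'quality' AND m.value = 'good' AND f.size > 1e9
          """).fetchall()
          print(rows)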
      • 113
        Quantifying the Digital Divide: A scientific overview of the connectivity of South Asian and African Countries
        The future of computing for HENP applications depends increasingly on how well the global community is connected. With South Asia and Africa accounting for about 36% of the world’s population, the issues of internet/network facilities are a major concern for these regions if they are to successfully partake in scientific endeavors. However, not only is the international bandwidth for these regions low, but also the internal network infrastructure is poor, rendering these regions hard to access for the global HENP community. In turn this makes collaborative research difficult and high performance grid activities essentially impractical. In this paper, we aim to classify the connectivity for academic and research institutions of these regions as a function of time, as seen from within, from outside and between the regions, and draw comparisons with more developed regions. The performance measurements are carried out using the PingER methodology, a lightweight approach based on ICMP ping packets (an illustrative sketch follows this entry). PingER has measurements to sites in over 110 countries that contain over 99% of the world’s Internet connected population and so is well-positioned to characterize the world’s connectivity. These measurements have been successfully used for quantifying, planning, setting expectations for connectivity and for identification of problems. The beneficiaries of this data range from international funding agencies and executive-level planners to network administrators.
        Speaker: Dr Roger Cottrell (Stanford Linear Accelerator Center)
        Paper
        Slides
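        A lightweight round-trip-time probe in the spirit of the PingER methodology, sketched in Python under the assumption of a Linux-style `ping` command and output format; the monitored hosts are placeholders, not actual PingER sites.
          import re
          import statistics
          import subprocess

          HOSTS = ["www.example.org", "www.example.net"]   # stand-ins for monitored sites

          def probe(host, count=5):
              """Send a few ICMP echo requests and summarise RTT and packet loss."""
              out = subprocess.run(["ping", "-c", str(count), host],
                                   capture_output=True, text=True).stdout
              rtts = [float(m) for m in re.findall(r"time=([\d.]+)", out)]
              loss = 1.0 - len(rtts) / count
              return (statistics.median(rtts) if rtts else None), loss

          for host in HOSTS:
              median_rtt, loss = probe(host)
              print(f"{host}: median RTT = {median_rtt} ms, loss = {loss:.0%}")
        Repeating such probes over months, from several vantage points, gives the time series of RTT and loss from which connectivity trends are derived.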
      • 114
        UltraLight: An Ultrascale Information System for Data Intensive Research
        UltraLight is a collaboration of experimental physicists and network engineers whose purpose is to provide the network advances required to enable petabyte-scale analysis of globally distributed data. Current Grid-based infrastructures provide massive computing and storage resources, but are currently limited by their treatment of the network as an external, passive, and largely unmanaged resource. The goals of UltraLight are to: (1) Develop and deploy prototype global services which broaden existing Grid computing systems by promoting the network as an actively managed component, (2) Integrate and test UltraLight in Grid-based physics production and analysis systems currently under development in ATLAS and CMS, (3) Engineer and operate a trans- and intercontinental optical network testbed, including high-speed data caches and computing clusters, with U.S. nodes in California, Illinois, Florida, Michigan and Massachusetts, and overseas nodes in Europe, Asia and Latin America.
        Speaker: Richard Cavanaugh (University of Florida)
        Slides
      • 115
        Storage for the LHC Experiments
        Following on from the LHC experiments’ computing Technical Design Reports, HEPiX, with the agreement of the LCG, formed a Storage Task Force. This group was to: examine the current LHC experiment computing models; attempt to determine the data volumes, access patterns and required data security for the various classes of data, as a function of Tier and of time; consider the current storage technologies, their prices in various geographical regions and their suitability for various classes of data storage; attempt to map the required storage capacities to suitable technologies; and formulate a plan to implement the required storage in a timely fashion. This group met for several months, and can now report on its findings.
        Speaker: Dr Roger JONES (LANCASTER UNIVERSITY)
        Slides
      • 116
        openlab-II: where are we, where are we going?
        The openlab, created three years ago at CERN, was a novel concept: to involve leading IT companies in the evaluation and the integration of cutting-edge technologies or services, focusing on potential solutions for the LCG. The novelty lay in the duration of the commitment (three years during which companies provided a mix of in-kind and in-cash contributions), the level of the contributions and, more importantly, the collaborative nature of the common undertaking, which materialized in the construction of the opencluster. This phase, now called openlab-I, to which five major companies (Enterasys, HP, IBM, Intel and Oracle) contributed, is to be completed at the end of 2005, and a new phase, openlab-II, is starting in 2006. This new three-year period maintains the administrative framework of openlab-I (sufficient duration and level of contribution to permit the pursuit of ambitious goals) and will focus more on grid software, services and standardization. openlab-II comprises two related projects: PCC (the Platform Competence Centre Project) and GIC (the Grid Interoperability Centre Project). These activities can be characterized by a transversal theme: the application of virtualization to grids. The presentation will briefly draw lessons from openlab-I as a collaborative framework for R&D and will summarize the technical achievements. It will then describe the openlab-II projects, their goals, differences and commonalities, and will explain how they could contribute to the formalization of the “Virtualization for Grids” concept.
        Speaker: Mr Francois Fluckiger (CERN)
        Slides
    • Distributed Data Analysis: DDA-2 D406

      D406

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 117
        Grid Data Management: Simulations of LCG 2008
        Simulations have been performed with the grid simulator OptorSim using the expected analysis patterns from the LHC experiments and a realistic model of the LCG at LHC startup, with thousands of user analysis jobs running at over a hundred grid sites. It is shown, first, that dynamic data replication plays a significant role in the overall analysis throughput, in terms of optimising job throughput and reducing network usage; second, that simple file-deletion algorithms such as LRU and LFU are as effective as economic models; third, that site policies which allow all experiments to share resources in a global Grid are more effective in terms of data access time and network usage; and lastly, that dynamic data management applied to user data access patterns in which particular files are accessed more often (characterised by a Zipf power-law function) leads to much improved performance compared to sequential access (an illustrative sketch follows this entry).
        Speaker: Caitriana Nicholson (University of Glasgow)
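        A toy sketch of one ingredient of such a simulation: file requests drawn from a Zipf-like popularity distribution hitting a fixed-size site cache with LRU replacement. The parameters are illustrative and unrelated to the actual OptorSim configuration.
          import itertools
          import random
          from collections import OrderedDict

          N_FILES, CACHE_SIZE, N_REQUESTS, ALPHA = 10_000, 500, 50_000, 1.0
          weights = [1.0 / (rank ** ALPHA) for rank in range(1, N_FILES + 1)]
          cum_weights = list(itertools.accumulate(weights))   # precomputed for fast sampling

          cache, hits = OrderedDict(), 0
          for _ in range(N_REQUESTS):
              f = random.choices(range(N_FILES), cum_weights=cum_weights)[0]   # Zipf-like pick
              if f in cache:
                  hits += 1
                  cache.move_to_end(f)            # refresh LRU position
              else:
                  cache[f] = True                 # "replicate" the file locally
                  if len(cache) > CACHE_SIZE:
                      cache.popitem(last=False)   # evict the least recently used replica

          print(f"hit rate with LRU replication: {hits / N_REQUESTS:.2%}")
        Swapping the eviction rule (e.g. least frequently used, or an economic model) in this loop is the kind of comparison the full simulator performs at grid scale.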
      • 118
        A skimming procedure to handle large datasets at CDF
        The CDF experiment has a new trigger which selects events depending on the significance of the track impact parameters. With this trigger, a sample of events enriched in b and c mesons has been selected; it is used for several important physics analyses, such as the Bs mixing analysis. The size of the dataset is about 20 TBytes, corresponding to an integrated luminosity of 1 fb-1 collected by CDF. CDF has developed a skimming procedure to reduce the dataset by selecting events which contain only B mesons in specific decay modes. The rejected events are almost entirely background, which guarantees that no signal is lost while the processing time is reduced by a factor of 10. This procedure is based on SAM (Sequential Access via Metadata), the CDF data handling system. Each file from the original dataset is read via SAM and processed on the CDF users farm at Fermilab. The outputs are stored and catalogued via SAM in a temporary disk location at Fermilab and then concatenated. The final step consists of copying the output to Italy, where it is stored and catalogued permanently on disks hosted at the Tier 1. These skimmed data are available in Italy for the CDF collaboration, and users can access them via the Italian CDF farm. We will describe the procedure to skim data and concatenate the output, and the method used to ensure that each input file is processed once and only once. The tool to copy data from the users farm to the temporary and permanent disk locations, developed by CDF, consists of user authentication plus a transfer layer. Users allowed to perform the copy are mapped in a gridmap file and authenticated with the Globus Security Infrastructure (GSI). The tool's performance and the use and definition of a remote permanent disk location will be described in detail.
        Speakers: Dr Donatella Lucchesi (INFN Padova), Dr Francesco Delli Paoli (INFN Padova)
      • 119
        Automated recovery of data-intensive jobs in D0 and CDF using SAM
        SAM is a data handling system that provides the Fermilab HEP experiments D0, CDF and MINOS with the means to catalog, distribute and track the usage of their collected and analyzed data. Annually, SAM serves petabytes of data to physics groups performing data analysis, data reconstruction and simulation at various computing centers across the world. Given the volume of the detector data, a typical physics analysis job consumes terabytes of information during several days of running at a job execution site. At any stage of that process, non-systematic failures may occur, leaving a fraction of the original dataset unprocessed. To ensure that the computation request converges to completion, a facility user has to employ a procedure to identify the pieces of data that need to be re-analyzed, in a manner that guarantees completeness without duplication in the final result. Commonly these issues are addressed by analyzing the output of the job. Such an approach is fragile, since it depends critically on the (changeable) output file format, and time-consuming. The approach reported in this article saves the user's time and ensures consistency in the results. We present an automated method that uses SAM data handling to formalize distributed data analysis by defining a transaction-based model of the physics analysis job work cycle, enabling robust recovery of the unprocessed data (an illustrative sketch follows this entry).
        Speaker: Valeria Bartsch (FERMILAB / University College London)
        Paper
        Slides
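        A minimal sketch of the recovery idea: the files to re-submit are the set difference between the dataset definition and the files the data handling system records as successfully consumed. The two query functions are hypothetical stand-ins for catalogue calls, not the SAM API.
          def files_in_dataset(dataset):
              """Stand-in for a catalogue query returning all files of a dataset definition."""
              return {"raw_0001.dat", "raw_0002.dat", "raw_0003.dat", "raw_0004.dat"}

          def files_consumed(project):
              """Stand-in for a query returning files whose processing completed successfully."""
              return {"raw_0001.dat", "raw_0003.dat"}

          def recovery_dataset(dataset, project):
              # Re-submit exactly the unprocessed files: completeness without duplication.
              missing = files_in_dataset(dataset) - files_consumed(project)
              return sorted(missing)

          print(recovery_dataset("bphys-skim-2005", "analysis-project-42"))
        Because the bookkeeping lives in the data handling system rather than in the job output, the recovery step is independent of the output file format.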
      • 120
        Resource Predictors in HEP Applications
        The ATLAS experiment uses a tiered data Grid architecture that enables possibly overlapping subsets, or replicas, of original datasets to be located across the ATLAS collaboration. Many individual elements of these datasets can also be recreated locally from scratch based on a limited number of inputs. We envision a time when a user will want to determine which is more expedient: downloading a replica from a site or recreating it from scratch. To make this determination the user or his agent will need to understand the resources necessary both to recreate the dataset locally and to download any available replicas. We have previously characterized the behavior of ATLAS applications and developed the means to predict the resources necessary to recreate a dataset. This paper presents our efforts first to establish the relationship between various Internet bandwidth probes and observed file transfer performance, and then to implement a software tool that uses data transfer bandwidth predictions and execution time estimates to instantiate a dataset in the shortest time possible (an illustrative sketch follows this entry). We have found that file transfer history is a more useful bandwidth predictor than any instantaneous network probe. Using databases of application performance and file transfer history as predictors, and using a toy model to distribute files and applications, we have tested our software tool on a number of simple Chimera-style DAGs and have realized time savings consistent with our expectations from the toy model.
        Speaker: John Huth (Harvard University)
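        A rough sketch of the download-versus-recreate decision, using a simple mean over recent transfer history as the bandwidth predictor (reflecting the finding that history is more useful than an instantaneous probe); the numbers, units and helper names are purely illustrative.
          def predicted_bandwidth(history_mbps):
              """Predict the achievable transfer rate from recent file-transfer history."""
              recent = history_mbps[-10:]                 # last few observed transfers
              return sum(recent) / len(recent)

          def best_action(size_gb, history_mbps, recreate_seconds):
              # 1 GB ~ 8000 Mb; compare estimated download time with recreation time.
              download_seconds = size_gb * 8000.0 / predicted_bandwidth(history_mbps)
              return "download" if download_seconds < recreate_seconds else "recreate"

          history = [42.0, 55.0, 38.0, 61.0, 47.0]        # Mb/s seen for this site pair
          print(best_action(size_gb=5.0, history_mbps=history, recreate_seconds=3600))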
      • 121
        CMS/ARDA activity within the CMS distributed computing system
        The ARDA project focuses on delivering analysis prototypes together with the LHC experiments. The ARDA/CMS activity delivered a fully functional analysis prototype exposed to a pilot community of CMS users. The current work of integrating key components into the CMS system is described: the activity focuses on providing a coherent monitoring layer where information from diverse sources is aggregated and made available to the services supporting the user activity, providing a coherent global view of the CMS activity. Of particular interest is a high-level service controlling the job execution and implementing experiment-specific policies for efficient error recovery. This control layer is at an advanced prototyping stage; experience in developing and deploying it, together with early user feedback, is presented and discussed.
        Speaker: Dr Julia Andreeva (CERN)
    • Distributed Event production and Processing: DEPP-2 AG 80

      AG 80

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 122
        The ATLAS Computing Model
        The ATLAS Computing Model is under continuous development. Previous exercises focussed on the Tier-0/Tier-1 interactions, with an emphasis on the resource implications and only a high-level view of the data and workflow. The work presented here attempts to describe in some detail the data and control flow from the High Level Trigger farms all the way through to the physics user. The current focus is on the use of TAG databases for access and the use of streaming at various levels to optimise the access patterns. There has also been detailed consideration of the required bandwidth to tape and disk, which then informs storage technology decisions. The modelling draws from the experience of previous and running experiments, but has also been tested in Data Challenges and will be tested in the LCG Service Challenge 4.
        Speaker: Dr Roger JONES (LANCASTER UNIVERSITY)
        Slides
      • 123
        ATLAS Experience on Large Scale Productions on the Grid
        The Large Hadron Collider at CERN will start data acquisition in 2007. The ATLAS (A Toroidal LHC ApparatuS) experiment is preparing for the data handling and analysis via a series of Data Challenges and production exercises to validate its computing model and to provide useful samples of data for detector and physics studies. DC1 was conducted during 2002-03; the main goals were to put in place the production infrastructure for a real worldwide collaborative effort and to gain experience in exercising an ATLAS-wide production model. DC2 was run in the second half of 2004; the main goals were to test a new automated production system and to demonstrate that it could run in a coherent way on three different Grid flavours. DC2 was followed in the first half of 2005 by a new massive production of Monte Carlo data in order to provide the event samples for the ATLAS physics workshop in Rome in June 2005. We discuss the experience of the "Rome production" on the LHC Computing Grid infrastructure, describing its achievements, the improvements with respect to the previous Data Challenge and the problems observed. As a consequence of the observed shortcomings, several improvements are being pursued in the ATLAS LCG/EGEE taskforce. Its activity ranges from testing new developments in the workload management system (bulk submission) to the integration of the VOMS-based authorization model in the production environment and the deployment of the ATLAS data management in the LCG infrastructure.
        Speaker: Dr Gilbert Poulard (CERN)
        Paper
        Slides
      • 124
        Studies with the ATLAS Trigger and Data Acquisition "pre-series" setup
        The ATLAS experiment at the LHC will start taking data in 2007. As preparatory work, a full vertical slice of the final high-level trigger and data acquisition (TDAQ) chain, "the pre-series", has been installed in the ATLAS experimental zone. In the pre-series setup, detector data are received by the readout system and then partially analyzed by the second-level trigger (LVL2). On acceptance by LVL2, all data are passed through the event building (EB) and the event filter (EF) farms; selected events are written to mass storage. The pre-series setup was used to validate the technology and implementation choices by comparing the final ATLAS readout requirements to the results of performance, functionality and stability studies. These results were also used to validate the simulations of the components and subsequently to model the full-size ATLAS system. The model was used to further confirm the ability of the final system to meet the requirements and to obtain indications on the event building rate, the latencies of the various stages, the buffer occupancies of the network switches, etc. This note summarizes these studies together with other optimization investigations, such as the number of application instances per CPU and the choice of network protocols. For the first time, realistic LVL2 and EF algorithms were utilized in such a large and realistic test-bed. Continuous deployment and testing will take place during the assembly of the full ATLAS TDAQ system. The interfacing of the pre-series with one of the sub-detectors has also been successfully tested in the ATLAS experimental zone. We show that all the components which are not running reconstruction algorithms match the final ATLAS requirements. For the others, we calculate the amount of time per event that could be allocated to run these not-yet-finalized algorithms. Based on these calculations, we estimate the computing power necessary for using the present implementation of the ATLAS reconstruction software.
        Speaker: Dr gokhan unel (UNIVERSITY OF CALIFORNIA AT IRVINE AND CERN)
        Paper
        Slides
      • 125
        The Use and Integration of Distributed and Object-Based File-Systems at Brookhaven National Laboratory
        The roles of centralized and distributed storage at the RHIC/USATLAS Computing Facility have been undergoing a redefinition as the size and demands of computing resources continue to expand. Traditional NFS solutions, while simple to deploy and maintain, are marred by performance and scalability issues, whereas distributed software solutions such as PROOF and rootd are application specific, non-POSIX compliant, and do not present a unified namespace. Hardware and software-based storage offer differing philosophies with respect to administration, data access, and how I/O bottlenecks are resolved. Panasas, a clustered, load-balanced storage appliance utilizing an object-based file system, has been key in mitigating the problems inherent in NFS centralized storage. Conversely, distributed software storage implementations such as dCache and xrootd have enabled individual compute nodes to actively participate as a unified “file server”, thus allowing one to reap the benefits of inexpensive hardware without sacrificing performance. This talk will focus on the architecture of these file servers, how they are being utilized, and the specific issues each attempts to address. Furthermore, testing methodologies and expectations will be discussed as they pertain to the evaluation of new file servers.
        Speaker: Robert Petkus (Brookhaven National Laboratory)
        Paper
        Slides
      • 126
        OSG-CAF - A single point of submission for CDF to the Open Science Grid
        The increasing instantaneous luminosity of the Tevatron collider will cause the computing requirements for data analysis and MC production to grow larger than the dedicated CPU resources that will be available. In order to meet future demands, CDF is investing in shared Grid resources. A significant fraction of opportunistic Grid resources will be available to CDF before the LHC era starts, and CDF could greatly benefit from using them. CDF is therefore reorganizing its computing model to be integrated with the new Grid model. In the case of the Open Science Grid (OSG), CDF has extended its CDF Analysis Farm (CAF) infrastructure by using Condor glide-in and Generic Connection Brokering (GCB) to produce a CDF portal to the OSG that has an identical user interface to the CAF infrastructure used for submissions to the CDF dedicated resources, including its semi-interactive monitoring tools. This talk presents the architecture of the OSG-CAF and its current state-of-the-art implementation. We also present the issues we have found in deploying the system, as well as the solutions we adopted to overcome them. Finally, we show our early prototype which uses the OSG opportunistic workload management system and Edge Services Framework to harvest the opportunistically schedulable resources on the OSG in ways that are transparent to the CDF user community.
        Speaker: Matthew Norman (University of California at San Diego)
        Paper
        Slides
      • 127
        GridX1: A Canadian Particle Physics Grid
        GridX1, a Canadian computational Grid, combines the resources of various Canadian research institutes and universities through the Globus Toolkit and the CondorG resource broker (RB). It has been successfully used to run ATLAS and BaBar simulation applications. GridX1 is interfaced to LCG through a RB at the TRIUMF Laboratory (Vancouver), which is an LCG computing element, and ATLAS jobs are routed to Canadian resources. Recently, the BaBar application has also been implemented to run jobs through GridX1, concurrently with ATLAS jobs. Two independent RBs are being used to submit ATLAS and BaBar jobs for an efficient operation of the grid. The status of grid jobs and the resources are monitored using a web-based monitoring system.
        Speaker: Dr Ashok Agarwal (Department of Physics and Astronomy, University of Victoria, Victoria, Canada)
        Paper
        Slides
    • Event Processing Applications: EPA-2 AG 76

      AG 76

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 128
        HepData and JetWeb: HEP Data Archiving and Model Validation
        Accurate modelling of hadron interactions is essential for the precision analysis of data from the LHC. It is therefore imperative that the predictions of Monte Carlos used to model this physics are tested against relevant existing and future measurements. These measurements cover a wide variety of reactions, experimental observables and kinematic regions. To make this process more reliable and easily verifiable, the CEDAR collaboration is developing a set of tools for tuning and validating models of such interactions based on the existing JetWeb automatic event generation system and the HepData data archive. We describe the work that has been done on the migration to a MySQL relational database of the already long established Durham HepData database, which contains an extensive archive of data cross sections. The new user web-based front end is described. We also discuss plans for direct experiment data entry and verification, and the status of the JetWeb system and its direct interface to HepData allowing concurrent validation over as wide a range of measurements as possible.
        Speakers: Dr Andy Buckley (Durham University), Andy Buckley (University of Cambridge)
        Paper
        Slides
      • 129
        Geant4 Acceptance Suite for Key Observables
        The complexity of the Geant4 code requires careful testing of all of its components, especially before major releases. In this talk, we concentrate on the recent development of an automatic suite for testing hadronic physics in high-energy calorimetry applications. The idea is to use a simplified set of hadronic calorimeters, with different beam particle types and various beam energies, and to compare relevant observables between a given reference version of Geant4 and the new candidate version. Only those distributions that are statistically incompatible are printed out and finally inspected by a person to look for possible bugs (an illustrative sketch follows this entry). The suite is made of Python scripts, utilizes the "Statistical Toolkit" for the statistical tests between pairs of distributions, and runs on the Grid to cope with the large amount of CPU needed in a short period of time.
        Speaker: Dr Alberto Ribon (CERN)
        Paper
        Slides
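        A minimal sketch of the comparison step, with SciPy's two-sample Kolmogorov-Smirnov test standing in for the "Statistical Toolkit" and synthetic numbers standing in for the simulated observables; only statistically incompatible distributions are flagged for human inspection.
          import numpy as np
          from scipy.stats import ks_2samp

          rng = np.random.default_rng(1)
          # (reference version, candidate version) samples per observable; synthetic data.
          observables = {
              "visible_energy": (rng.normal(50, 5, 5000), rng.normal(50, 5, 5000)),
              "shower_width":   (rng.normal(10, 2, 5000), rng.normal(10.4, 2, 5000)),
          }

          for name, (reference, candidate) in observables.items():
              stat, p_value = ks_2samp(reference, candidate)
              if p_value < 0.01:   # incompatible at the chosen confidence level
                  print(f"FLAG {name}: p = {p_value:.3g} -> inspect for a possible regression")
              else:
                  print(f"OK   {name}: p = {p_value:.3g}")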
      • 130
        Systematic validation of Geant4 electromagnetic and hadronic models against proton data
        A project is in progress for a systematic, rigorous, quantitative validation of all Geant4 physics models against experimental data, to be collected in a Geant4 Physics Book. Due to the complexity of Geant4 hadronic physics, the validation of Geant4 hadronic models proceeds according to a bottom-up approach (i.e. from the lower energy range up to higher energies): this approach allows establishing the accuracy of individual Geant4 models specific to a given energy range on top of already validated models pertinent to a lower energy. Results are presented concerning the lower hadronic interaction phases, involving nuclear de-excitation and pre-equilibrium (up to 100 MeV). All Geant4 electromagnetic and hadronic physics models, and the pre-packaged physics configurations distributed by the Geant4 Collaboration (PhysicsLists), relevant to this energy range have been included in the validation test. The electromagnetic models are Standard, LowEnergy-ICRU and LowEnergy-Ziegler (Ziegler-1977, Ziegler-1985, Ziegler-2000). The hadronic models for inelastic scattering involve nuclear de-excitation in two variants (default and GEM), Precompound (with and without Fermi break-up), Bertini and Binary Cascade, and parameterised models. The models for elastic scattering under test are the parameterised one and the newly developed Bertini elastic. Various pre-packaged PhysicsLists are also subject to the same validation process. The validation of the Geant4 physics models is performed against experimental data measured with 2% accuracy. The quantitative comparison of simulated and experimental data distributions is performed through a sophisticated goodness-of-fit statistical analysis, including Anderson-Darling and Cramer-von Mises tests. Please note that the speaker name is preliminary; the actual speaker among the authors will be communicated later.
        Speakers: Dr Aatos Heikkinen (HIP), Dr Barbara Mascialino (INFN Genova), Dr Francesco Di Rosa (INFN LNS), Dr Giacomo Cuttone (INFN LNS), Dr Giorgio Russo (INFN LNS), Dr Giuseppe Antonio Pablo Cirrone (INFN LNS), Dr Maria Grazia Pia (INFN GENOVA), Dr Susanna Guatelli (INFN Genova)
        Slides
      • 131
        Simulation for LHC radiation background: optimisation of monitoring detectors and experimental validation
        Monitoring radiation background is a crucial task for the operation of the LHC experiments. A project is in progress at CERN for the optimisation of the radiation monitors for the LHC experiments. A general, flexibly configurable simulation system based on Geant4, designed to assist the engineering optimisation of LHC radiation monitor detectors, is presented. Various detector packaging configurations are studied through their Geant4-based simulation, and their behaviour is compared. A quantitative validation of Geant4 electromagnetic and hadronic models relevant to LHC radiation background monitoring is presented; the results are based on rigorous statistical methods applied to the comparison of Geant4 simulation results and experimental data from dedicated test beams.
        Speakers: Dr Barbara Mascialino (INFN Genova), Dr Federico Ravotti (CERN), Dr Maria Grazia Pia (INFN GENOVA), Dr Maurice Glaser (CERN), Dr Michael Moll (CERN), Dr Riccardo Capra (INFN Genova)
        Slides
      • 132
        Simulation of heavy ion therapy system using Geant4
        Geant4 is a toolkit for simulating the passage of particles through matter based on the Monte Carlo method. Geant4 incorporates many of the available experimental data and theoretical models over a wide energy range, extending its application scope not only to high energy physics but also to medical physics, astrophysics, etc. We have developed a simulation framework for a heavy ion therapy system based on Geant4 to enable detailed treatment planning. A heavy ion beam features high RBE (relative biological effectiveness) and an intense dose delivered at a certain depth (the Bragg peak), allowing unwanted exposure of normal tissue to be suppressed. Pioneering trials of heavy ion therapy carried out in a few countries proved its viability, triggering many projects to construct new heavy ion therapy facilities around the world. However, the reactions of heavy ions with matter involve many complex processes compared to the X-ray or electron beams used in traditional radiation therapy, and the development of a new, reliable simulator is essential to determine the beam intensity, energy, size of the radiation field, and so on, required for each case of treatment. Geant4 is a suitable tool for this purpose, being a generalized simulator with a powerful capability to describe complicated geometries. We implemented the heavy ion beam lines of several facilities in Geant4, including their dedicated apparatus, and tested the Geant4 physics processes in comparison with experimental data. We will introduce the simulation framework and present the validation results.
        Speaker: Dr Satoru Kameoka (High Energy Accelerator Research Organisation)
      • 133
        The ATLAS Detector Simulation - an LHC challenge
        The simulation program for the ATLAS experiment at CERN is currently in full operational mode and integrated into ATLAS's common analysis framework, ATHENA. The OO approach, based on GEANT4 and in use during the DC2 data challenge, has been interfaced to ATHENA and to GEANT4 using the LCG dictionaries and Python scripting. The robustness of the application was proved during the DC2 data challenge. The Python interface has added the flexibility, modularity and interactivity that the simulation tool needs to tackle, in a common way, different full ATLAS simulation setups, test beams and cosmic ray studies. The generation, simulation and digitization steps were exercised in performance and robustness tests. Comparison with real data has been possible in the context of the ATLAS Combined Test Beam (2004) and ongoing cosmic ray studies.
        Speaker: Prof. Adele Rimoldi (University of Pavia)
        Paper
        Slides
      • 134
        EvtGen project in ATLAS
        The project "EvtGen in ATLAS" has the aim of accommodating EvtGen into the LHC-ATLAS context. As such it comprises both physics and software aspects of the development. ATLAS has developed interfaces to enable the use of EvtGen within the experiment's object-oriented simulation and data-handling framework ATHENA, and furthermore has enabled the running of the software on the LCG. Modifications have been made to meet the requirements of simulating beauty events centrally produced in proton-proton collisions, with which ATLAS is primarily concerned. Here we review the programme of work, including both software and physics related activities. The results of validation simulations are shown, and future plans are discussed.
        Speaker: Roger Jones (Lancaster University)
        Paper
        Slides
    • Grid Middleware and e-Infrastructure Operation: GMEO-2 Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 135
        GRIF, a Tier2 center for the Paris region
        Several HENP laboratories in the Paris region have joined together to provide an LCG/EGEE Tier2 center. This resource, called GRIF, will focus on the LCG experiments but will also be open to EGEE users from other disciplines and to local users. It will provide resources for both analysis and simulation and offer a large storage space (350 TB planned by the end of 2007). This Tier2 will have its resources distributed over several sites in the Paris region. Our effort in the preliminary phase is focused on the best architecture for making efficient and flexible use of such distributed resources, in particular regarding storage. This talk will present the configuration choices evaluated, the preliminary conclusions regarding both computing resources and storage, and the tools used to provide a consistent configuration and centralized monitoring of the whole resource.
        Speaker: Mr Michel Jouvin (LAL / IN2P3)
        Slides
      • 136
        Grid Computing Research - Its influence on economic and scientific progress for Pakistan
        We present a report on Grid activities in Pakistan over the last three years and conclude that there has been significant technical and economic activity due to the participation in Grid research and development. We started collaborating through participation in the CMS software development group at CERN and Caltech in 2001. This has led to the current setup for CMS production and the LCG Grid deployment in Pakistan. Our research group has been participating actively in the development work of PPDG and OSG, and is now working in close collaboration with Caltech to create the next-generation infrastructure for data intensive science under the Interactive Grid Enabled Environment (IGAE) project within the broader context of the Ultralight collaboration. This collaboration is based on a partnership with the University of the West of England (UWE), UK, under the EU-funded Asia Link programme and with Caltech under an exchange programme of the US State Department. The collaboration on Grid monitoring and digital divide activities with Caltech, using the MonALISA monitoring framework, and with SLAC on Maggie has not only helped to train our faculty and students but has also helped to improve the infrastructure, bandwidth and computing capabilities in Pakistan. The collaboration extends to Wuhan University and BIT China, Comtec Japan and Kyong Hee University Korea for human resource and expertise development programmes in Grid and related areas. The dedicated team of researchers (faculty and students) not only participates in international research activities but also supports industry and advises the government on addressing digital divide issues and internet information access for the population. Our initiative has proved to be a catalyst for other universities in Pakistan, with a growing interest among national universities to collaborate in Grid activities. This activity has strengthened the learning abilities and concepts of our students, who have become first-choice graduate students for universities of international repute. Discussions have already started about the establishment of a National Grid in Pakistan and the means to mobilize resources in basic sciences, government departments and the business community.
        Speaker: Prof. Arshad Ali (National University of Sciences & Technology (NUST) Pakistan)
        Paper
      • 137
        Grid Operations: the evolution of operational model over the first year
        The paper reports on the evolution of the operational model set up in the "Enabling Grids for E-sciencE" (EGEE) project, and on the implications for Grid Operations in the LHC Computing Grid (LCG). The primary tasks of Grid Operations cover monitoring of resources and services, notification of failures to the relevant contacts, and problem tracking through a ticketing system. Moreover, an escalation procedure is enforced to urge the responsible bodies to address and solve the problems. An extensive amount of knowledge has been collected, documented and published in a way that facilitates a rapid resolution of common problems. Initially, the daily operations were performed by only one person at CERN, but the task soon required setting up a small team. The number of sites in production quickly expanded from 60 to 170 in less than a year. The expansion of the EGEE/LCG infrastructure has led to a distributed workload involving more and more geographically scattered teams. The evolution of both procedures and workflow requires a steady refinement of the tools, which consist of the ticketing system, knowledge database and integration platform and which are used for monitoring and operations management. Since the EGEE/LCG production infrastructure relies on the availability of robust operations mechanisms, it is essential to gradually improve the operational procedures and to track the progress of the tools' ongoing development.
        Speakers: Mr Gilles Mathieu (IN2P3, Lyon), Ms Helene Cordier (IN2P3, Lyon), Mr Piotr Nyczyk (CERN)
        Paper
        Slides
      • 138
        Global Grid User Support: the model and experience in the Worldwide LHC Computing Grid
        The organization and management of user support in a global e-science computing infrastructure such as the Worldwide LHC Computing Grid (WLCG) is one of the challenges of the grid. Given the widely distributed nature of the organization, and the spread of expertise for installing, configuring, managing and troubleshooting the grid middleware services, a standard centralized model could not be deployed in WLCG. We therefore have a central coordination model with distributed expertise and support, able to provide solutions for thousands of grid users and administrators, with a capacity of up to thousands of requests per week. Problem reports can be submitted either through a central portal or through the regional support centers. The central infrastructure has been interfaced to the regional units to allow requests to flow in both directions, from centre to region and vice versa. A first-line support team handles generic grid problems, while specialized units answer middleware, deployment, network, other grid infrastructure and virtual-organization-specific problems. Furthermore, regional centers provide support for local site problems. Whenever the expertise is missing at a regional center, the problem can be forwarded to the central system for solving or for forwarding to an appropriate specialist. The system plays an important role in daily operations support and is therefore interfaced to the grid Core Infrastructure Center (CIC) for grid monitoring and specific virtual organization information. The central portal provides a set of useful services such as a collection of up-to-date and useful documents, a facility for browsing problem reports, a powerful search engine, e-mail access, and a hot-line service. Tutorials for users and supporters are organized regularly by the support training service. In this paper we describe the model and general infrastructure of the Global Grid User Support and provide results from experience, with statistics on the operation of the service.
        Speakers: Dr Flavia Donno (CERN), Dr Marco Verlato (INFN Padova)
        Paper
        Slides
      • 139
        Gridview: A Grid Monitoring and Visualization Tool
        The LHC Computing Grid (LCG) connects hundreds of sites consisting of thousands of components such as computing resources, storage resources, network infrastructure and so on. Various Grid Operation Centres (GOCs) and Regional Operations Centres (ROCs) are set up to monitor the status and operations of the grid. This paper describes Gridview, a Grid Monitoring and Visualization Tool being developed primarily for use at GOCs and ROCs. It can also be used by site administrators and network administrators at various sites to view metrics for their site, and by VO administrators to get a summary of resource availability and usage for their virtual organizations. The objective of this tool is to fetch grid status information and fault data from different sensors and monitoring tools at various sites, archive it in a central database, analyze and summarize it, and display it in graphical form. It is intended to serve as a dashboard (central interface) for status and fault information for the entire grid. The tool is based on the concept of loosely coupled components with independent sensor, transport, archival, analysis and visualization components. The sensors can be LCG information providers or any other monitoring tools, the transport mechanism used is the Relational Grid Monitoring Architecture (R-GMA), and Gridview provides the central archival, analysis and visualization functionality. The architecture of the tool is very flexible and new data sources can easily be added to the system. The first version of Gridview has been deployed and was used extensively for online monitoring of data transfers among grid sites during the LCG Service Challenge 3 (SC3) throughput tests. The paper discusses the architecture, current implementation and future enhancements of this tool. It summarizes the architectural and functional requirements of a monitoring tool for the grid infrastructure.
        Speaker: Mr Rajesh Kalmady (Bhabha Atomic Research Centre)
        Paper
        Slides
      • 140
        GridICE: Requirements, Architecture and Experience of a Monitoring Tool for Grid Systems
        The Grid paradigm enables the coordination and sharing of a large number of geographically dispersed heterogeneous resources contributed by different institutions. These resources are organized into virtual pools and assigned to groups of users. The monitoring of such a distributed and dynamic system raises a number of issues, like the need to deal with administrative boundaries, the heterogeneity of resources, the different types of monitoring information consumers and the various levels of abstraction. In this paper, we describe GridICE, a Grid monitoring system designed to meet the above requirements and to integrate easily with local monitoring systems. It promotes the adoption of de-facto standard Grid Information Service interfaces, protocols and data models. Further, different aggregations and partitions of monitoring data are provided based on the specific needs of different user categories. Being able to start from summary views and to drill down to details, it is possible to verify the composition of virtual pools or to pinpoint the sources of problems. A complete history of monitoring data is also maintained to deal with the need for retrospective analysis. In this paper, we offer the details of the requirements that have driven the design of GridICE and we describe the current architecture and implementation. An important part is devoted to the results of three years of experience in the LHC Computing Grid production environment, and we highlight how GridICE can be used to support VO users, operations and dissemination activities.
        Speaker: Mr Sergio Andreozzi (INFN-CNAF)
        Paper
        Slides
    • Online Computing: OC-2 B333

      B333

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 141
        Unified Software Framework for Upgraded Belle DAQ System
        The Belle experiment, a B-factory experiment at KEK in Japan, is currently taking data with a DAQ system based on FASTBUS readout, switchless event building and a higher-level trigger (HLT) farm. To cope with the higher trigger rate from the sizeable increase in accelerator luminosity expected in the coming years, an upgrade of the DAQ system is in progress. The FASTBUS modules are being replaced with newly developed pipelined readout modules equipped with Linux-operated CPUs, and additional units of the modularized event builder and HLT farm are being added. We have developed a unified software framework for the upgraded DAQ system which can be used at all levels, from the readout modules to the HLT farms. The software is modularized and consists of the following components: a common data processing framework compatible with the offline analysis, point-to-point data transmitter and receiver programs over TCP connections, a ring buffer, an event building module, and a slow control framework. The advantage of having a unified framework is that a software module developed offline can be executed directly at any level of the DAQ, even in the readout modules. This makes the development of the DAQ software much easier. The experience with the unified framework in the partially upgraded Belle DAQ system is presented. A toy sketch of the ring-buffer component follows this entry.
        Speaker: Prof. Ryosuke ITOH (KEK)
        Slides
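        As a purely schematic illustration of one building block listed above, the ring buffer between a receiver and a processing stage, the following toy Python class shows the usual bounded-buffer logic (blocking put/get with wrap-around). It is not the Belle implementation, which is written for the online environment.

        import threading

        class RingBuffer:
            """Toy bounded ring buffer shared by a producer and a consumer."""
            def __init__(self, capacity):
                self.buf = [None] * capacity
                self.capacity = capacity
                self.head = self.tail = self.count = 0
                self.lock = threading.Condition()

            def put(self, event):
                with self.lock:
                    while self.count == self.capacity:   # full: wait for the consumer
                        self.lock.wait()
                    self.buf[self.tail] = event
                    self.tail = (self.tail + 1) % self.capacity
                    self.count += 1
                    self.lock.notify_all()

            def get(self):
                with self.lock:
                    while self.count == 0:               # empty: wait for the producer
                        self.lock.wait()
                    event = self.buf[self.head]
                    self.head = (self.head + 1) % self.capacity
                    self.count -= 1
                    self.lock.notify_all()
                    return event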
      • 142
        FutureDAQ for CBM: Online Event Selection
        At the upcoming Facility for Antiproton and Ion Research (FAIR) at GSI, the Compressed Baryonic Matter (CBM) experiment requires a new architecture of front-end electronics, data acquisition, and event processing. The detector systems of CBM are a Silicon Tracker System (STS), RICH detectors, a TRD, RPCs, and an electromagnetic calorimeter. The envisioned interaction rate of 10 MHz produces a data rate of up to 1 TByte/s. Because of the complexity and variability of the trigger decisions, no common trigger will be applied. Instead, the front-end electronics of all detectors will be self-triggered and their data marked with time stamps. The full data rate must be switched through a high-speed network fabric into a computational network with configurable processing resources for event building and filtering. The decision to select candidate events requires tracking, primary vertex reconstruction, and secondary vertex finding in the STS at the full interaction rate. The essential performance factor is now computational throughput rather than decision latency, which results in a much better utilization of the processing resources, especially in the case of heavy ion collisions with strongly varying multiplicities. The development of key components is supported by the FutureDAQ project of the European Union (FP6 I3HP JRA1). The design and first simulation results of such a DAQ system are presented. A toy sketch of time-stamp-based event building follows this entry.
        Speaker: Dr Hans G. Essel (GSI)
        Paper
        Slides
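        With no common hardware trigger, event building has to associate time-stamped hits by time proximity. The toy sketch below groups a time-sorted hit stream into candidate events whenever the gap between successive time stamps exceeds a coincidence window; the window width and the (timestamp, detector) format are illustrative assumptions, not CBM parameters.

        def build_events(hits, window_ns=100.0):
            """Group a time-sorted stream of (timestamp_ns, payload) hits by time gaps."""
            events, current, last_time = [], [], None
            for t, payload in hits:
                if last_time is not None and t - last_time > window_ns:
                    events.append(current)      # gap larger than the window: close the event
                    current = []
                current.append((t, payload))
                last_time = t
            if current:
                events.append(current)
            return events

        # Three hits close in time followed by an isolated one -> two candidate events.
        print(build_events([(0.0, "STS"), (40.0, "RICH"), (90.0, "TRD"), (1000.0, "STS")]))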
      • 143
        Design and Performance of the CDF Experiment Online Control and Configuration Systems
        The CDF experiment's control and configuration system consists of several database applications and supporting application interfaces in both Java and C++. The CDF Oracle database server runs on a SunOS platform and provides configuration data, real-time monitoring information and historical run-conditions archiving. The Java applications, running on the Scientific Linux operating system, implement novel approaches to mapping a relational database onto an object-oriented language, while maintaining efficiency for modifying run configurations in a rapidly changing physics environment. Configuration and conditions data are propagated in real time from the central online system to multiple remote database sites in an effort to provide grid-like support for offline end-user analysis applications. We review details of the design, successes and pitfalls of a complex interoperating configuration system.
        Speaker: Dr William Badgett (Fermilab)
        Slides
      • 144
        Architecture and implementation of the ALICE Data-Acquisition System
        ALICE (A Large Ion Collider Experiment) is the heavy-ion detector designed to study the physics of strongly interacting matter and the quark-gluon plasma at the CERN Large Hadron Collider (LHC). A large-bandwidth and flexible Data Acquisition System (DAQ) is required to collect sufficient statistics in the short running time available per year for heavy ions and to accommodate the very different requirements originating from the large set of detectors and the different beams used. The DAQ system has been designed, implemented, and intensively tested. It has reached maturity and is being installed at the experimental area for tests and commissioning of detectors. It is heavily based on commodity hardware and open-source software, but it also includes specific devices for custom needs. The interaction of thousands of DAQ entities turns out to be the core of this challenging project. We will present the overall ALICE data-acquisition architecture, showing how the data flow is handled from the front-end electronics to permanent data storage. Then some implementation choices (PCs, networks, databases) will be discussed, in particular the usage of tools for controlling and synchronizing the elements of this diversified environment. Practical aspects of deployment and infrastructure running will be covered as well, including the performance tests achieved so far.
        Speaker: Mr Sylvain Chapeland (CERN)
        Paper
        Slides
    • Software Components and Libraries: SCL-2 AG 69

      AG 69

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 145
        Recent Developments in the ROOT I/O and TTrees
        Since version 4.01/03, we have continued to strengthen and improve the ROOT I/O system. In particular we have extended and optimized support for all STL collections, including adding support for member-wise streaming. The handling of TTree objects was also improved by adding support for indexing of chains, for using a bitmap algorithm to speed up searches, and for accessing an SQL table through the TTree interface. We also introduced several new convenient interfaces in the I/O and TTree code to help reduce coding errors. We will describe these new features and their implementation in detail. A short PyROOT sketch of chain indexing follows this entry.
        Speaker: Mr Philippe Canal (FERMILAB)
        Slides
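        The chain-indexing feature mentioned above can be sketched in a few lines of PyROOT. The file pattern, tree name and branch names ("Run", "Event") are hypothetical; the point is that after BuildIndex a specific entry can be fetched directly by its (major, minor) key instead of by a sequential scan.

        import ROOT

        chain = ROOT.TChain("Events")          # hypothetical tree name
        chain.Add("data_*.root")               # hypothetical file pattern

        # Build a (major, minor) index over the whole chain, then jump straight
        # to a given (run, event) pair.
        chain.BuildIndex("Run", "Event")
        nbytes = chain.GetEntryWithIndex(1234, 56789)
        if nbytes > 0:
            print("found run 1234, event 56789")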
      • 146
        CORAL, a software system for vendor-neutral access to relational databases
        The COmmon Relational Abstraction Layer (CORAL) is a C++ software system, developed within the context of the LCG persistency framework, which provides vendor-neutral software access to relational databases with well-defined semantics. The SQL-free public interfaces ensure the encapsulation of all the differences that one may find among the various RDBMS flavours in terms of SQL syntax and data types. CORAL has been developed following a component architecture where the various RDBMS-specific implementations of the interfaces are loaded as plugin libraries at run time whenever required. The system addresses the needs related to the distributed deployment of relational data by providing hooks for client-side monitoring, database service indirection and application-level connection pooling. A schematic illustration of the plugin idea follows this entry.
        Speaker: Dr Ioannis Papadopoulos (CERN, IT Department, Geneva 23, CH-1211, Switzerland)
        Paper
        Slides
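        The component/plugin idea described above can be illustrated with a small Python analogue: a technology-neutral connect() call that selects an RDBMS-specific backend at run time from the connection string. The backend classes here are toy stand-ins invented for the sketch; the real CORAL interfaces are C++ and SQL-free.

        class SQLiteBackend:
            """Toy backend implementing one common interface."""
            def __init__(self, connection_string):
                self.connection_string = connection_string
            def fetch(self, table, columns):
                return "SELECT %s FROM %s   -- sqlite dialect" % (", ".join(columns), table)

        class MySQLBackend(SQLiteBackend):
            def fetch(self, table, columns):
                return "SELECT %s FROM `%s` -- mysql dialect" % (", ".join(columns), table)

        _BACKENDS = {"sqlite": SQLiteBackend, "mysql": MySQLBackend}

        def connect(connection_string):
            """Pick the backend from the technology prefix, e.g. 'mysql://host/db'."""
            technology = connection_string.split("://", 1)[0]
            return _BACKENDS[technology](connection_string)

        session = connect("mysql://dbhost/conditions")
        print(session.fetch("ALIGNMENT", ["id", "payload"]))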
      • 147
        COOL Development and Deployment - Status and Plans
        Since October 2004, the LCG Conditions Database Project has focused on the development of COOL, a new software product for the handling of experiment conditions data. COOL merges and extends the functionality of the two previous software implementations developed in the context of the LCG common project, which were based on Oracle and MySQL. COOL is designed to minimise the duplication of effort by developing a single implementation to support persistency for several relational technologies (Oracle, MySQL and SQLite), based on the POOL Relational Abstraction Layer (RAL) and on the SEAL libraries. The same user code may be used to store data into any one of these backends, as the COOL functionality is encapsulated by a technology-neutral C++ API. After several production releases of the COOL software, the project is now moving into the deployment phase in ATLAS and LHCb, the two experiments that are developing the software in collaboration with CERN IT.
        Speaker: Dr Andrea Valassi (CERN)
        Paper
        Slides
      • 148
        Using multiple persistent technologies in the Condition/DB of BaBar
        The data production and analysis system of the BaBar experiment has evolved through a series of changes since the day the first data were taken in May 1999. The changes, in particular, have also involved the persistent technologies used to store the event data as well as a number of related databases. This talk is about CDB, the distributed Conditions Database of the BaBar experiment. The current production version of the database was deployed in 2002. One of the principles behind the design of CDB was its ability to deal with multiple persistent technologies for storing the data. Originally, CDB was implemented using Objectivity/DB, a commercial OODB. Two new implementations of CDB, based on ROOT I/O and MySQL, are now available. All three are going to coexist and be used in the experiment for a while, targeting various groups of users. This situation poses rather interesting challenges in managing such a hybrid system in the highly distributed environment of the experiment. The talk will cover the key design decisions in the foundation of the database, its flexible API designed to cope with multiple persistent technologies, the information flow (transformation) between the different data formats, and other non-trivial aspects of managing this complex system.
        Speaker: Dr Douglas Smith (STANFORD LINEAR ACCELERATOR CENTER)
        Slides
      • 149
        The LHCb Conditions Database Framework
        The LHCb Conditions Database (CondDB) project aims to provide the necessary tools to handle non-event, time-varying data. The LCG project COOL provides a generic API to handle this type of data, and an interface to it has been integrated into the LHCb framework Gaudi. The interface is based on the Persistency Service infrastructure of Gaudi, allowing the user to load it at run time only if needed. Since condition data vary with time as the events are processed, the condition objects in memory must be kept synchronized with the values in the database for the current event time. A specialized service has been developed, independently of the COOL API interface, to provide an automated and optimized update of the condition objects in memory (a toy sketch of this interval-of-validity caching follows this entry). The High Level Trigger of LHCb is a specialized version of an LHCb reconstruction/analysis program, and as such it will need conditions, like alignments and calibrations, from the conditions database. For performance reasons, the HLT processes running on the Event Filter Farm cannot access the database directly. A special Online implementation of the CondDB service is thus needed, under the supervision of the LHCb Control system.
        Speaker: Marco Clemencic (CERN)
        Paper
        Slides
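        The automated update mentioned above can be pictured as an interval-of-validity cache: a condition is re-read only when the current event time falls outside the validity range of the cached value. The loader callback and the example intervals below are assumptions made for the illustration; the real service works through the COOL API and Gaudi.

        class ConditionCache:
            """Keep a condition object valid for the current event time."""
            def __init__(self, loader):
                self.loader = loader          # loader(time) -> (value, since, until)
                self.value = None
                self.since = self.until = None

            def get(self, event_time):
                outside = self.since is None or not (self.since <= event_time < self.until)
                if outside:
                    self.value, self.since, self.until = self.loader(event_time)
                return self.value

        def fake_loader(t):
            # Pretend alignment constants change every 1000 time units.
            start = int(t // 1000) * 1000
            return ("alignment@%d" % start, start, start + 1000)

        cache = ConditionCache(fake_loader)
        for t in (10, 200, 999, 1001):
            print(t, cache.get(t))            # the loader is called only twice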
      • 150
        ROOT I/O for SQL databases
        ROOT already has a powerful and flexible I/O system, which can potentially be used for the storage of object data in SQL databases. Using ROOT I/O together with an SQL database provides advanced functionality such as guaranteed data integrity, logging of data changes, the possibility to roll back changes, and many other features provided by modern databases. At the same time, the data representation in SQL tables is the main issue. To be able to navigate and access the data from a non-ROOT environment, object data should be presented in native data types (integer, float, text) rather than as a set of BLOBs (binary large objects). A new TSQLFile class will be presented. It implements the standard ROOT TFile interface for the storage of objects in a human-readable format in an SQL database. Different possibilities for the table design will be discussed. A short usage sketch follows this entry.
        Speaker: Dr Sergey Linev (GSI DARMSTADT)
        Paper
        Slides
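        As a hedged PyROOT sketch of the usage pattern described above, TSQLFile is meant to be driven like a TFile but backed by a database. The connection string, credentials and the exact constructor arguments below are assumptions; consult the ROOT class reference of the installed version before relying on them.

        import ROOT

        # Assumed constructor: database URL, open mode, user, password.
        db = ROOT.TSQLFile("mysql://dbhost/test", "recreate", "user", "password")
        hist = ROOT.TH1F("hpx", "example histogram", 100, -4.0, 4.0)
        hist.FillRandom("gaus", 10000)
        hist.Write()        # stored as readable SQL tables rather than one BLOB
        db.Close()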
    • Software Tools and Information Systems: STIS-2 AG 77

      AG 77

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 151
        Web Access to CMS Tracker Monitoring Data with an AJAX Interface
        The CMS tracker has more than 50 million channels organized in 16540 modules, each one being a complete detector. Its monitoring requires the creation, analysis and storage of at least 4 histograms per module every few minutes. The analysis of these plots will be done by computer programs that check the data against reference plots and send alarms to the operator in case of problems. Fast visual access to these plots by the operator is then essential to diagnose and solve the problem. We propose here a graphical interface to these plots similar to the one used by Google Maps to access satellite images (which uses the AJAX technique). We start with a "tracker map", i.e. an image that contains in a single screen a planar visualization of all modules. The operator can zoom on this image, accessing first histogram miniatures drawn on top of the modules and then the full histograms. As in Google Maps, the user can pan the image at any resolution and can switch between different representations showing different kinds of plots.
        Speaker: Mr Giulio Eulisse (Northeastern University, Boston)
      • 152
        ROOT GUI, General Status
        ROOT, as a scientific data analysis framework, provides a large selection of data presentation objects and utilities. The graphical capabilities of ROOT range from 2D primitives to various plots, histograms, and 3D graphical objects. Its object-oriented design and development offer considerable benefits for developing object-oriented user interfaces. The ROOT GUI classes support an extensive and rich set of widgets that allow an easy way to develop cross-platform GUI applications with a common look and feel. The object-oriented, event-driven programming model supports the modern signals/slots communication mechanism. This mechanism is an advanced object communication concept; it largely replaces the concept of callback functions to handle actions in GUIs. Signals and slots are just like any object-oriented methods implemented in C++. ROOT uses its dictionary information and the CINT interpreter to connect signals to slots. The progress of the recent user interface developments in ROOT is presented in this paper. A toy sketch of the signal/slot idea follows this entry.
        Speaker: Fons Rademakers (CERN)
        Paper
        Slides
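        The signal/slot idea can be pictured with a small generic dispatcher: named signals are connected to arbitrary callables, removing the need for explicit callback registration. This toy is written in plain Python purely for illustration; it is not ROOT's TQObject/CINT-based implementation.

        class SignalEmitter:
            """Minimal emitter with named signals connected to callables."""
            def __init__(self):
                self._slots = {}

            def connect(self, signal, slot):
                self._slots.setdefault(signal, []).append(slot)

            def emit(self, signal, *args):
                for slot in self._slots.get(signal, []):
                    slot(*args)

        class Button(SignalEmitter):
            def click(self):
                self.emit("Clicked()")

        def on_click():
            print("button clicked")

        button = Button()
        button.connect("Clicked()", on_click)   # no hand-written callback plumbing
        button.click()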
      • 153
        From Task Analysis to the Application Design
        One of the main design challenges is the task of selecting appropriate Graphical User Interface (GUI) elements and organizing them to successfully meet the application requirements. How to choose and assign the basic user interface elements (so-called widgets, from 'window gadgets') to the single panels of interaction? How to organize these panels into appropriate levels of the application structure? How to map the task sequence to these application levels? How to manage the hierarchy of this structure with respect to different user profiles? What information is absolutely necessary, and what can be left out? The answers to all these questions are the subject of this paper.
        Speaker: Mr Fons Rademakers (CERN)
        Paper
        Slides
      • 154
        ROME - a universally applicable analysis framework generator
        This talk presents a new approach to writing analysis frameworks. We will point out a way of generating analysis frameworks from a short experiment description. The generation process is completely experiment independent and can thus be applied to any event-based analysis. The presentation will focus on a software package called ROME. This software generates analysis frameworks which are fully object oriented and based upon ROOT. The frameworks feature ROOT functionality, SQL database access, socket connections to GUI applications and connections to DAQ systems. ROME is currently used at PSI, Switzerland, Los Alamos National Laboratory, USA and TRIUMF, Canada.
        Speaker: Mr Matthias Schneebeli (Paul Scherrer Institute, Switzerland)
        Paper
        Slides
      • 155
        The LCG SPI project in LCG Phase II
        In the context of the LCG Applications Area, the SPI (Software Process and Infrastructure) project provides several services to the users in the LCG projects and the experiments (mainly at the LHC). These services comprise the CERN Savannah bug-tracking service, the external software service, and services concerning configuration management and application builds, as well as software testing and quality assurance. For the future phase II of the LCG, the scope and organization of the SPI activities will be adjusted, with added emphasis on providing services to the experiments. The current status of the services and their future evolution will be presented.
        Speaker: Dr Andreas Pfeiffer (CERN, PH/SFT)
        Slides
    • Plenary: Plenary 3 Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India

      Plenary Session

      Convener: Kors Bos (NIKHEF)
      • 156
        Next Generation DAQ Systems Auditorium

        Auditorium

        Tata Institute of Fundamental Research

        Homi Bhabha Road Mumbai 400005 India
        Speaker: Dr Beat Jost (CERN)
        Slides
      • 157
        Event Processing Frameworks; a Social and Technical Challenge Auditorium

        Auditorium

        Tata Institute of Fundamental Research

        Homi Bhabha Road Mumbai 400005 India
        Speaker: Dr Elizabeth Sexton-Kennedy (FNAL)
        Minutes
        Slides
      • 158
        LHC-Era data rates in 2004 and 2005 - Experiences of the PHENIX Experiment with a PetaByte of Data Auditorium

        Auditorium

        Tata Institute of Fundamental Research

        Homi Bhabha Road Mumbai 400005 India
        Speaker: Dr Martin Purschke (BNL)
        Slides
    • Poster: Poster 1
    • 10:30
      Tea Break
    • Plenary: Plenary 4 Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India

      Plenary Session

      Convener: Richard Mount (SLAC)
      • 159
        e-Science and Cyberinfrastructure Auditorium

        Auditorium

        Tata Institute of Fundamental Research

        Homi Bhabha Road Mumbai 400005 India
        Speaker: Dr Tony Hey (Microsoft, UK)
        Slides
      • 160
        MySQL and Scalable Databases Auditorium

        Auditorium

        Tata Institute of Fundamental Research

        Homi Bhabha Road Mumbai 400005 India
        Speaker: Dr David Axmark (MySQL)
      • 161
        High Performance Computing - Accelerating Discovery Auditorium

        Auditorium

        Tata Institute of Fundamental Research

        Homi Bhabha Road Mumbai 400005 India
        Speaker: Dr Alan Gara (IBM T. J. Watson Research Center)
    • 12:30
      Lunch Break
    • Computing Facilities and Networking: CFN-3 D405

      D405

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 162
        High End Visualization with Scalable Display System
        Today huge datasets result from computer simulations (CFD, physics, chemistry etc.) and sensor measurements (medical, seismic and satellite). There is exponential growth in the computational requirements of scientific research. Modern parallel computers and the Grid provide the required computational power for the simulation runs. Rich visualization is essential in interpreting the large, dynamic data generated by these simulation runs. The visualization process maps these datasets onto graphical representations and then generates the pixel representation. A large number of pixels shows the picture in greater detail, and interaction with it gives the user greater insight, making it possible to understand the data more quickly, pick out small anomalies that could turn out to be critical, and make better decisions. However, the memory constraints, the lack of rendering power and the display resolution offered by even the most powerful graphics workstation make visualization of this magnitude difficult or impossible. The initiative to develop a high-end visual environment at the Computer Division, BARC explores how to build and use a scalable display system for visually intensive applications by tiling multiple LCD displays driven by a Linux-based PC graphics-rendering cluster. We are using commodity off-the-shelf components such as PCs, PC graphics accelerators, network components and LCD displays. This paper focuses on building an environment which renders and drives over 20 million pixels using an open-source software framework. We describe the software packages developed for such a system and its use to visualize data generated by computational simulations and applications requiring higher intrinsic display resolution and more display space.
        Speaker: Mr Dinesh Sarode (Computer Division, BARC, Mumbai-85, India)
        Paper
        Slides
      • 163
        Performance Analysis of Linux Networking – Packet Receiving
        The computing models for HEP experiments are becoming ever more globally distributed and grid-based, both for technical reasons (e.g., to place computational and data resources near each other and the demand) and for strategic reasons (e.g., to leverage technology investments). To support such computing models, the network and end systems (computing and storage) face unprecedented challenges. One of the biggest challenges is to transfer physics data sets – now in the multi-petabyte range and expected to grow to exabytes within a decade – reliably and efficiently among facilities and computation centers scattered around the world. Both the network and the end systems should be able to provide the capabilities to support high-bandwidth, sustained, end-to-end data transmission. Recent trends in technology show that although the raw transmission speeds used in networks are increasing rapidly, the rate of advancement of microprocessor technology has slowed down over the last couple of years. Therefore, network protocol-processing overheads have risen sharply in comparison with the time spent in packet transmission, resulting in degraded throughput for networked applications. More and more, it is the network end system, rather than the network itself, that is responsible for the degraded performance of network applications. In this paper, the Linux system's packet receive process is studied from NIC to application. We develop a mathematical model to characterize the Linux packet receive process, and key factors that affect Linux systems' network performance are analyzed. A back-of-envelope illustration of the bottleneck idea follows this entry.
        Speaker: Dr Wenji Wu (Fermi National Accelerator Laboratory)
        Paper
        Slides
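        The end-system bottleneck argument above can be illustrated with a back-of-envelope calculation: the loss-free receive rate is set by the slowest stage of the receive path. All per-packet costs below are invented round numbers used only to show the arithmetic; they are not measurements from the paper.

        PACKET_BYTES = 1500

        stage_cost_us = {                 # hypothetical per-packet processing time
            "NIC + DMA ring":     0.4,
            "softirq / IP stack": 1.2,
            "socket buffer copy": 0.8,
            "application read":   2.0,
        }

        bottleneck, cost = max(stage_cost_us.items(), key=lambda item: item[1])
        max_rate_pps = 1e6 / cost                          # packets per second
        max_gbps = max_rate_pps * PACKET_BYTES * 8 / 1e9   # loss-free throughput

        print("bottleneck stage: %s" % bottleneck)
        print("loss-free limit : %.0f kpps, %.1f Gbit/s" % (max_rate_pps / 1e3, max_gbps))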
      • 164
        Network Information and Monitoring Infrastructure (NIMI)
        Fermilab is a high energy physics research lab that maintains a dynamic network which typically supports around 10,000 active nodes. Due to the open nature of the scientific research conducted at FNAL, the portion of the network used to support open scientific research requires high-bandwidth connectivity to numerous collaborating institutions around the world, and must facilitate convenient access by scientists at those institutions. The Network Information and Monitoring Infrastructure (NIMI) is a framework built to help network management personnel and the computer security team monitor and manage the FNAL network. This includes the portions of the network used to support open scientific research as well as the portions for more tightly controlled administrative and scientific support activities. NIMI has been used to build applications such as the Node Directory, the Network Inventory Database and the Computer Security Issue Tracking System (TIssue). These applications have been successfully used by FNAL Computing Division personnel to manage the local network, maintain the necessary level of protection of LAN participants against external threats and promptly respond to computer security incidents. The article will discuss the NIMI structure, the functionality of major NIMI-based applications, the history of the project, its current status and future plans.
        Speaker: Igor Mandrichenko (FNAL)
        Slides
      • 165
        TeraPaths: A QoS Enabled Collaborative Data Sharing Infrastructure for Peta-scale Computing Research
        TeraPaths, a DOE MICS/SciDac funded project, deployed and prototyped the use of differentiated networking services based on a range of new transfer protocols to support the global movement of data in the high energy physics distributed computing environment. While this MPLS/LAN QoS work specifically targets networking issues at BNL, the experience acquired and expertise developed are expected to be more globally applicable to ATLAS and the high-energy physics community in general. TeraPaths is used to dedicate a fraction of the available network bandwidth to ATLAS Tier 1 data movement and limit its disruptive impact on BNL's heavy ion physics program and other more general laboratory network needs. We developed a web-service-based software system to automate the QoS configuration in LAN paths and negotiate network bandwidth with remote network domains on behalf of end users. Our system architecture can easily be integrated with other network management tools to provide a complete end-to-end QoS solution. We demonstrated TeraPaths' effectiveness in data transfer activities at Brookhaven National Laboratory. Our future work will focus on strategically scheduling network resources to shorten the transfer time for mission-critical data relocation, thus reducing the error rates, which are proportional to the transfer time. We will manage network resources which typically span many administrative domains, a unique characteristic compared with CPU and storage resources. The overall goal remains to provide a robust, effective network infrastructure for High Energy and Nuclear Physics.
        Speakers: Dr Dantong Yu (BROOKHAVEN NATIONAL LABORATORY), Dr Dimitrios Katramatos (BROOKHAVEN NATIONAL LABORATORY)
        Paper
        Slides
    • Distributed Data Analysis: DDA-3 D406

      D406

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 166
        ARDA experience in collaborating with the LHC experiments
        The ARDA project focuses on delivering analysis prototypes together with the LHC experiments. Each experiment prototype is in principle independent, but commonalities have been observed. The first level of commonality is represented by mature projects which can be effectively shared among different users. The best example is GANGA, providing a toolkit to organize users' activity, shielding users from execution back-end details (like JDL preparation) and optionally supporting the execution of user applications derived from the experiment frameworks (Athena and DaVinci for ATLAS and LHCb). The second level derives from the observation of commonality among different usages of the Grid: efficient access to resources by individual users, interactivity, robustness and transparent error recovery. High-level services built on top of a baseline layer are frequently needed to fully support specific activities like production and user analysis: these high-level services can be regarded as prototypes of future generic services. The observed commonality and concrete examples of convergence in the HEP community and outside are shown and discussed.
        Speaker: Dr Massimo Lamanna (CERN)
        Paper
        Slides
      • 167
        Evolution of BOSS, a tool for job submission and tracking
        BOSS (Batch Object Submission System) has been developed to provide logging, bookkeeping and real-time monitoring of jobs submitted to a local farm or a grid system. The information is persistently stored in a relational database for further processing. By means of user-supplied filters, BOSS extracts the specific job information to be logged from the standard streams of the job itself and stores it in the database in a structured form that allows easy and efficient access. BOSS has been used since 2002 for CMS Monte Carlo productions and is being re-engineered to satisfy the needs of user analysis in a highly distributed environment. The new architecture introduces the concepts of composite jobs and of job clusters (Tasks) and benefits from a factorization of the monitoring system and the job archive. A minimal stand-in for the filter idea follows this entry.
        Speaker: Mr stuart WAKEFIELD (Imperial College, University of London, London, UNITED KINGDOM)
        Paper
        Slides
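        The user-supplied filter idea described above can be reduced to a minimal stand-in: scan a job's standard output for marker lines and store what they carry as structured rows. The "BOSS key=value" marker format and the SQLite schema below are assumptions made for the sketch, not the actual BOSS conventions.

        import re
        import sqlite3

        MARKER = re.compile(r"^BOSS\s+(\w+)=(.*)$")

        def filter_job_output(job_id, lines, db_path="boss_demo.db"):
            """Store every 'BOSS key=value' line of a job's stdout as a table row."""
            conn = sqlite3.connect(db_path)
            conn.execute("CREATE TABLE IF NOT EXISTS job_info "
                         "(job_id TEXT, key TEXT, value TEXT)")
            for line in lines:
                match = MARKER.match(line.strip())
                if match:
                    conn.execute("INSERT INTO job_info VALUES (?, ?, ?)",
                                 (job_id, match.group(1), match.group(2)))
            conn.commit()
            conn.close()

        filter_job_output("job-42", ["starting...",
                                     "BOSS events_processed=1000",
                                     "BOSS exit_status=0"])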
      • 168
        Interactive Web-based Analysis Clients using AJAX: with examples for CMS, ROOT and GEANT4
        We describe how a new programming paradigm dubbed AJAX (Asynchronous JavaScript and XML) has enabled us to develop highly performant web-based graphics applications. Specific examples are shown of our web clients for the CMS Event Display (real-time Cosmic Challenge), remote detector monitoring with ROOT displays, and performant 3D displays of GEANT4 descriptions of LHC detectors. The web-client performance can be comparable to a local application. Moreover, the web client does not suffer from any of the problems of software packaging, distribution, installation and configuration for multiple platforms. AJAX, which uses a mixture of JavaScript, XML and DHTML, is spreading rapidly, helped by its support from industry leaders such as Google, Yahoo and Amazon. We describe how AJAX improves on the traditional post/reload web-page mechanism by supporting individual updates of sub-components of web pages, and explain how we exploited these design patterns to develop real web-based High Energy Physics applications.
        Speaker: Mr Giulio Eulisse (Northeastern University, Boston)
        Paper
        Slides
      • 169
        DIRAC Infrastructure for Distributed Analysis
        DIRAC is the LHCb Workload and Data Management system for Monte Carlo simulation, data processing and distributed user analysis. Using DIRAC, a variety of resources may be integrated, including individual PCs, local batch systems and the LCG grid. We report here on the progress made in extending DIRAC for distributed user analysis on LCG. In this paper we describe the advances in the workload management paradigm for analysis, with computing resource reservation by means of Pilot Agents (a toy sketch of the late-binding idea follows this entry). This approach allows DIRAC to mask any inefficiencies of the underlying Grid from the user, thus increasing the effective performance of the distributed computing system. The modular design of DIRAC at every level lends the system intrinsic flexibility. A possible strategy for the evolution of the system will be discussed. The DIRAC API consolidates new and existing services and provides a transparent and secure way for users to submit jobs to the Grid. Jobs find their input data by interrogating the LCG File Catalogue, which the LCG Resource Broker also uses to determine suitable destination sites. While it may be exploited directly by users, it also serves as the interface for the GANGA Grid front-end to perform distributed user analysis for LHCb. DIRAC has been successfully used to demonstrate distributed data analysis on LCG for LHCb. The system performance results are presented and the experience gained is discussed.
        Speaker: Mr Stuart Paterson (University of Glasgow / CPPM, Marseille)
        Paper
        Slides
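        The Pilot Agent approach mentioned above is a late-binding, pull model: jobs wait in a central queue, and only a pilot that has started successfully on a worker node pulls a matching job, so broken resources never receive work and their failures stay hidden from the user. The sketch below is entirely schematic, with invented job requirements; it is not the DIRAC workload management implementation.

        import queue

        task_queue = queue.Queue()
        for job_id, requirement in [(1, "slc3"), (2, "slc4"), (3, "slc3")]:
            task_queue.put((job_id, requirement))

        def run_pilot(platform, healthy=True):
            """A pilot checks its node first, then asks the queue for suitable work."""
            if not healthy:
                return None                      # failed pilot: no user job is lost
            skipped = []
            while not task_queue.empty():
                job_id, requirement = task_queue.get()
                if requirement == platform:
                    for item in skipped:         # requeue the jobs we passed over
                        task_queue.put(item)
                    return job_id
                skipped.append((job_id, requirement))
            for item in skipped:
                task_queue.put(item)
            return None

        print(run_pilot("slc3"))                 # -> 1
        print(run_pilot("slc4", healthy=False))  # -> None (job 2 stays queued)
        print(run_pilot("slc4"))                 # -> 2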
    • Distributed Event production and Processing: DEPP-3 AG 80

      AG 80

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 170
        GlideCAF - A Late-binding Approach to the Grid
        The higher instantaneous luminosity of the Tevatron Collider forces large increases in the computing requirements of the CDF experiment, which has to be able to cover the future needs of data analysis and MC production. CDF can no longer afford to rely on dedicated resources to cover all of its needs and is therefore moving toward shared Grid resources. CDF has been relying on a set of CDF Analysis Farms (CAFs): dedicated pools of commodity nodes managed as Condor pools, with a small CDF-specific software stack on top. We have extended this model by using the Condor glide-in mechanism, which allows the creation of dynamic Condor pools on top of existing batch systems without the need to install any additional software. The GlideCAF is essentially a CAF plus the tools needed to keep the dynamic pool alive. All the monitoring tools supported on the dedicated-resource CAFs, including semi-interactive access to the running jobs and detailed monitoring, have been preserved. In this talk, we present the problems we encountered during the implementation of glide-in based Condor pools and the challenges we face in maintaining them. We also show the amount of resources we manage with this technology and how much we have gained through it.
        Speaker: Subir Sarkar (INFN-CNAF)
        Paper
        Slides
      • 171
        A generic approach to job tracking for distributed computing: the STAR approach
        Job tracking, i.e. monitoring bundles of jobs or the behavior of individual jobs from submission to completion, is becoming very complicated in the heterogeneous Grid environment. This paper presents the principles of an integrated tracking solution based on components already deployed at STAR, none of which is experiment specific: a generic logging layer and the STAR Unified Meta-Scheduler (SUMS). The first component is a "generic logging layer" built on top of the logger family derived from the Jakarta "log4j" project, which includes the "log4cxx", "log4c" and "log4perl" packages. These layers provide consistency across packages, platforms and frameworks. SUMS is a "generic" gateway to user batch-mode analysis and allows the user to describe tasks in an abstract job description language (SUMS's architecture was designed around a module plug-and-play philosophy and is therefore not experiment specific). We discuss how the tracking layer uses a unique ID generated by SUMS for each task it handles and for the set of jobs it creates, and how this ID is used for creating and updating job records in the administrative database along with other vital job-related information (a small sketch of this idea follows this entry). Our approach does not require users to introduce any additional key to identify and associate the job with the database tables, as the tree structure of information is handled automatically. Representing (sets of) jobs in a database makes it easy to implement management, scheduling, and query operations, as the user may list all previous jobs and get the details of status, time submitted, started, finished, etc.
        Speaker: Dr Valeri FINE (BROOKHAVEN NATIONAL LABORATORY)
        Slides
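        The idea of a scheduler-generated task ID carried by every log record can be mimicked with Python's standard logging module and a LoggerAdapter; downstream consumers can then associate log lines with job records in the database. The ID format and the console handler are invented for the example; STAR's layer is built on the log4j family instead.

        import logging
        import uuid

        logging.basicConfig(format="%(asctime)s %(taskid)s %(levelname)s %(message)s",
                            level=logging.INFO)

        def make_task_logger(name):
            """Return a logger that stamps every record with a task ID."""
            task_id = "task-%s" % uuid.uuid4().hex[:8]   # stands in for the SUMS ID
            return logging.LoggerAdapter(logging.getLogger(name), {"taskid": task_id})

        log = make_task_logger("sums.job")
        log.info("job submitted")
        log.info("job started on worker node")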
      • 172
        Grid Deployment Experiences: The current state of grid monitoring and information systems.
        As a result of the interoperations activity between the LHC Computing Grid (LCG) and the Open Science Grid (OSG), it was found that the information and monitoring space within these grids is a crowded area with many closed end-to-end solutions that do not interoperate. This paper gives a current overview of the information and monitoring space within these grids and tries to find overlapping areas that could be standardized. The idea of using a single interface and schema for information is investigated, and a solution is presented along with a proposal for using this solution within both LCG and OSG.
        Speaker: Mr Laurence Field (CERN)
        Paper
        Slides
    • Event Processing Applications: EPA-3 AG 76

      AG 76

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 173
        CMS Detector and Physics Simulation
        The CMS simulation based on the Geant4 toolkit and the CMS object-oriented framework has been in production for almost two years and has delivered a total of more than 100 million physics events for the CMS Data Challenges and Physics Technical Design Report studies. The simulation software has recently been successfully ported to the new CMS Event-Data-Model based software framework. In this paper, we present the experience from two years of physics production, the migration process to the new architecture and some newly commissioned features for specific studies (e.g. exotic particles) and different operational scenarios in terms of hit simulation, event mixing and digitization.
        Speaker: Dr Maya Stavrianakou (FNAL)
        Slides
      • 174
        The new ATLAS Fast Track Simulation engine (FATRAS)
        Various systematic physics and detector performance studies with the ATLAS detector require very large event samples. To generate those samples, a fast simulation technique is used instead of the full detector simulation, which often takes too much effort in terms of computing time and storage space. The widely used ATLAS fast simulation program ATLFAST, however, is based on initial four-momentum smearing and does not allow tracking detector studies at the hit level. Alternatively, the new ATLAS Fast Track Simulation engine (FATRAS), which comes intrinsically with the recently developed track extrapolation package, is capable of producing full track information, including hits on track. It is based on the reconstruction geometry and the internal navigation of the track extrapolation package that has been established in the restructured ATLAS offline reconstruction chain. Its modular design allows easy control of the inert material, detector resolutions and acceptance, the magnetic field configuration and the general noise level. The application of the FATRAS simulation in a systematic detector performance study as well as in a physics analysis will be presented.
        Speaker: Mr Andreas Salzburger (UNIVERSITY OF INNSBRUCK)
        Paper
        Paper_PDF
        Slides
      • 175
        GFLASH - parameterized electromagnetic shower in CMS
        An object-oriented package for parameterizing electromagnetic showers in the framework of the Geant4 toolkit has been developed. This parameterization is based on the algorithms of the GFLASH package (implemented in Geant3/FORTRAN), but has been adapted to the new simulation context of Geant4. The package can substitute the full tracking of high energy electrons/positrons (normally from above 800 MeV) inside Geant4 with the probability density function of the shower profile. A mixture of full simulation and fast parameterization is also possible. This new implementation of the GFLASH package leads to a significant gain in simulation time for pp events at 14 TeV at the LHC, without sacrificing too much simulation accuracy, and it can be used for any homogeneous calorimeter. GFLASH has also been included in the GEANT 4.7 release and in the CMS detector simulation OSCAR, which is based on Geant4. Some GFLASH parameters have also been tuned to achieve better agreement with the CMS electromagnetic calorimeter. Comparisons between GFLASH and full simulation in timing and physics performance will be presented as well. The standard longitudinal profile ansatz behind such parameterizations is recalled after this entry.
        Speaker: Joanna Weng (Karlsruhe/CERN)
        eps
        eps
        eps
        Paper
        pdf
        pdf
        Slides
        tex
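        For context, GFLASH-style parameterizations describe the average longitudinal shower profile with a gamma-distribution ansatz (the standard Longo-Sestili / Grindhammer-Peters form); the notation below is generic and does not reproduce the CMS-tuned parameter values:

        \frac{1}{E_0}\,\frac{\mathrm{d}E(t)}{\mathrm{d}t} \;=\; \frac{\beta\,(\beta t)^{\alpha-1}\,e^{-\beta t}}{\Gamma(\alpha)}, \qquad t = x / X_0

        Here t is the shower depth in radiation lengths, and alpha and beta are energy- and material-dependent shape and scaling parameters; the energy deposited in a longitudinal slice is obtained by integrating or sampling this profile instead of tracking every shower particle.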
      • 176
        The ATLAS Event Data Model
        The event data model (EDM) of the ATLAS experiment is presented. For a large collaboration like the ATLAS experiment, common interfaces and data objects are a necessity to ensure easy maintenance and coherence of the experiment's software platform over a long period of time. The ATLAS EDM improves commonality across the detector subsystems and subgroups such as trigger, test beam reconstruction, combined event reconstruction, and physics analysis. Furthermore, the EDM allows the use of common software between online data processing and offline reconstruction. One important task of the EDM group is to provide the know-how and the infrastructure to secure the accessibility of data even after changes to the data model. New processes have been put into place to manage the decoupling of the persistent (on-disk) and transient (in-memory) representations, and to handle requests from developers to change or add to the stored data model.
        Speaker: Dr Edward Moyse (University of Massachusetts)
        Paper
        Paper sources
        Paper sources
        Slides
      • 177
        The New CMS Event Data Model and Framework
        The new CMS Event Data Model and Framework that will be used for the high level trigger, reconstruction, simulation and analysis is presented. The new framework is centered around the concept of an Event. A data processing job is composed of a series of algorithms (e.g., a track finder or track fitter) that run in a particular order. The algorithms communicate only via data stored in the Event (a toy rendition of this rule follows this entry). To facilitate testing, all data items placed in the Event are storable to ROOT I/O using POOL. This allows one to run a partial job (e.g., just track finding) and check the results without having to go through any further processing steps. In addition, the POOL/ROOT files generated by the new framework are directly browsable in ROOT. This allows one to accomplish simple data analyses without any additional tools. More complex studies can be supported in ROOT just by loading the appropriate shared libraries which contain the dictionaries for the stored objects. By taking the time now, before data taking has begun, to re-engineer the core framework, CMS hopes to provide a clean system that will serve it well for the decades to come.
        Speaker: Dr Christopher Jones (CORNELL UNIVERSITY)
        Paper
        Slides
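        The sketch below is a self-contained, deliberately simplified illustration of the pattern described above, in which modules exchange information only through products stored in the Event; all types in it are invented for the example and are not CMS framework classes.

          // Schematic illustration of Event-mediated communication between modules.
          // All types here are invented for the sketch; they are not CMS classes.
          #include <iostream>
          #include <map>
          #include <memory>
          #include <string>
          #include <vector>

          struct Track { double pt; };

          // A toy Event: a keyed store of products, the only channel between modules.
          class Event {
            std::map<std::string, std::shared_ptr<void>> products_;
          public:
            template <class T>
            void put(const std::string& label, std::shared_ptr<T> p) { products_[label] = p; }
            template <class T>
            std::shared_ptr<const T> get(const std::string& label) const {
              return std::static_pointer_cast<const T>(products_.at(label));
            }
          };

          struct TrackFinder {           // produces tracks and stores them in the Event
            void produce(Event& e) const {
              auto tracks = std::make_shared<std::vector<Track>>();
              tracks->push_back({12.5});
              e.put("tracks", tracks);
            }
          };

          struct TrackCounter {          // a later module reads them back by label only
            void analyze(const Event& e) const {
              std::cout << "n(tracks) = " << e.get<std::vector<Track>>("tracks")->size() << "\n";
            }
          };

          int main() {
            Event event;                 // one Event per "collision"
            TrackFinder finder;   finder.produce(event);
            TrackCounter counter; counter.analyze(event);
          }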
    • Grid Middleware and e-Infrastructure Operation: GMEO-3 Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 178
        Operations structure for the management, control and support of the INFN-GRID/Grid.It production infrastructure
        Moving from a National Grid Testbed to a Production quality Grid service for the HEP applications requires an effective operations structure and organization, proper user and operations support, flexible and efficient management and monitoring tools. Moreover the middleware releases should be easily deployable using flexible configuration tools, suitable for various and different local computing farms. The organizational model, the available tools and the agreed procedures for operating the national/regional grid infrastructures that are part of the world-wide EGEE grid as well as the interconnection of the regional operations structures with the global management, control and support structure play a key role for the success of a real production grid. In this paper we describe the operations structure that we are currently using at the Italian Grid Operation and Support Center. The activities described cover monitoring, management and support for the INFN-GRID/Grid.It production grid (spread over more than 30 sites) and its interconnection with the EGEE/LCG structure as well as the roadmap to improve the global support quality, stability and reliability of the Grid service.
        Speaker: Dr Maria Cristina Vistoli (Istituto Nazionale di Fisica Nucleare (INFN))
        Paper
        Slides
      • 179
        Integrating a heterogeneous and shared computer cluster into grids
        Computer clusters at universities are usually shared among many groups. As an example, the Linux cluster at the "Institut fuer Experimentelle Kernphysik" (IEKP), University of Karlsruhe, is shared between working groups of the high energy physics experiments AMS, CDF and CMS, and has successfully been integrated into the SAM grid of CDF and the LHC Computing Grid (LCG) for CMS while it still supports local users. This shared usage of the cluster results in heterogeneous software environments, grid middleware and access policies. Within the LCG, the IEKP site realises the concept of a Tier-2/3 prototype center. The installation procedure and setup of the LCG middleware have been modified according to the local conditions. With this dedicated configuration, the IEKP site offers the full grid functionality such as data transfers, CMS software installation and grid based physics analyses. The need for prioritisation of certain user groups has been satisfied by supporting different virtual organisations. The virtualisation of the LCG components, which can improve the utilisation of resources and security aspects, will be implemented in the near future.
        Speaker: Anja Vest (University of Karlsruhe)
        Paper
        Slides
      • 180
        Grid Deployment Experiences: The interoperations activity between OSG and LCG.
        Open Science Grid (OSG) and LHC Computing Grid (LCG) are two grid infrastructures that were built independently on top of a Virtual Data Toolkit (VDT) core. Due to the demands of the LHC Virtual Organizations (VOs), it has become necessary to ensure that these grids interoperate so that the experiments can seamlessly use them as one resource. This paper describes the work that was necessary to achieve interoperability and the challenges that were faced. The problems overcome and the solutions found are discussed, along with a recommendation on the critical components required for retaining interoperability. The current state of the interoperations activity is described, together with an outline of the future direction.
        Speaker: Mr Laurence Field (CERN)
        Paper
        Slides
      • 181
        Grid Data Management: Reliable File Transfer Services' Performance
        Data management has proved to be one of the hardest jobs to do in a grid environment. In particular, file replication has suffered problems of transport failures, client disconnections, duplication of current transfers and resultant server saturation. To address these problems the Globus and gLite grid middlewares offer new services which improve the resiliency and robustness of file replication on the grid. gLite has the File Transfer Service (FTS) and Globus offers Reliable File Transfer (RFT). Both of these middleware components offer clients a web service interface to which they can submit a request to copy a file from one grid storage element to another. Clients can then return to the web service to query the status of their requested transfer, while the services can schedule, load balance and retry failures among the received requests. In this paper we compare these two services, examining, a) Architecture and features offered to clients and grid infrastructure providers. b) Robustness under load: e.g., when large numbers of clients attempt to connect in a short time or large numbers of transfers are scheduled at once. c) Behaviour under common failure conditions - loss of network connectivity, failure of backend database, sudden client disconnections. Lessons learned in the deployment of gLite FTS during LCG Service Challenge 3 are also discussed. Finally, further development of higher level data management services, including interaction with catalogs in the gLite File Placement Service and the Globus Data Replication Service, is considered.
        Speaker: Dr Graeme A Stewart (University of Glasgow)
        Paper
        Slides
    • Online Computing: OC-3 B333

      B333

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 182
        The ATLAS Data acquisition and High-Level Trigger: concept, design and status.
        The Trigger and Data Acquisition system (TDAQ) of the ATLAS experiment at the CERN Large Hadron Collider is based on a multi-level selection process and a hierarchical acquisition tree. The system, consisting of a combination of custom electronics and commercial products from the computing and telecommunication industry, is required to provide an online selection power of 10^5 and a total throughput in the range of Terabit/sec. The concept and design of the ATLAS TDAQ have been developed to take maximum advantage of the physics nature of very high-energy hadron interactions. The trigger system is implemented to provide a background rejection of one to two orders of magnitude before the events are fully reconstructed. The Region-of-Interest (RoI) mechanism is used to minimise the amount of data needed to calculate the trigger decisions, thus reducing the overall network data traffic considerably. The final system will consist of a few thousand processors, interconnected by multi-layer Gbit Ethernet networks. The selection and data acquisition software has been designed in-house, based on industrial technologies (such as CORBA, CLIPS and Oracle). Software releases are produced on a regular basis and exploited on a number of test beds as well as for detector data taking in test labs and test beams. This paper introduces the basic system requirements and concepts, describes the architecture and design of the system and reports on the actual status of construction. It serves as an introduction to the functionality and performance measurements made on large-scale test systems (LST) and on the TDAQ Pre-series installation, reported in separate papers at this conference.
        Speaker: Dr Benedetto Gorini (CERN)
        Paper
        Slides
      • 183
        Online reconstruction of a TPC as part of the continuous sampling DAQ of the PANDA experiment.
        PANDA is a universal detector system, which is being designed in the scope of the FAIR-Project at Darmstadt, Germany and is dedicated to high precision measurements of hadronic systems in the charm quark mass region. At the HESR storage ring a beam of antiprotons will interact with internal targets to achieve the desired luminosity of 2x10^32 cm^-2 s^-1. The experiment is designed for event rates of up to 2x10^7 s^-1. To cope with such high rates a new concept of data acquisition will be employed: the triggerless continuous sampling DAQ. Currently it is being investigated whether a time projection chamber (TPC) will fulfill the requirements of the central tracking device and consequently what role it will have in the final design of the detector. The proposed TPC would have an expected raw data rate of up to 400 GB/s. Extensive online data processing is needed for data reduction and flexible online event selection. Our goal is to reach a compression factor of the raw data rate of at least 10 by exploiting the known data topology through feature extraction algorithms such as tracklet reconstruction or hit train compression. The full reconstruction of events on the fly is a key technology for the operation of a TPC in continuous mode. This talk describes the conceptual design of the online reconstruction for this detector. Results of prototype algorithms and simulations will be shown.
        Speaker: Mr Sebastian Neubert (Technical University Munich)
        Slides
      • 184
        ALICE High Level Trigger interfaces and data organisation
        The HLT, integrating all major detectors of ALICE, is designed to analyse LHC events online. A cluster of 400 to 500 dual SMP PCs will constitute the heart of the HLT system. To synchronize the HLT with the other online systems of ALICE (Data Acquisition (DAQ), Detector Control System (DCS), Trigger (TRG)), the Experiment Control System (ECS) has to be interfaced. To do so, finite state machines implemented with SMI++ offer an easy way to coordinate the running conditions of the HLT with the other online systems. The mapping of the HLT states has to offer the possibility to run the HLT in stand-alone mode (for commissioning and calibration) as well as controlled by the ECS during the operational stage. After a full reconstruction of an event, the HLT provides trigger decisions, regions-of-interest (ROI) and compressed data to the DAQ in order to reduce the data rate to permanent storage. In addition, the result of the online event reconstruction, organised as ESD (Event Summary Data), has to be classified and indexed for later reference by the offline analysis. This talk will cover the HLT interfaces as well as the structure of the processed data. An illustrative code sketch follows this entry.
        Speaker: Sebastian Robert Bablok (Department of Physics and Technology, University of Bergen, Norway)
        Paper
        Slides
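        The toy state machine below illustrates the idea of mapping ECS commands onto a small set of HLT run states; the states, commands and transitions are purely illustrative, whereas the real system is implemented with SMI++ and a richer state diagram.

          // Minimal finite-state-machine sketch for ECS-driven HLT state handling.
          // State and command names are illustrative only.
          #include <iostream>
          #include <map>
          #include <string>
          #include <utility>

          enum class HltState { Off, Initialized, Ready, Running };

          class HltStateMachine {
            HltState state_ = HltState::Off;
            // allowed transitions: (current state, command) -> next state
            std::map<std::pair<HltState, std::string>, HltState> table_ = {
              {{HltState::Off,         "INITIALIZE"}, HltState::Initialized},
              {{HltState::Initialized, "CONFIGURE"},  HltState::Ready},
              {{HltState::Ready,       "START"},      HltState::Running},
              {{HltState::Running,     "STOP"},       HltState::Ready},
            };
          public:
            bool handle(const std::string& cmd) {    // command arriving from ECS
              auto it = table_.find({state_, cmd});
              if (it == table_.end()) return false;  // reject illegal transition
              state_ = it->second;
              return true;
            }
            HltState state() const { return state_; }
          };

          int main() {
            HltStateMachine hlt;
            for (const char* cmd : {"INITIALIZE", "CONFIGURE", "START"})
              std::cout << cmd << (hlt.handle(cmd) ? " accepted\n" : " rejected\n");
          }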
      • 185
        Status and Performance of the DØ Level 3 Trigger/DAQ System
        DØ, one of two collider experiments at Fermilab's Tevatron, upgraded its DAQ system for the start of Run II. The run started in March 2001, and the DAQ system was fully operational shortly afterwards. The DAQ system is a fully networked system based on Single Board Computers (SBCs) located in VME readout crates which forward their data to a 250 node farm of commodity processors for trigger selection under the control of a routing node interfaced to DØ's lower level hardware triggers. After a slow start, the Tevatron has made a great deal of progress towards its luminosity goals and, recently, has been breaking records almost every month. The Level 3 Trigger/DAQ system has survived the increased demands placed on it by the high luminosity running, but not without some modification and improvements. The basic design, performance, and improvements will be discussed in this report, highlighting the problems, scalability and configuration issues, and solutions we've encountered with the high luminosity running.
        Speaker: Gordon Watts (University of Washington)
        Slides
    • Software Components and Libraries: SCL-3 AG 69

      AG 69

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 186
        Reflex, reflection for C++
        Reflection is the ability of a programming language to introspect and interact with its own data structures at runtime without prior knowledge about them. Many recent languages (e.g. Java, Python) provide this ability inherently but it is lacking in C++. This paper will describe a software package, Reflex, which provides reflection capabilities for C++. Reflex was developed in the context of the LCG Applications Area at CERN. The package tries to comply fully with the ISO/IEC standard for C++, which was taken as the main design guideline. In addition it is light, standalone and non-intrusive towards the user code. This paper will focus on the user API of the package and its underlying design issues, the way to generate reflection information from arbitrary C++ definitions and recent additions. Reflex has been adopted by several projects at CERN, e.g. POOL, RAL, COOL. Recently Reflex started to be integrated with the ROOT data analysis framework, where it will strongly collaborate with the CINT interpreter. An overview of the plans and developments in this area will be discussed. An outlook on possible further developments, e.g. I/O persistency, Python bindings and plugin management, will be given. An illustrative code sketch follows this entry.
        Speaker: Dr Stefan Roiser (CERN)
        Paper
        Slides
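        A minimal usage sketch is given below; it follows the Reflex user API as presented in LCG documentation of the time (namespace and method names may differ between Reflex versions) and assumes that a dictionary for the inspected class has already been generated and loaded.

          // Sketch of runtime introspection with Reflex (names as documented at the
          // time; may not match every release). "className" must have a loaded
          // dictionary.
          #include <iostream>
          #include <string>
          #include "Reflex/Type.h"
          #include "Reflex/Object.h"
          #include "Reflex/Member.h"

          using namespace ROOT::Reflex;

          void inspect(const std::string& className) {
            Type t = Type::ByName(className);       // look up type information by name
            if (!t) { std::cout << "no dictionary for " << className << "\n"; return; }

            // iterate over the data members known to the dictionary
            for (size_t i = 0; i < t.DataMemberSize(); ++i) {
              Member m = t.DataMemberAt(i);
              std::cout << m.TypeOf().Name() << " " << m.Name() << "\n";
            }

            Object obj = t.Construct();             // create an instance generically
            obj.Destruct();                         // and destroy it again
          }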
      • 187
        C++ introspection with JIL
        The JLab Introspection Library (JIL) provides a level of introspection for C++ enabling object persistence with minimal user effort. Type information is extracted from an executable that has been compiled with debugging symbols. The compiler itself acts as a validator of the class definitions while enabling us to avoid implementing an alternate C++ preprocessor to generate dictionary information. The dictionary information is extracted from the executable and stored in an XML format. C++ serializer methods are then generated from the XML. The advantage of this method is that it allows object persistence to be incorporated into existing projects with minimal or, in some cases, no modification of the existing class definitions (e.g. storing only public data members). While motivated by a need to store high volume event-based data, configurable features allow for easy customization making it a powerful tool for a wide variety of projects.
        Speaker: Dr David Lawrence (Jefferson Lab)
        Paper
        Slides
      • 188
        Concepts, Developments and Advanced Applications of the PAX Toolkit
        Physics analyses at modern collider experiments enter a new dimension of event complexity. At the LHC, for instance, physics events will consist of the final state products of the order of 20 simultaneous collisions. In addition, a number of today's physics questions are studied in channels with complex event topologies and configuration ambiguities occurring during event analysis. The Physics Analysis eXpert toolkit (PAX) is a continuously maintained and advanced C++ class collection, specially designed to assist physicists in the analysis of complex scattering processes. PAX allows the definition of an abstraction layer beyond detector reconstruction by providing a generalized, persistent HEP event container with three types of physics objects (particles, vertices and collisions), relation management and a file I/O scheme. The PAX event container is capable of storing the complete information of multi-collision events (including decay trees with spatial vertex information, four-momenta as well as additional reconstruction data). An automated copy functionality for the event container allows the analyst to consistently duplicate event containers for hypothesis evolution, including their physics objects and relations. PAX physics objects can hold pointers to an arbitrary number of instances of arbitrary C++ classes, allowing the analyst to keep track of the data origin within the detector reconstruction software. Further advantages arising from the usage of the PAX toolkit are a unified data model and nomenclature, and therefore increased code lucidity and more efficient team work. The application of the generalized event container provides desirable side-effects, such as protection of the physics analysis code from changes in the underlying software packages and avoidance of code duplication by the possibility of applying the same analysis code to various levels of input data. We summarize the basic concepts and class structure of the PAX toolkit, and report on the developments made for the recent release version (2.00.10). Finally, we present advanced applications of the PAX toolkit, as in use in searches and physics analyses at the Tevatron and LHC. An illustrative code sketch follows this entry.
        Speaker: Dr Steffen G. Kappler (III. Physikalisches Institut, RWTH Aachen university (Germany))
        Paper
        Slides
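        The fragment below is a deliberately simplified, invented data model meant only to illustrate the ideas of an event container with related physics objects and of copying containers for hypothesis evolution; it is not the PAX class interface.

          // Invented, simplified illustration of an event container with particles,
          // vertices and a relation between them. NOT the PAX API.
          #include <string>
          #include <vector>

          struct Vertex   { double x, y, z; };
          struct Particle { std::string name; double px, py, pz, e; int vertexIndex; };

          struct EventContainer {
            std::vector<Vertex>   vertices;
            std::vector<Particle> particles;   // relation kept via vertexIndex
          };

          int main() {
            EventContainer reco;
            reco.vertices.push_back({0.0, 0.0, 0.1});
            reco.particles.push_back({"mu+", 10., 0., 25., 27., 0});

            // Hypothesis evolution: work on an independent copy of the container,
            // e.g. to test a different vertex assignment, leaving the original intact.
            EventContainer hypothesis = reco;        // value semantics give a full copy
            hypothesis.particles[0].vertexIndex = 0; // modify the copy only
          }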
      • 189
        ATLAS Physics Analysis Tools
        The physics program at the LHC includes precision tests of the Standard Model (SM), the search for the SM Higgs boson up to 1 TeV, the search for the MSSM Higgs bosons in the entire parameter space, the search for Supersymmetry, and sensitivity to alternative scenarios such as compositeness, large extra dimensions, etc. This requires general purpose detectors with excellent performance. ATLAS is one such detector under construction for the LHC. Data taking is expected to start in April 2007. The detector performance and the prospects for discoveries are studied in various physics and detector performance working groups. The ATLAS offline computing system includes the development of common tools and of a framework for analysis. Such a development consists of studying the different approaches to the analysis domain, in order to identify commonalities, and to propose a baseline unified framework for analysis in collaboration with the various software and computing groups and the physics and detector performance working groups, integrating feedback from the user community. In this talk, we will review the common tools, event data formats for analysis, and the activities toward the Analysis Model.
        Speakers: Dr Ketevi Adikle Assamagan (Brookhaven National Laboratory), PAT ATLAS (ATLAS)
    • Software Tools and Information Systems: STIS-3 AG 77

      AG 77

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 190
        Worm and Peer To Peer Distribution of ATLAS Trigger & DAQ Software to Computer Clusters
        ATLAS Trigger & DAQ software, with six Gbytes per release, will be installed on about two thousand machines in the final system. Already during the development phase, it is tested and debugged in various Linux clusters of different sizes and network topologies. For the distribution of the software across the network there are, at least, two possible approaches: fixed routing points, and adaptive distribution. The first one has been implemented with the SSH worm Nile. It is a utility to launch connections in a mixture of parallel and cascaded modes, in order to synchronize software repositories incrementally or to execute commands. A system administrator configures, in a single file, the routes for the propagation. Therefore it achieves scalable delivery, as well as being efficiently adapted to the network. The installation of Nile is trivial, since it is able to replicate itself to other computers' memory, being implemented as a worm. Moreover, the utilization of routing and status monitoring protocols, together with an adaptive runtime algorithm to compensate for broken paths, makes it very reliable. The other approach, adaptive distribution, is implemented with peer-to-peer protocols, or P2P. In these solutions, a node interested in a file acts as both client and server for small pieces of the file. The strength of P2P comes from the adaptive algorithm that is run in every peer. Its goal is to maximize the peer's own throughput, and the overall throughput of the network. Hence the network resources are used efficiently, with no configuration effort. The selected tool in this case is BitTorrent. This paper describes tests performed in CERN clusters of 50 to 600 nodes with both technologies and compares the benefits of each.
        Speaker: Hegoi Garitaonandia Elejabarrieta (Instituto de Fisica de Altas Energias (IFAE))
        Paper
        Slides
      • 191
        The Capone Workflow Manager
        We describe the Capone workflow manager which was designed to work for Grid3 and the Open Science Grid. It has been used extensively to run ATLAS managed and user production jobs during the past year but has undergone major redesigns to improve reliability and scalability as a result of lessons learned (cite Prod paper). This paper introduces the main features of the new system covering job management, monitoring, troubleshooting, debugging and job logging. Next, the modular architecture which implements several key evolutionary changes to the system is described: a multi-threaded pool structure, checkpointing mechanisms, and robust interactions with external components, all developed to address scalability and state persistence issues uncovered during operations running of the production system. Finally, we describe the process of delivering production ready tools, provide results from benchmark stress tests, and compare Capone with other workflow managers in use for distributed production systems.
        Speaker: Marco Mambelli (UNIVERSITY OF CHICAGO)
        35-Capone-pdf-v12
        35-Capone-src-v12
        Slides
      • 192
        Testing Grid Software: The development of a distributed screen recorder to enable front end and usability testing
        Ongoing research has shown that testing grid software is complex. Automated testing mechanisms seem to be widely used, but are critically discussed on account of their efficiency and correctness in finding errors. Especially when programming distributed collaborative systems, structures get complex and systems become more error-prone. Past projects done by the authors have shown that the most important part of the tests seems to be tests conducted in a test-bed. However, reconstructing errors was nearly impossible. The researchers have developed a distributed screen recorder as a proof-of-concept, which enables the tester to record screens in different locations. The playback is synchronous and can therefore be used to easily reconstruct moments in time, for example when errors have occurred. Additionally, the screen recorder allows conducting usability tests of distributed applications by recording web cam pictures of the user. The application will make front-end and usability testing of Grid applications easier and more efficient.
        Speaker: Mr Florian Urmetzer (Research Assistant in the ACET centre, The University of Reading, UK)
        Paper
        Slides
      • 193
        The CMS Simulation Validation Suite
        Monte Carlo simulations are a critical component of physics analysis in a large HEP experiment such as CMS. The validation of the simulation software is therefore essential to guarantee the quality and accuracy of the Monte Carlo samples. CMS is developing a Simulation Validation Suite (SVS) consisting of a set of packages associated with the different sub-detector systems: tracker, electromagnetic calorimeter, hadronic calorimeter, and muon detector. The Suite also contains packages to verify detector geometry parameters and the magnetic field. Each package consists of one or more tests running on single particle or collider samples, producing distributions of validation quantities which are checked against reference values. The tests are performed at different levels or modes, verifying everything from basic objects such as "hits" to more complex physics quantities such as resolutions and shower profiles. An illustrative code sketch follows this entry.
        Speaker: Dr Victor Daniel Elvira (Fermi National Accelerator Laboratory (FNAL))
        Paper
        Slides
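        The helper below sketches the kind of check such a validation package can perform with ROOT's built-in statistical tests, comparing a freshly produced distribution against a stored reference; the file and histogram names and the pass/fail threshold are placeholders.

          // Compare a new validation histogram against a reference one using ROOT's
          // chi2 and Kolmogorov-Smirnov tests. Names and thresholds are placeholders.
          #include "TFile.h"
          #include "TH1F.h"
          #include <iostream>

          bool validateHistogram(const char* newFile, const char* refFile,
                                 const char* histName, double minProb = 0.01)
          {
            TFile fNew(newFile), fRef(refFile);
            TH1F* hNew = dynamic_cast<TH1F*>(fNew.Get(histName));
            TH1F* hRef = dynamic_cast<TH1F*>(fRef.Get(histName));
            if (!hNew || !hRef) { std::cerr << "missing histogram\n"; return false; }

            double chi2Prob = hNew->Chi2Test(hRef, "UU");   // p-value of chi2 test
            double ksProb   = hNew->KolmogorovTest(hRef);   // Kolmogorov-Smirnov p-value
            std::cout << histName << ": chi2 p=" << chi2Prob
                      << "  KS p=" << ksProb << std::endl;
            return chi2Prob > minProb && ksProb > minProb;  // simple pass/fail decision
          }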
    • 15:30
      Tea Break
    • Computing Facilities and Networking: CFN-4 D405

      D405

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 194
        apeNEXT: Experiences from Initial Operation
        apeNEXT is the latest generation of massively parallel machines optimized for simulating QCD formulated on a lattice (LQCD). In autumn 2005 the commissioning of several large-scale installations of apeNEXT started, which will provide a total of 15 TFlops of compute power. This fully custom designed computer has been developed by a European collaboration composed of groups from INFN (Italy), DESY (Germany) and CNRS (France). We will give an overview of the system architecture and place particular emphasis on the system software, i.e. the programming environment and the operating system. In this talk we will present and analyze performance numbers and finally report on experiences gained during the first months of machine operation.
        Speaker: Dr Dirk Pleiter (DESY)
        Paper
        Slides
      • 195
        Migration: Surfing on the Wave of Technological Evolution - An ENSTORE Story
        ENSTORE is a very successful petabyte-scale mass storage system developed at Fermilab. Since its inception in the late 1990s, ENSTORE has been serving the Fermilab community, as well as its collaborators, and now holds more than 3 petabytes of data on tape. New data is arriving at an ever increasing rate. One practical issue that we are confronted with is that storage technologies have been evolving at an ever faster pace. New drives and media have been brought to the market constantly with larger capacity, better performance, and lower price. It is not cost effective for a forward looking system to stick with older technologies. In order to keep up with this technological evolution, ENSTORE was in need of a mechanism to migrate data onto newer media. Migrating large quantities of data in a highly available mass storage system does present a technical challenge. An auto-migration scheme was developed in ENSTORE that carries out this task seamlessly, behind the scenes, without interrupting service or requiring much operational attention. After two years in service, auto-migration has lived up to its expectations and ENSTORE has gone through several generations of drives and media. In addition, migration can be used in media copying, media consolidation, and data compaction. In this paper, we are going to present the conceptual design of ENSTORE, the issues in data migration in a highly available mass storage system, the implementation of auto-migration in ENSTORE, our experience and extended applications.
        Speaker: Dr Chih-Hao Huang (Fermi National Accelerator Laboratory)
        Paper
      • 196
        T2K LCG Portal
        A working prototype portal for the LHC Computing Grid (LCG) is being customised for use by the T2K 280m Near Detector software group. This portal is capable of submitting jobs to the LCG and retrieving the output on behalf of the user. The T2K-specific development of the portal will create customised submission systems for the suites of production and analysis software being written by the T2K software team. These software suites are computationally intensive, and therefore warrant utilisation of the LCG. The portal runs on an Apache server with the GridSite module. It is accessed over https, identifying users by their Certificate Authority signed Grid certificate. A user can upload files to the portal, as well as edit them, using the GridSite CGI. Proxy certificates are created on a user's desktop/laptop machine using a JavaWebStart program that does the equivalent of a voms-proxy-init using the user's Grid certificate, and this limited time proxy is then securely put on the portal. Once there, the proxy is available exclusively to that user for submitting jobs to the LCG. The portal may also be used as a joint collaborative site for the experiment. GridSite makes it easy to have joint responsibility for maintaining public web pages spread amongst collaboration members. Other collaborative tools such as diaries and lists of publications and submitted abstracts are also easily implementable.
        Speaker: Dr Gidon Moont (GridPP/Imperial)
        Paper
        Slides
      • 197
        UltraLight: A Managed Network Infrastructure for HEP
        We will describe the networking details of the NSF-funded UltraLight project and report on its status. The project's goal is to meet the data-intensive computing challenges of the next generation of particle physics experiments with a comprehensive, network-focused agenda. The UltraLight network is a hybrid packet- and circuit-switched network infrastructure employing both “ultrascale” protocols such as FAST, and the dynamic creation of optical paths for efficient fair sharing on long range networks in the 10 Gbps range. Instead of treating the network traditionally, as a static, unchanging and unmanaged set of inter-computer links, we are enabling it as a dynamic, configurable, and closely monitored resource, managed end-to-end, to construct a next-generation global system able to meet the data processing, distribution, access and analysis needs of the high energy physics (HEP) community. To enable this capability as broadly as possible we are working closely with core networks like ESNet, Abilene, Canarie, GEANT; related network efforts like Terapaths, Lambda Station, OSCARs, HOPI, USNet, Gloriad; grid/computing research projects like OSG, GriPhyN, iVDGL, DISUN; and both the US ATLAS and US CMS collaborations.
        Speaker: Shawn Mc Kee (High Energy Physics)
        Paper
        Slides
        UltraLight Homepage
      • 198
        Dynamically Forecasting Network Performance of Bulk Data Transfer Applications using Passive Network Measurements
        High Energy and Nuclear Physics (HENP) experiments generate unprecedented volumes of data which need to be transferred, analyzed and stored. This in turn requires the ability to sustain, over long periods, the transfer of large amounts of data between collaborating sites, with relatively high throughput. Groups such as the Particle Physics Data Grid (PPDG) and Globus are developing and deploying tools to meet these needs. An additional challenge is to predict the network performance (TCP/IP end-to-end throughput and latency) of the bulk data transfer applications (bbftp, ftp, scp, GridFTP etc.) without injecting additional test traffic on to the network. These types of forecasts are needed for making scheduling decisions, data replication, replica selection and to provide quality-of-service guarantees in the Grid environment. In this paper, we demonstrate with the help of comparisons that active and passive (NetFlow) measurements are highly correlated. Furthermore, we also propose a technique for application performance prediction using passive network monitoring data without requiring invasive network probes. Our analysis is based on passive monitoring data measured at the site border of a major HENP data source (SLAC). We performed active measurements using iperf and passive (NetFlow) measurements on the same data flows for comparison. We also take into account aggregated throughput for applications using multiple parallel streams. Our results show that active and passive throughput calculations are well-correlated. Our proposed approach to predict the performance of bulk-data transfer applications offers accurate and timely results, while eliminating additional invasive network measurements. An illustrative code sketch follows this entry.
        Speaker: Dr Les Cottrell (Stanford Linear Accelerator Center (SLAC))
        Slides
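        As a simple illustration of forecasting from passive data only, the sketch below applies an exponentially weighted moving average to a series of per-transfer throughput samples such as those derived from NetFlow records; it is one possible estimator, not necessarily the one used by the authors.

          // Forecast the throughput of the next transfer from passively observed
          // samples with an exponentially weighted moving average (EWMA).
          #include <iostream>
          #include <vector>

          class EwmaForecaster {
            double alpha_;       // weight of the newest sample, 0 < alpha <= 1
            double forecast_;
            bool   primed_ = false;
          public:
            explicit EwmaForecaster(double alpha) : alpha_(alpha), forecast_(0.0) {}
            void addSample(double mbps) {
              forecast_ = primed_ ? alpha_ * mbps + (1.0 - alpha_) * forecast_ : mbps;
              primed_ = true;
            }
            double forecast() const { return forecast_; }
          };

          int main() {
            // throughput (MB/s) of successive transfers seen passively at the border
            std::vector<double> samples = {42.0, 45.5, 39.8, 51.2, 48.7};
            EwmaForecaster f(0.3);
            for (double s : samples) f.addSample(s);
            std::cout << "predicted throughput of the next transfer: "
                      << f.forecast() << " MB/s\n";
          }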
    • Distributed Data Analysis: DDA-4 D406

      D406

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 199
        The ATLAS Strategy for Distributed Analysis on several Grid Infrastructures
        The ATLAS strategy follows a service oriented approach to provide Distributed Analysis capabilities to its users. Based on initial experiences with an Analysis service, the ATLAS production system has been evolved to support analysis jobs. As the ATLAS production system is based on several grid flavours (LCG, OSG and Nordugrid), analysis jobs will be supported by specific executors on the different infrastructures (Lexor, CondorG, Panda and Dulcinea). First implementations of some of these executors in the new schema are currently under test, in particular also in the Analysis scenario. While submitting jobs to the overall system will provide seamless access to all ATLAS resources, we also support user analysis by submitting directly to the separate grid infrastructures (Panda at OSG, direct submission to LCG and Nordugrid.) A common job definition system is currently under development that will be supported by all systems. Finally a common user interface project, GANGA, will provide support for the various submission options and will provide a consistent user experience. We will review the status of the different subprojects and report on initial user experiences.
        Speaker: Dr Dietrich Liko (CERN)
        Slides
      • 200
        Prototype of a Parallel Analysis System for CMS using PROOF
        A typical HEP analysis in the LHC experiments involves the processing of data corresponding to several million events, terabytes of information, to be analysed in the last phases. Currently, processing one million events on a single modern workstation takes several hours, thus slowing the analysis cycle. The desirable computing model for a physicist would be closer to a High Performance Computing one, where a large number of CPUs are required for short periods (of the order of several minutes). Where CPU farms are available, parallel computing is an obvious solution to this problem. Here we present the tests along this line using a tool for parallel physics analysis in CMS based on the PROOF libraries. Special attention has been paid in the development of this tool to modularity and ease of use, to enable the possibility of sharing algorithms and simplifying software extensibility while hiding the details of the parallelisation. The first tests performed using a medium size (90 nodes) cluster of dual processor machines on a typical CMS analysis dataset (corresponding to ROOT files for one million top quark pairs producing fully leptonic final state events, distributed uniformly among the computers) show quite promising results on scalability. An illustrative code sketch follows this entry.
        Speaker: Dr Isidro Gonzalez Caballero (Instituto de Fisica de Cantabria (CSIC-UC))
        Paper
        Slides
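        The macro-style sketch below shows how a dataset can be processed in parallel with PROOF from ROOT; the master host name, file names and the MySelector analysis selector are placeholders.

          // Process a chain of ROOT files in parallel via PROOF. The host name,
          // file pattern and selector are placeholders for a real analysis setup.
          #include "TProof.h"
          #include "TChain.h"

          void runParallelAnalysis()
          {
            // connect to the PROOF master (placeholder host name)
            TProof::Open("proofmaster.example.org");

            // build the dataset as a chain of ROOT files (placeholder paths)
            TChain chain("Events");
            chain.Add("ttbar_sample_*.root");

            // route the processing through PROOF and run the user selector
            chain.SetProof();
            chain.Process("MySelector.C+");
          }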
      • 201
        JobMon: A Secure, Scalable, Interactive Grid Job Monitor
        We present the architecture and implementation of a bi-directional system for monitoring long-running jobs on large computational clusters. JobMon comprises an asynchronous intra-cluster communication server and a Clarens web service on a head node, coupled with a job wrapper for each monitored job to provide monitoring information both periodically and upon request. The Clarens web service provides authentication, encryption and access control for any external interaction with individual job wrappers.
        Speaker: Dr Conrad Steenberg (CALIFORNIA INSTITUTE OF TECHNOLOGY)
        Paper
        Slides
      • 202
        CRAB: a tool to enable CMS Distributed Analysis
        CRAB (CMS Remote Analysis Builder) is a tool, developed by INFN within the CMS collaboration, which provides physicists with the possibility to analyze large amounts of data exploiting the huge computing power of grid distributed systems. It is currently used to analyze simulated data needed to prepare the Physics Technical Design Report. Data produced by CMS are distributed among several computing centers, and CRAB allows a generic user, without specific knowledge of the grid infrastructure, to access and analyze those remote data, hiding the complexity of distributed computational services and making job submission and management as simple as in a local environment. The experience gained during the current CMS distributed data analysis effort is reported, along with CRAB ongoing developments. The interaction of CRAB with the current and future CMS Data Management services is described, as well as the usage of WLCG/gLite/OSG middleware to provide access to different grid environments. Finally, the use within CRAB of BOSS for logging, bookkeeping and monitoring is presented.
        Speaker: Mr Marco Corvo (Cnaf and Cern)
        Paper
        Slides
    • Distributed Event production and Processing: DEPP-4 AG 80

      AG 80

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 203
        Data and Computational Grid decoupling in STAR – An Analysis Scenario using SRM Technology
        This paper describes the integration of Storage Resource Management (SRM) technology into the grid-based analysis computing framework of the STAR experiment at RHIC. Users in STAR submit jobs on the grid using the STAR Unified Meta-Scheduler (SUMS) which in turn makes best use of condor-G to send jobs to remote sites. However, the result of each job may be sufficiently large that existing solutions to transfer data back to the initiator site have not proven reliable enough in a user analysis mode or would lock the computing resource (batch slot) while the transfer is in effect. Using existing SRM technology, tailored for optimized and reliable transfer, is the best natural approach for STAR, which is already relying on such technology for massive (bulk) data transfer. When jobs complete the output files are returned to the local site by a 2-step transfer utilizing a Disk Resource Manager (DRM) service running at each site. The first transfer is a local transfer from the worker node (WN) where the job is executed to a DRM cache local to the node, the second transfer is from the WN local DRM cache to the initiator site DRM. The advantages of this method include SRM management of transfers to prevent gatekeeper overload, release of the remote worker node after initiating the second transfer (delegation) so that the computation and data transfer can proceed concurrently, and seamless mass storage access as needed by using a Hierarchical Resource Manager (HRM) to access HPSS.
        Speaker: Dr Eric HJORT (Lawrence Berkeley National Laboratory)
        Paper
        Slides
      • 204
        The LCG based mass production framework of the H1 Experiment
        The H1 Experiment at HERA records electron-proton collisions provided by beam crossings at a frequency of 10 MHz. The detector has about half a million readout channels and the data acquisition allows logging of about 25 events per second with a typical size of 100 kB. The increased event rates after the upgrade of the HERA accelerator at DESY led to a more demanding usage of computing and storage resources. The analysis of these data requires an increased amount of Monte Carlo events. In order to exploit the necessary new resources, which are becoming available via the Grid, the H1 collaboration has therefore started to install a mass production system based on LCG. The H1 mass production system utilizes Perl and Python scripts on top of the LCG tools to steer and monitor the productions. Jobs and their status are recorded in a MySQL database. During autonomous production a daemon launches appropriate scripts, while a web interface can be used for manual intervention. Additional effort has been put into the sandbox environment in which the executable runs on the worker node. This was necessary to work around present weaknesses in the LCG tools, especially in the area of storage management, and to recover automatically from crashes of the executable. The system has proven able to track several hundred jobs, allowing for production rates of more than one million events per day. At the end of 2005 ten sites in five countries are contributing to the production.
        Speaker: Mr Christoph Wissing (University of Dortmund)
        Paper
        Slides
      • 205
        Experience Supporting the Integration of LHC Experiments Computing Systems with the LCG Middleware
        The LHC Computing Grid Project (LCG) provides and operates the computing support and infrastructure for the LHC experiments. In the present phase, the experiments' systems are being commissioned and the LCG Experiment Integration Support team provides support for the integration of the underlying grid middleware with the experiment specific components. The support activity during the experiments' data and service challenges provides valuable information on the infrastructure performance and a deeper understanding of the whole system. Results from the major activities in 2005 are reviewed and a summary of the related activities is given. In addition, the support of non High-Energy Physics communities (Biomed, GEANT4 and UNOSAT) is discussed.
        Speaker: Dr Simone Campana (CERN)
        Paper
        Slides
      • 206
        CMS Monte Carlo Production in the Open Science and LHC Computing Grids
        In preparation for the start of the experiment, CMS must produce large quantities of detailed full-detector simulation. In this presentation we will present the experience with running official CMS Monte Carlo simulation on distributed computing resources. We will present the implementation used to generate events using the LHC Computing Grid (LCG-2) resources in Europe, as well as the implementation using the Open Science Grid (OSG) resources in the U.S. Novel approaches have been deployed that make it possible to run the full CMS production chain on distributed computing resources, from the generation of events to the publication of data for analysis, including all the intermediate steps. The CMS transfer system has been coupled to the LCG production chain to make the tools more robust, significantly improving the performance of the production in LCG. The CMS production has been running on LCG-2 and OSG for several months and an analysis of performance and operational experience will be presented.
        Speaker: Dr Pablo Garcia-Abia (CIEMAT)
        Paper
        Slides
      • 207
        Development of the Monte Carlo Production Service for CMS
        The Monte Carlo Processing Service (MCPS) package is a Python based workflow modelling and job creation package used to realise CMS software workflows and create executable jobs for different environments, ranging from local node operation to wide ranging distributed computing platforms. A component based approach to modelling workflows is taken to allow both executable tasks as well as data handling and management tasks to be included within the workflow. Job creation is controlled so that, regardless of the components used, a common self-contained job sandbox and execution structure is produced, allowing the job to be run on most batch systems via a submission interface. In this presentation we will discuss the architectural choices made in MCPS, the development status, and experiences deploying to both the European and U.S. Grid infrastructure.
        Speaker: Dr Peter Elmer (PRINCETON UNIVERSITY)
      • 208
        Italian Tiers hybrid infrastructure for large scale CMS data handling and challenge operations
        The CMS experiment is travelling its path towards real LHC data handling by building and testing its Computing Model through daily experience of production-quality operations as well as in challenges of increasing complexity. The capability to simultaneously address both these complex tasks on a regional basis - e.g. within INFN - relies on the quality of the developed tools and the related know-how, and on their capability to manage switches between testbed-like and production-like infrastructures, to profit from the configuration flexibility of a unique robust data replication system, and to adapt to evolving scenarios in distributed data access and analysis. The work done in INFN in the operations of the Tier-1 and Tier-2's within event simulation, data distribution and data analysis activities, in daily production-like activities as well as within the LCG Service Challenges testing efforts, is presented and discussed here.
        Speaker: Dr Daniele Bonacorsi - on behalf of CMS Italy Tier-1 and Tier-2's (INFN-CNAF Bologna, Italy)
        Paper
        PAPER: Figure-1
        PAPER: Figure-2
        PAPER in doc format
        Slides
    • Event Processing Applications: EPA-4 AG 76

      AG 76

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 209
        The ALICE Offline framework
        The ALICE Offline framework is now in its 8th year of development and is close to being used for data taking. This talk will provide a short description of the history of AliRoot and then will describe the latest developments. The newly added alignment framework, based on the ROOT geometrical modeller, will be described. The experience with the FLUKA Monte Carlo used for full detector simulation will be reported. AliRoot has also been used extensively for data challenges, and in particular for the parallel and distributed analysis of the generated data. This experience will be described. The talk will also describe the roadmap from now to the initial data taking and the scenario for the usage of AliRoot for early physics at the LHC.
        Speaker: Federico Carminati (CERN)
        Slides
      • 210
        The BESIII Offline Software
        BESIII is a general-purpose experiment for studying electron-positron collisions at BEPCII, which is currently under construction at IHEP, Beijing. The BESIII offline software system is built on the Gaudi architecture. This contribution describes the BESIII-specific framework implementation for offline data processing and physics analysis. We will also present the development status of the simulation and reconstruction algorithms as well as results obtained from physics performance studies. An illustrative code sketch follows this entry.
        Speaker: Dr Weidong Li (IHEP, Beijing)
        Paper
        Slides
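        A skeleton of the kind of algorithm such a Gaudi-based framework schedules is sketched below; it follows the generic Gaudi Algorithm interface (initialize/execute/finalize), with the algorithm name and body as placeholders, and details may differ in the BESIII code base.

          // Skeleton Gaudi algorithm (illustrative; experiment-specific services and
          // event-store access are omitted). The framework calls initialize() once,
          // execute() for every event and finalize() at the end of the job.
          #include "GaudiKernel/Algorithm.h"
          #include "GaudiKernel/MsgStream.h"
          #include <string>

          class TrackCountAlg : public Algorithm {
          public:
            TrackCountAlg(const std::string& name, ISvcLocator* svcLoc)
              : Algorithm(name, svcLoc) {}

            virtual StatusCode initialize() {
              MsgStream log(msgSvc(), name());
              log << MSG::INFO << "initializing" << endreq;
              return StatusCode::SUCCESS;
            }
            virtual StatusCode execute() {
              // ... retrieve reconstructed objects from the event data store here ...
              return StatusCode::SUCCESS;
            }
            virtual StatusCode finalize() { return StatusCode::SUCCESS; }
          };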
      • 211
        A modular Reconstruction Software Framework for the ILC
        The International Linear Collider project ILC is in a very active R&D phase, where currently three different detector concepts are being developed in international working groups. In order to investigate and optimize the different detector concepts and their physics potential it is highly desirable to have flexible and easy to use software tools. In this talk we present Marlin, a modular C++ application framework for the ILC. Marlin is based on the international data format LCIO and allows the distributed development of reconstruction and analysis software. Marlin is used throughout the European ILC community to develop reconstruction algorithms for the ILC based on the so-called Particle Flow paradigm. An illustrative code sketch follows this entry.
        Speaker: Frank Gaede (DESY)
        Slides
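        The skeleton below sketches a Marlin processor operating on LCIO events; it follows the Marlin Processor interface as commonly documented (init/processEvent/end plus a registered steering parameter), with names chosen for illustration, and exact signatures may differ between Marlin versions.

          // Skeleton Marlin processor (illustrative names). It is registered through
          // a global instance and handles one LCIO event per processEvent() call.
          #include "marlin/Processor.h"
          #include "lcio.h"
          #include "EVENT/LCEvent.h"
          #include <string>

          using namespace lcio;
          using namespace marlin;

          class HitCounterProcessor : public Processor {
          public:
            Processor* newProcessor() { return new HitCounterProcessor; }

            HitCounterProcessor() : Processor("HitCounterProcessor") {
              // steering-file parameter with a default value
              registerProcessorParameter("CollectionName",
                                         "Name of the input hit collection",
                                         _colName, std::string("TPCHits"));
            }

            void init() {}                         // called once at start-up
            void processEvent(LCEvent* evt) {      // called for every event
              // ... access evt->getCollection(_colName) and run reconstruction ...
            }
            void end() {}                          // called after the last event

          private:
            std::string _colName;
          };

          HitCounterProcessor aHitCounterProcessor; // self-registration with Marlin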
      • 212
        CBM Simulation and Analysis Framework
        The simulation and analysis framework of the CBM collaboration will be presented. CBM (Compressed Baryonic Matter) is an experiment at the future FAIR (Facility for Antiproton and Ion Research) in Darmstadt. The goal of the experiment is to explore the phase diagram of strongly interacting matter in high-energy nucleus-nucleus collisions. The Virtual Monte Carlo concept allows performing simulations using Geant3, Geant4 or Fluka without changing the user code. The same framework is then used for the data analysis. An Oracle database with built-in versioning management is used to efficiently store the detector geometry, materials and parameters.
        Speaker: Dr Denis Bertini (GSI Darmstadt)
        Paper
        Slides
      • 213
        FLUKA and the Virtual Monte Carlo
        The ALICE Offline Project has developed a virtual interface to the detector transport code called the Virtual Monte Carlo. It isolates the user code from changes of the detector simulation package and hence allows a seamless transition from GEANT3 to GEANT4 and FLUKA. Moreover, a new geometrical modeller has been developed in collaboration with the ROOT team and successfully interfaced to the three programs. This allows the use of a single geometry description, which can also be used during reconstruction and visualization. In this paper we present the current status of the Virtual Monte Carlo and in particular the FLUKA Virtual Monte Carlo implementation, its testing and application for detector response simulation and radiation studies. An illustrative code sketch follows this entry.
        Speaker: Andreas Morsch (CERN)
        Slides
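        The fragment below sketches how a geometry is described once with the ROOT geometrical modeller; the same in-memory description can then be handed to the transport engines through the Virtual Monte Carlo and reused for reconstruction and visualization. Materials, dimensions and names are placeholders.

          // Build a toy geometry with TGeo: one world box and one silicon layer.
          // The single description closed here can be shared by simulation,
          // reconstruction and event display.
          #include "TGeoManager.h"
          #include "TGeoMaterial.h"
          #include "TGeoMedium.h"
          #include "TGeoVolume.h"

          void buildGeometry()
          {
            new TGeoManager("detector", "toy detector geometry");

            TGeoMaterial* vacuum  = new TGeoMaterial("Vacuum", 0, 0, 0);
            TGeoMaterial* silicon = new TGeoMaterial("Si", 28.09, 14, 2.33);
            TGeoMedium* medVac = new TGeoMedium("Vacuum", 1, vacuum);
            TGeoMedium* medSi  = new TGeoMedium("Si", 2, silicon);

            // world volume and one sensitive layer placed inside it (cm units)
            TGeoVolume* world = gGeoManager->MakeBox("WORLD", medVac, 100., 100., 100.);
            TGeoVolume* layer = gGeoManager->MakeTube("LAYER", medSi, 10., 10.03, 50.);
            world->AddNode(layer, 1);

            gGeoManager->SetTopVolume(world);
            gGeoManager->CloseGeometry();
          }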
      • 214
        CMS Reconstruction Software
        The Reconstruction Software for the CMS detector is designed to serve multiple use cases, from the online triggering of the High Level Trigger to the offline analysis. The software is based on the CMS Framework, and comprises reconstruction modules which can be scheduled independently. These produce and store event data ranging from low-level objects to objects useful for analysis on reduced Data Sets, such as particle identification information or complex decay chains. The performance is presented, together with a roadmap leading up to real LHC physics.
        Speaker: Dr Tommaso Boccali (Scuola Normale Superiore and INFN Pisa)
        Slides
      • 215
        Track reconstruction algorithms for the ALICE High-Level Trigger
        An overview of the online reconstruction algorithms for the ALICE Time Projection Chamber and Inner Tracking System is given. Both the tracking efficiency and the time performance of the algorithms are presented in detail. The application of the tracking algorithms in possible high transverse momentum jet and open charm triggers is discussed.
        Speaker: Marian Ivanov (CERN)
        Slides
    • Grid Middleware and e-Infrastructure Operation: GMEO-4 Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 216
        The gLite File Transfer Service: Middleware Lessons Learned from the Service Challenges
        In this paper we report on the lessons learned from the Middleware point of view while running the gLite File Transfer Service (FTS) on the LCG Service Challenge 3 setup. The FTS has been designed based on the experience gathered from the Radiant service used in Service Challenge 2, as well as the CMS Phedex transfer service. The first implementation of the FTS was put to use in the beginning of the Summer 2005. We report in detail on the features that have been requested following this initial usage and the needs that the new features address. Most of these have already been implemented or are in the process of being finalized. There has been a need to improve the manageability aspect of the service in terms of supporting site and VO policies. Due to different implementations of specific Storage systems, the choice between 3rd party gsiftp transfers and SRM-copy transfers is nontrivial and was requested as a configurable option for selected transfer channels. The way the proxy certificates are being delegated to the service and are used to perform the transfer, as well as how proxy renewal is done has been completely reworked based on experience. A new interface has been added to enable administrators to perform Channel Management directly by contacting the FTS, without the need to restart the service. Another new interface has been added in order to deliver statistics and reports to the sites and VOs interested in useful monitoring information. This is also presented through a web interface using javascript. Stage pool handling for the FTS is being added in order to allow pre-staging of sources without blocking transfer slots on the source and also to allow the implementation of back-off strategies in case the remote staging areas start to fill up.
        Speaker: Mr Paolo Badino (CERN)
        Paper
        Slides
      • 217
        A Scalable Distributed Data Management System for ATLAS
        The ATLAS detector currently under construction at CERN's Large Hadron Collider presents data handling requirements of an unprecedented scale. From 2008 the ATLAS distributed data management (DDM) system must manage tens of petabytes of event data per year, distributed around the world: the collaboration comprises 1800 physicists participating from more than 150 universities and laboratories in 34 countries. The ATLAS DDM project was established in spring 2005 to develop the system, Don Quijote 2 (DQ2), drawing on operational experience from a previous generation of data management tools. The foremost design objective was to achieve the scalability, robustness and flexibility required to meet the data handling needs of the ATLAS Computing Model, from raw data archiving through global managed production and analysis to individual physics analysis at home institutes. The design layers a set of loosely coupled components over a foundation of basic file-handling Grid middleware; these components provide logical organization at the dataset (hierarchical, versioned file collections) level, supporting in a flexible and scalable way the data aggregations by which data is replicated, discovered and analyzed around the world. A combination of central services, distributed site services and agents handles data transfer, bookkeeping and monitoring. Implementation approaches were carefully chosen to meet performance and robustness requirements. Fast and lightweight REST-style web services knit together components which utilize, through standardized interfaces, cataloging and file movement tools chosen for their performance and maturity, with the expectation that choices will evolve over time. In this paper we motivate and describe the architecture of the system, its implementation, the current state of its deployment for production and analysis operations throughout ATLAS, and the work remaining to achieve readiness for data taking. An illustrative code sketch follows this entry.
        Speaker: Dr David Cameron (European Organization for Nuclear Research (CERN))
        Slides
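        The structures below are a deliberately simplified, invented sketch of the dataset abstraction described above, i.e. a named, versioned collection of files that also serves as the unit of replication and discovery; they are illustrative only and not DQ2 code.

          // Simplified data-model sketch of a versioned dataset (not DQ2 code).
          #include <string>
          #include <vector>

          struct FileEntry {
            std::string logicalName;   // logical file name known to the catalogs
            std::string guid;          // globally unique identifier of the file
            long long   sizeBytes;
          };

          struct DatasetVersion {
            int version;
            std::vector<FileEntry> files;   // the frozen file content of this version
          };

          struct Dataset {
            std::string name;                       // unit of replication and discovery
            std::vector<DatasetVersion> versions;   // versioned history of the content
            std::vector<std::string> replicaSites;  // sites holding (part of) a replica
          };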
      • 218
        Enabling Grid features in dCache
        The dCache collaboration actively works on the implementation and improvement of the features and the grid support of dCache storage. It has delivered a Storage Resource Manager (SRM) interface, a GridFtp server, a Resilient Manager and interactive web monitoring tools. SRMs are middleware components whose function is to provide dynamic space allocation and file management of shared storage components on the Grid. SRMs support protocol negotiation and a reliable replication mechanism. The SRM standard allows independent institutions to implement their own SRMs, thus allowing for uniform access to heterogeneous storage elements. Fermilab has implemented the SRM interface v1.1 for dCache and now actively pursues an SRM v2.1 implementation. GridFtp is a standard grid data transfer protocol, which supports GSI authentication, control channel encryption, data channel authentication, and extended block mode parallel transfers. We have implemented and continue to improve the GridFtp server in dCache. The Resilient Manager is a top-level service in dCache, which is used to create more reliable and highly available data storage out of the commodity disk nodes which are part of dCache. New interactive web-based monitoring tools will dramatically improve the understanding of a complex distributed system like dCache by both administrators and users. The monitoring includes a variety of plots and bar diagrams graphically representing the state and history of the dCache system. I will discuss these features and the benefits they bring to the Grid storage community.
        Speaker: Timur Perelmutov (FERMI NATIONAL ACCELERATOR LABORATORY)
        Slides
      • 219
        BNL Wide Area Data Transfer for RHIC and ATLAS: Experience and Plan
        We describe two illustrative cases in which Grid middleware (GridFtp, dCache and SRM) was used successfully to transfer hundreds of terabytes of data between BNL and its remote RHIC and ATLAS collaborators. The first case involved PHENIX production data transfers to CCJ, a regional center in Japan, during the 2005 RHIC run. Approximately 270 TB of data, representing 6.8 billion polarized proton-proton collisions, was transferred to CCJ using GridFtp tools. The local network was reconfigured and tuned to route data directly from the online data acquisition system to the BNL public network, thus avoiding the use of tape storage as an intermediate buffer and preserving the scarce resource of tape I/O bandwidth. A transfer speed of 60 MB/s was achieved around the clock, sufficient to keep up with the incoming data stream from the detector. The second case involved transfers between the ATLAS Tier 1 center at BNL and both CERN and the US ATLAS Tier 2 centers, as part of the ATLAS Service Challenge (SC). This demanded even larger data transfer rates, with the goal of validating the current computing model. We were able to demonstrate 150 MB/s wide area data transfer rates using the SC infrastructure with our dCache configuration. We describe the deployment of the major components of this infrastructure, including the ATLAS Distributed Data Management System, the File Transfer Service and dCache/SRM and its connection to the mass storage system. The operational model and various monitoring tools are also described. These exercises demonstrated the current level of maturity of Grid tools being used by large physics experiments to satisfy their data distribution requirements. Future work will focus on applying this dCache/SC experience to large scale RHIC data transfers and improving the stability and performance of data transfers as the BNL backbone is upgraded to multiple 10 Gbps bandwidth.
        Speakers: Dr Dantong Yu (BROOKHAVEN NATIONAL LABORATORY), Dr Xin Zhao (BROOKHAVEN NATIONAL LABORATORY)
        Paper
        Slides
      • 220
        Project DASH: Securing Direct MySQL Database Access for the Grid
        High energy and nuclear physics applications on computational grids require efficient access to terabytes of data managed in relational databases. Databases also play a critical role in grid middleware: file catalogues, monitoring, etc. Crosscutting the computational grid infrastructure, a hyperinfrastructure of the databases emerges. The Database Access for Secure Hyperinfrastructure (DASH) project develops secure high-performance database access technology for distributed computing. To overcome database access inefficiencies inherent in a traditional middleware approach the DASH project implements secure authorization on the transport level. Pushing the grid authorization into the database engine eliminates the middleware message-level security layer and delivers transport-level efficiency of SSL/TLS protocols for grid applications. The database architecture with embedded grid authorization provides a foundation for secure end-to-end data processing solutions for the experiments. To avoid a brittle, monolithic system DASH uses an aspect-oriented programming approach. By localizing Globus security concerns in a software aspect, DASH achieves a clean separation of Globus Grid Security Infrastructure dependencies from the MySQL server code. During the database server build, the AspectC++ tool automatically generates the transport-level code to support a grid security infrastructure. The DASH proof-of-concept prototype provides Globus grid proxy certificate authorization technologies for MySQL database access control. Direct access to database servers unleashes a broad range of MySQL server functionalities for HENP data processing applications: binary data transport, XA transactions, etc. Prototype servers built with DASH technology are being tested in ANL, BNL, and CERN. To provide on-demand database services capability for Open Science Grid, the Edge Services Framework activity builds the DASH mysql-gsi database server into the virtual machine image, which is dynamically deployed via Globus Virtual Workspaces. The DASH project is funded by the U.S. DOE Small Business Innovative Research Program.
        Speaker: Dr Alexandre Vaniachine (ANL)
        Paper
        Slides
      • 221
        The AMGA Metadata Service in gLite
        We present the AMGA (ARDA Metadata Grid Application) metadata catalog, which is a part of the gLite middleware. AMGA provides a very lightweight metadata service as well as basic database access functionality on the Grid. Following a brief overview of the AMGA design, functionality, implementation and security features, we will show performance comparisons of AMGA with direct database access as well as with other Grid catalog services. Finally the replication features of AMGA are presented and a comparison with proprietary database replication solutions is shown. A series of examples of usage from HEP and other communities is also presented.
        Speaker: Dr Birger Koblitz (CERN)
        Paper
        Slides
    • Online Computing: OC-4 B333

      B333

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 222
        Physical Study of the BES3 Trigger System
        Physical studies form the basis of the hardware design of the BES3 trigger system. They include detector simulations, generation and optimization of the sub-detectors' trigger conditions, main trigger simulations (combining the trigger conditions from the different sub-detectors to determine the trigger efficiencies for physics events and the rejection factors for background events) and considerations of hardware implementation feasibility. The detector simulations are based on GEANT3; the rest is based on purpose-written software in which the MDC (Main Drift Chamber), EMC (Electromagnetic Calorimeter), TOF (Time Of Flight) and main trigger sub-triggers are included. The schemes, procedure and typical results of the trigger system study will be introduced in some detail (a toy illustration of combining trigger conditions follows this entry). From the physical studies we have determined the preliminary trigger tables and the corresponding trigger efficiencies for physics events and rejection factors for background events, as well as the hardware schemes of the whole trigger system and all of the sub-trigger systems. All are going forward smoothly. A brief introduction to the status of the hardware design of the trigger system will be presented, followed by a summary.
        Speaker: Dr Da-Peng JIN (IHEP (Institute of High Energy Physics, Beijing, China))
        Paper
        Slides
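        The toy Python sketch below illustrates the kind of bookkeeping described in the abstract above: per-sub-detector trigger conditions are combined through a small trigger table, and the signal efficiency and background rejection factor are counted on simulated events. The condition names, rates and table are invented for illustration and are unrelated to the actual BES3 trigger tables.

        # Illustrative sketch (not BES3 code): combine per-sub-detector trigger
        # conditions via a simple trigger table and count efficiency / rejection.
        import random

        def toy_event(is_signal: bool) -> dict:
            """Generate toy trigger conditions firing at different rates for signal and background."""
            p = 0.9 if is_signal else 0.2
            return {
                "MDC_Ntrk>=2": random.random() < p,
                "TOF_back_to_back": random.random() < p,
                "EMC_Etot_high": random.random() < p,
            }

        # A trigger "table": each line is an AND of conditions; the event passes if any line fires.
        TRIGGER_TABLE = [
            ("MDC_Ntrk>=2", "TOF_back_to_back"),
            ("EMC_Etot_high",),
        ]

        def passes(event: dict) -> bool:
            return any(all(event[c] for c in line) for line in TRIGGER_TABLE)

        signal = [toy_event(True) for _ in range(100_000)]
        background = [toy_event(False) for _ in range(100_000)]

        eff = sum(passes(e) for e in signal) / len(signal)
        bkg_pass = sum(passes(e) for e in background) / len(background)
        rejection = 1.0 / bkg_pass if bkg_pass else float("inf")
        print(f"signal efficiency ~ {eff:.3f}, background rejection factor ~ {rejection:.1f}")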
      • 223
        A configuration system for the ATLAS trigger
        The ATLAS detector at CERN's LHC will be exposed to proton-proton collisions at a nominal rate of 1 GHz from beams crossing at 40 MHz. A three-level trigger system will select potentially interesting events in order to reduce this rate to about 200 Hz. The first trigger level is implemented in custom-built electronics and firmware, whereas the higher trigger levels are based on software. A system for the configuration of the complete trigger chain is being constructed. It is designed to deliver all relevant configuration information - e.g. physics selection criteria like transverse momentum thresholds encoded in settings for level-1 hardware registers, parameter values of high-level trigger selection algorithms, algorithm versions used or prescale factors. The system should be easy to operate. It must store the history of each trigger setup for later analysis. The same system will be used to configure the offline trigger simulation. In this presentation an overview of the ATLAS trigger configuration system is given including the underlying relational database, and the tools for populating and accessing configuration information.
        Speakers: Hans von der Schmitt (MPI for Physics, Munich), Hans von der Schmitt (ATLAS)
        Slides
      • 224
        Steering the ATLAS High Level Trigger
        This paper describes an analysis and conceptual design for the steering of the ATLAS High Level Trigger (HLT). The steering is the framework that organises the event selection software. It implements the key event selection strategies of the ATLAS trigger, which are designed to minimise processing time and data transfers: reconstruction within regions of interest, menu-driven selection and fast rejection. This analysis also considers the needs of online trigger operation and offline data analysis. The design addresses both the static configuration and dynamic steering of event selection. Trigger menus describe the signatures required to accept an event at each stage of processing. The signatures are arranged in chains through these steps to ensure coherent selection. The event processing is broken into a series of sequential steps. At each step the steering will call the algorithms needed according to the valid signatures from the previous step, the existing data and the signatures that it should attempt to validate for the next decision. After each step the event can be rejected if it no longer satisfies any signatures. The same steering software runs in both offline and online software environments, so the impact of the HLT on physics analysis can be directly assessed. (A conceptual sketch of this stepwise selection follows this entry.)
        Speaker: Mr Gianluca Comune (Michigan State University)
        Paper
        Slides
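        The following Python sketch illustrates, under simplifying assumptions, the stepwise, menu-driven selection with fast rejection described above: signatures are arranged in chains, only the steps of still-valid chains are evaluated, and the event is rejected as soon as no chain survives. Chain names, quantities and thresholds are invented and do not reflect the actual ATLAS trigger menu.

        # Conceptual sketch of menu-driven, stepwise selection with early rejection.
        # Chain and quantity names are invented; this is not the ATLAS steering code.
        MENU = {
            "e25i": [("em_cluster_et", 25.0), ("matched_track_pt", 20.0)],
            "mu20": [("muon_pt", 20.0)],
        }

        def steer(event: dict) -> bool:
            """Accept the event if at least one chain validates all of its steps."""
            active = dict(MENU)
            step = 0
            while active:
                survivors = {}
                for chain, steps in active.items():
                    if step >= len(steps):              # chain fully validated
                        return True
                    quantity, threshold = steps[step]
                    # Stand-in for running the feature-extraction algorithm for this step.
                    if event.get(quantity, 0.0) >= threshold:
                        survivors[chain] = steps
                if not survivors:
                    return False                        # fast rejection: no signature survives
                active = survivors
                step += 1
            return False

        print(steer({"em_cluster_et": 30.0, "matched_track_pt": 25.0}))   # accepted
        print(steer({"muon_pt": 5.0, "em_cluster_et": 10.0}))             # rejected at step 0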
      • 225
        ATLAS High Level Trigger Infrastructure, RoI Collection and EventBuilding
        The ATLAS experiment at the LHC will start taking data in 2007. Event data from proton-proton collisions will be selected in a three-level trigger system which reduces the initial bunch crossing rate of 40 MHz at its first level trigger (LVL1) to 75 kHz with a fixed latency of 2.5 μs. The second level trigger (LVL2) collects and analyses Regions of Interest (RoI) identified by LVL1 and reduces the event rate further to ~3 kHz with an average latency of tens of ms. Subsequently the EventBuilder collects the data from all readout systems and provides fully assembled events to the Event Filter (EF), which is the third level trigger. The EF analyzes the entirety of the event data to achieve a further rate reduction to ~200 Hz, with a latency of a few seconds. While LVL1 is based on custom hardware, LVL2, the EventBuilder and the EF are based on compute farms of O(3000) PCs, interconnected via Gigabit Ethernet, running Linux and multi-threaded software applications implemented in C++. This note focuses on the common design and implementation of the High Level Trigger Infrastructure, RoI Collection and the EventBuilding. Both LVL2 and the EF (collectively called the High Level Trigger) use online software for the control and data collection aspects, but the actual trigger selection is developed and tested using the offline software environment. A common approach of the LVL2 Processing Unit and the EF Processing Task for the steering, seeding and sequential processing of the selection algorithms has been developed. Significant improvements and generalization of the system design allow for complex data flow functionality steered by the results of the event selection at LVL2 and EF. This allows the handling of events in parallel data streams for physics, calibration, monitoring or debug purposes. Support for event duplication, partial event building and data stripping is currently under development. Insight into special features of the system, such as load balancing of the various compute farms, traffic shaping and garbage collection, will also be given. The HLT and EventBuilder are being integrated with the LVL1 trigger and the ATLAS subdetectors and will be operated with cosmic events as part of the commissioning in the 2nd half of 2006.
        Speaker: Kostas Kordas (Laboratori Nazionali di Frascati (LNF))
        Slides
    • Software Components and Libraries: SCL-4 AG 69

      AG 69

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 226
        New Developments of ROOT Mathematical Software Libraries
        LHC experiments obtain needed mathematical and statistical computational methods via the coherent set of C++ libraries provided by the Math work package of the ROOT project. We present recent developments of this work package, formed from the merge of the ROOT and SEAL activities: (1) MathCore, a new core library, has been developed as a self-contained component encompassing basic mathematical functionality: special and mathematical functions used in statistics, random number generators, physics vectors, and numerical algorithms such as integration and differentiation. (2) The new MathMore library provides a complementary and expanded set of C++ mathematical functions and algorithms, some of them based on existing mathematical libraries such as the GNU Scientific Library. Wrappers to this library are written in C++ and integrated in a coherent object-oriented framework. (3) The fitting libraries of ROOT have been extended by integrating the new C++ version of MINUIT and adding the linear and robust fitters. We will provide an overview of the ROOT Mathematical libraries and will describe in detail the functionality and design of these packages recently introduced in ROOT. We will also describe the planned improvements and redesign of old ROOT classes to use the new facilities provided by MathCore and MathMore. (An illustrative analogue of this functionality follows this entry.)
        Speaker: Dr Lorenzo Moneta (CERN)
        Paper
        Slides
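        As a rough analogue of the functionality described above (special functions used in statistics, random numbers, numerical integration and fitting), the sketch below uses NumPy/SciPy rather than the ROOT MathCore/MathMore/MINUIT C++ APIs; it is illustrative only and does not reproduce the ROOT interfaces.

        # Not ROOT: an analogous illustration with NumPy/SciPy of special functions,
        # random number generation, numerical integration and a simple fit.
        import numpy as np
        from scipy import special, integrate, optimize

        # Special function used in statistics: the regularized incomplete gamma,
        # i.e. the chi-square CDF for ndf degrees of freedom.
        chi2, ndf = 12.3, 10
        p_value = 1.0 - special.gammainc(ndf / 2.0, chi2 / 2.0)

        # Numerical integration of a Gaussian density over [-1, 1].
        area, _ = integrate.quad(lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi), -1.0, 1.0)

        # Random numbers and a least-squares fit of a straight line.
        rng = np.random.default_rng(42)
        x = np.linspace(0.0, 10.0, 50)
        y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, size=x.size)
        popt, _ = optimize.curve_fit(lambda x, a, b: a * x + b, x, y)

        print(f"p-value={p_value:.3f}  area={area:.3f}  slope={popt[0]:.2f}  intercept={popt[1]:.2f}")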
      • 227
        The Phystat Repository For Physics Statistics Code
        We have initiated a repository of tools, software, and technique documentation for techniques used in HEP and related physics disciplines, which are related to statistics. Fermilab is to assume custodial responsibility for the operation of this Phystat repository, which will be in the nature of an open archival repository. Submissions of appropriate packages, papers, modules and code fragments will be from (and be available to) the HEP community. A preliminary version of this repository is at Phystat.org. Details of the purposes and organization of the repository are presented.
        Speaker: Philippe Canal (FNAL)
        Paper
        Slides
      • 228
        Evaluation of the power of Goodness-of-Fit tests for the comparison of data distributions
        Many Goodness-of-Fit tests have been collected in a new open-source Statistical Toolkit: Chi-squared, Kolmogorov-Smirnov, Goodman, Kuiper, Cramer-von Mises, Anderson-Darling, Tiku, Watson, as well as novel weighted formulations of some tests. No single Goodness-of-Fit test included in the toolkit is optimal for every analysis case. Statistics does not provide a universal recipe to identify the most appropriate test to compare two distributions; the limited available guidelines derive from relative power comparisons of samples drawn from smooth theoretical distributions. A comprehensive study has been performed to provide general guidelines for the practical choice of the most suitable Goodness-of-Fit test under general non-parametric conditions. Quantitative comparisons among the two-sample Goodness-of-Fit tests contained in the Goodness-of-Fit Statistical Toolkit are presented. This study is the most complete and general approach so far available to characterize the power of goodness-of-fit tests for the comparison of two data distributions; it provides guidance to the user to identify the most appropriate test for his/her analysis on an objective basis. (A simple two-sample test illustration follows this entry.)
        Speakers: Dr Alberto Ribon (CERN), Dr Andreas Pfeiffer (CERN), Dr Barbara Mascialino (INFN Genova), Dr Maria Grazia Pia (INFN GENOVA), Dr Paolo Viarengo (IST Genova)
        Slides
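        As a simple illustration of the kind of two-sample comparison studied above, the sketch below applies a two-sample Kolmogorov-Smirnov test with SciPy to two toy samples; it uses SciPy rather than the Goodness-of-Fit Statistical Toolkit itself, and the samples are arbitrary.

        # Illustration only: a two-sample Kolmogorov-Smirnov comparison with SciPy.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(0)
        sample_a = rng.normal(loc=0.0, scale=1.0, size=500)    # e.g. reference distribution
        sample_b = rng.normal(loc=0.1, scale=1.0, size=500)    # e.g. test distribution

        statistic, p_value = stats.ks_2samp(sample_a, sample_b)
        print(f"KS statistic = {statistic:.3f}, p-value = {p_value:.3f}")
        # A small p-value suggests the two samples do not come from the same parent
        # distribution; how reliably a given test detects such differences (its power)
        # is the subject of the study described above.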
      • 229
        StatPatternRecognition: A C++ Package for Multivariate Classification of High Energy Physics Data
        Modern analysis of high energy physics (HEP) data needs advanced statistical tools to separate signal from background. A C++ package has been implemented to provide such tools for the HEP community. The package includes linear and quadratic discriminant analysis, decision trees, bump hunting (PRIM), boosting (AdaBoost), bagging and random forest algorithms, and interfaces to the feedforward backpropagation neural net and radial basis function neural net implemented in the Stuttgart Neural Network Simulator. Supplemental tools such as random number generators, bootstrap, estimation of data moments, and a test of zero correlation between two variables with a joint elliptical distribution are also provided. Input data can be read from ASCII and ROOT files. The package offers a convenient set of tools for imposing requirements on input data and displaying output. Integrated in the BaBar computing environment, the package maintains a minimal set of BaBar dependencies and can be easily adapted to any other HEP environment. It has been tested at BaBar on several physics-analysis datasets. (An analogous boosted-decision-tree illustration follows this entry.)
        Speakers: Dr Ilya Narsky (California Institute of Technology), Mr Julian Bunn (CALTECH), Dr Julian Bunn (CALTECH), Julian Bunn (California Institute of Technology (CALTECH))
        Paper
        Slides
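        The sketch below shows an analogous signal/background classification with boosted decision trees on a toy dataset, using scikit-learn rather than the StatPatternRecognition package described above; the dataset and settings are arbitrary.

        # Not StatPatternRecognition: an analogous boosted-decision-tree classification
        # of a toy two-variable signal/background sample with scikit-learn.
        import numpy as np
        from sklearn.ensemble import AdaBoostClassifier
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(1)
        n = 5000
        signal = rng.normal(loc=[1.0, 1.0], scale=1.0, size=(n, 2))
        background = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n, 2))
        X = np.vstack([signal, background])
        y = np.concatenate([np.ones(n), np.zeros(n)])

        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
        clf = AdaBoostClassifier(n_estimators=100)   # boosted decision stumps by default
        clf.fit(X_train, y_train)
        print(f"test accuracy: {clf.score(X_test, y_test):.3f}")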
      • 230
        Operations research and high energy physics
        In the last few decades operations research has made dramatic progress in providing efficient algorithms and fast software implementations to solve practical problems related to a wide range of disciplines, from logistics to finance, from political sciences to digital image analysis. After a brief introduction to the most used techniques, such as linear and mixed-integer programming, I will show how some of these algorithms could find interesting applications in high energy physics, where they could provide alternative solutions to problems related to pattern recognition, track fitting, detector design, detector calibration or detector alignment.
        Speaker: Dr Alberto De Min (Politecnico di Milano)
        Slides
      • 231
        A Kalman Filter for Track-based Alignment
        The Inner Tracker of the CMS experiment consists of approximately 20,000 sensitive modules in order to cope with the bunch crossing rate and the high particle multiplicity expected in the environment of the Large Hadron Collider. For such a large number of modules, conventional methods for track-based alignment face serious difficulties because of the large number of alignment parameters and the huge matrices that are involved in the estimation process. For this reason we propose an iterative (track-by-track) method for track-based global alignment. It is derived from the Kalman filter and does not require inversions of large matrices. The update formulas for the alignment parameters and for the associated covariance matrix are presented (a simplified sketch of the Kalman update step follows this entry). We discuss the implementation and the computational complexity and show how to limit the latter to an acceptable level. The performance of the method with respect to precision and speed of convergence is studied in a simplified setup. Scenarios closer to the CMS experimental setup are studied using a first implementation within the CMS reconstruction framework ORCA. Results for the barrel part of the CMS Inner Tracker under these more realistic circumstances are presented.
        Speaker: Edmund Erich Widl (Institute for High Energy Physics, Vienna)
        Paper
        Slides
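        A simplified sketch of a single Kalman-filter measurement update is given below. It illustrates why the track-by-track approach avoids inversions of large matrices (only a matrix of the small measurement dimension is inverted), but the matrices, dimensions and noise values are toy inputs and not the actual CMS alignment formulation.

        # Simplified sketch of a Kalman-filter measurement update, as used track-by-track:
        # only the (small) innovation covariance is inverted, never the full parameter covariance.
        import numpy as np

        def kalman_update(a, P, m, H, V):
            """Update alignment parameters `a` (covariance P) with a measurement m = H a + noise(V)."""
            S = H @ P @ H.T + V                    # innovation covariance (dimension of m)
            K = P @ H.T @ np.linalg.inv(S)         # gain
            a_new = a + K @ (m - H @ a)
            P_new = (np.eye(len(a)) - K @ H) @ P
            return a_new, P_new

        # Toy example: 6 alignment parameters, 2-dimensional measurement per track.
        rng = np.random.default_rng(2)
        a = np.zeros(6)
        P = np.eye(6) * 1e-2
        H = rng.normal(size=(2, 6))
        V = np.eye(2) * 1e-4
        m = H @ np.full(6, 1e-3) + rng.normal(0.0, 1e-2, size=2)

        a, P = kalman_update(a, P, m, H, V)
        print(a)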
    • Software Tools and Information Systems: STIS-4 AG 77

      AG 77

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 232
        Organization and Management of ATLAS Software Releases
        ATLAS is one of the largest collaborations ever attempted in the physical sciences. This paper explains how the software infrastructure is organized to manage collaborative code development by around 200 developers with varying degrees of expertise, situated in 30 different countries. We will describe how succeeding releases of the software are built, validated and subsequently deployed to remote sites. Documentation will also be discussed. Several software management tools have been used, the majority of which are not ATLAS specific; we will show how they have been integrated. ATLAS offline software currently consists of about 2 MSLOC contained in 6800 C++ classes, organized in almost 1000 packages.
        Speaker: Dr Frederick Luehring (Indiana University)
        Paper
        Slides
      • 233
        Physics-level Job Configuration
        The offline and high-level trigger software for the ATLAS experiment has now fully migrated to a scheme which allows large tasks to be broken down into many functionally independent components. These components can focus, for example, on conditions or physics data access, on purely mathematical or combinatorial algorithms, or on providing detector-specific geometry and calibration information. In addition to other advantages, the software components can be heavily re-used at different levels (sub-detector tasks, event reconstruction, physics analysis) and under different running conditions (LHC data, trigger regions, cosmics data) with only minor adaptations. A default setting therefore has to be provided for each component allowing these adaptations to be made. End-user jobs contain many of these small components, most of which the end-user is entirely unaware of. There is therefore a large semantic gap between how the end-user thinks about a specific job's configuration and how the configuration is packaged with the individual components making up the job. This paper presents a partly automated system which allows component developers and aggregators to build a configuration ranging over all the above levels, such that e.g. component developers can use a low-level configuration, sub-detector coordinators work with functional sequences and the end user can think in physics processes (a toy sketch of this layering follows this entry). This system of Python-based job configurations is flexible but easy to keep internally consistent and avoids possible clashes when a component is re-used in a different context. The paper also presents a working system used to configure the new ATLAS track reconstruction software.
        Speaker: Wim Lavrijsen (LBNL)
        Paper
        Slides
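        The toy Python sketch below illustrates the layering idea described above: component developers provide defaults, sub-detector coordinators assemble functional sequences with limited overrides, and the end user selects a physics-level configuration. All class, option and function names are invented; this is not the ATLAS configuration code.

        # Hypothetical sketch of layered, Python-based job configuration.
        class Component:
            """A configurable component with developer-provided defaults."""
            defaults = {}
            def __init__(self, **overrides):
                unknown = set(overrides) - set(self.defaults)
                if unknown:
                    raise KeyError(f"unknown options: {unknown}")   # catch clashes early
                self.config = {**self.defaults, **overrides}

        class TrackFinder(Component):
            defaults = {"min_pt_GeV": 0.5, "field_map": "nominal", "seeding": "space-points"}

        def cosmics_sequence():
            """Coordinator level: a functional sequence tuned for cosmics data."""
            return [TrackFinder(field_map="off", min_pt_GeV=0.0)]

        def physics_job(process: str):
            """End-user level: think in physics processes, not individual components."""
            if process == "cosmics":
                return cosmics_sequence()
            return [TrackFinder()]                                   # nominal configuration

        for component in physics_job("cosmics"):
            print(type(component).__name__, component.config)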
      • 234
        The Development and Release Process for the CMS Software Project
        Releasing software for projects with large code bases is a challenging task. When developers are geographically dispersed, often in different time zones, coordination can be difficult. A successful release strategy is therefore paramount and clear guidelines for all the stages of software development are required. The CMS experiment recently started a major refactorization of its simulation, reconstruction and analysis software. At the same time, we revised our software development cycle to improve on release management, build management, distribution management and proper quality assurance via unit, regression and validation tests. In this paper we will report on the lessons learned from our previous experience and on how we are improving in the new project.
        Speaker: Stefano Argiro (European Organization for Nuclear Research (CERN))
        Paper
        Slides
      • 235
        CMS Software Distribution on the LCG and OSG Grids
        Packaging and distribution of experiment-specific software becomes a complicated task when the number of versions and external dependencies increases. With the advent of Grid computing, the distribution and update process must become a simple, robust and transparent step. Furthermore, one must take into account that running a particular application requires setup of the appropriate environment. In addition, the possibility to monitor the status of the experiment software on Grid sites is an important requirement. In this paper we describe the strategy used by CMS to create, distribute, install and monitor the status of the package bundle needed to run production and analysis application on remote sites, with particular emphasis on the approach to Grid computing. We discuss the further steps that are required to make the procedure more robust.
        Speaker: klaus rabbertz (Karlsruhe University)
        Paper
        Slides
      • 236
        CMS Software Packaging and Distribution Tools
        We describe the various tools used by CMS to create and manage the packaging and distribution of software, including the various CMS software packages and the external components upon which CMS software depends. It is crucial to manage the environment to ensure that the configuration is correct, consistent, and reproducible at the many computing centres running CMS software. We describe the tools used to generate distributable software packages, to track the dependencies between packages and their versions, and to manage their distribution and installation on Tier-0, Tier-1, Tier-2 and other computing centres worldwide.
        Speakers: Andreas Nowack (Aachen University), Klaus Rabbertz (Karlsruhe University)
        Paper
        Slides
    • Plenary: Plenary 5 Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India

      Plenary Session

      Convener: Mirco Mazzucato (INFN - Padova)
      • 237
        Advanced networking technologies and HEP in future Auditorium

        Auditorium

        Tata Institute of Fundamental Research

        Homi Bhabha Road Mumbai 400005 India
        Speaker: Prof. Harvey Newman (CalTech)
        Slides
      • 238
        The LHC Computing Grid Service Auditorium

        Auditorium

        Tata Institute of Fundamental Research

        Homi Bhabha Road Mumbai 400005 India
        Speaker: Les Robertson (CERN)
        Slides
      • 239
        Grids - Collaborations and Gateways Auditorium

        Auditorium

        Tata Institute of Fundamental Research

        Homi Bhabha Road Mumbai 400005 India
        Speaker: Ruth Pordes (Fermi National Accelerator Laboratory (FNAL))
        Slides
    • Poster: Poster 2
      • 240
        A Distributed File Catalog based on Database Replication
        The LHC experiments at CERN will collect data at a rate of several petabytes per year and produce several hundred files per second. Data have to be processed and transferred to many tier centres for distributed data analysis in different physics data formats, increasing the number of files to handle. All these files must be accounted for and reliably and securely tracked in a Grid environment, enabling users to analyze subsets of files in a transparent way. The talk describes a distributed file catalogue that gives consideration to the distributed nature of these requirements. In a Grid environment there is, on the one hand, a need for a centralized view of all existing files for job scheduling; on the other hand, each site should be able, for performance reasons, to access files autonomously without the need for centralized services. The proposed solution meets the need for a local and global operation mode of a file catalogue. Commands can be executed autonomously in a local catalogue branch or across all of them. The catalogue implements a file-system-like view of a logical name space, user-defined metadata with schema evolution, access control lists and common POSIX user/group file permissions. Architecture, interface functionalities, performance tests and very promising results in comparison to other existing Grid catalogues will be presented.
        Speaker: Andreas Joachim Peters (CERN)
        Paper
      • 241
        A local batch system abstraction layer for global use
        In current, widely deployed management schemes, compute-intensive farms are locally managed by batch systems (e.g. Platform LSF, PBS/Torque, BQS, etc.). When approached from the outside, at the global (or 'grid') level, these local resource managers (LRMS) are seen as services providing at least a basic set of job operations, namely submission, status retrieval, cancellation and security credential renewal. The Batch-system Local ASCII Helper Protocol (BLAHP) was designed to offer a simple abstraction layer over the different LRMS, providing uniform access to the underlying computing resources (a conceptual sketch of such an abstraction follows this entry). In order to preserve the simplicity and portability of the scheme and the robustness of the implementation, the functionality in the abstraction had to be carefully limited. In this paper we briefly describe the BLAH protocol and daemon design, focusing on the design and deployment considerations leading to the chosen abstraction. The daemon, originally developed for the EGEE gLite Condor-based Computing Element, is going to be used by Condor also outside the gLite framework. It is also a component of CREAM, the Web Services oriented Computing Element for gLite.
        Speaker: Mr Davide Rebatto (INFN - MILANO)
        Poster
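        The sketch below illustrates the abstraction idea in Python: a minimal, uniform set of operations (submit, status, cancel, credential renewal) implemented by LRMS-specific back-ends. Class and method names are invented and do not reproduce the BLAH protocol or daemon.

        # Conceptual sketch of an LRMS abstraction layer with a minimal operation set.
        from abc import ABC, abstractmethod

        class BatchSystem(ABC):
            @abstractmethod
            def submit(self, job_description: dict) -> str: ...
            @abstractmethod
            def status(self, job_id: str) -> str: ...
            @abstractmethod
            def cancel(self, job_id: str) -> None: ...
            @abstractmethod
            def renew_proxy(self, job_id: str, proxy_path: str) -> None: ...

        class TorqueBackend(BatchSystem):
            def submit(self, job_description: dict) -> str:
                # A real back-end would invoke the LRMS (e.g. qsub) and return its job id.
                return "12345.torque.example.org"
            def status(self, job_id: str) -> str:
                return "RUNNING"
            def cancel(self, job_id: str) -> None:
                pass
            def renew_proxy(self, job_id: str, proxy_path: str) -> None:
                pass

        def grid_layer(backend: BatchSystem) -> None:
            """The grid level sees only the uniform interface, never the LRMS specifics."""
            job_id = backend.submit({"executable": "/bin/hostname"})
            print(job_id, backend.status(job_id))

        grid_layer(TorqueBackend())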
      • 242
        A Parallel Computing Framework and a Modular Collaborative CFD Workbench in JAVA
        This paper addresses the growing use of high performance computing in modern computational fluid dynamics to simulate the flow-induced vibrations of cylindrical structures, a study necessary to enhance reactor safety in nuclear plants. The study is essential to prevent damage to steam tubes, which could cause an accident through the release of reactor coolant containing radioactive materials out of the reactor system. After the steam generator (SG) tube rupture due to flow-induced vibration at the Mihama Nuclear Power Station Unit II, measures to prevent the recurrence of such an accident have been adopted. In this area of research, computational efficiency is a major concern. The aim of this paper is to develop means for writing parallel programs and to transform shared-memory/sequential programs into distributed programs in an object-oriented environment, thus helping programmers to develop parallel CFD codes for flow-induced vibration problems in nuclear reactors more quickly and efficiently, and thereby helping to prevent damage, shut-downs and accidents. In this approach, the programmer controls the distribution of programs through control and data distribution. The authors have defined and implemented a parallel framework, including the expression of object distributions, and the transformations required to run a parallel program in a distributed environment. The authors provide programmers with a unified way to express parallelism and distribution by the use of classes or packages storing active and passive objects. The distribution of classes/packages leads to the distribution of their elements and therefore to the generation of distributed programs. The authors have developed a full prototype to write parallel programs and to transform those programs into distributed programs with a host of about 12 functions. This prototype has been implemented in the Java language and does not require any language extension or modification to the standard Java environment. The parallel framework is exercised through a CFD workbench equipped with high-end FEM unstructured mesh generation and flow-solving tools, enabling easy analysis of fluid-induced vibrations of circular cylindrical tubes as well as other types of structures, with an easy-to-use GUI implemented entirely on the parallel framework.
        Speaker: Mr Sankhadip Sengupta (Undergraduate student,Aerospace Engineering,IIT Kharagpur,Kharagpur,India)
        Paper
      • 243
        An integrated framework for VO-oriented authorization, policy-based management and accounting.
        One of the most interesting challenges of the 'computing Grid' is how to administer grid resource allocation and data access in order to obtain effective, optimized computing usage and secure data access. To reach this goal, a new entity has appeared, the Virtual Organization (VO), which represents a distributed community of users accessing a distributed computing environment. This new concept has affected all the proposed models for administering authentication, authorization policies and accounting, and the VO name has already become an attribute of the user certificate traveling in the grid. This paper describes the architecture of an integrated framework, based on the Virtual Organization Membership Service (VOMS), the Grid-Policy Box (G-PBox) and the Distributed Grid Accounting System (DGAS), which provide, respectively, authentication, policy-based authorization and credit-based accounting, and describes how they are managed by the VOs. It shows how a VO can dynamically build groups, assign roles, associate policies and credits with each group and role, and implement agreements with the resource owners. It then describes how these systems can be integrated into a real grid (gLite/LCG) and how they are used by the Workload Management System (WMS) operating in EGEE. This integrated framework shows a VO-based approach to authorization, policy and accounting as an effective and efficient way to use the Grid. VO-specific use cases will be described.
        Speaker: Mr Gian Luca Rubini (INFN-CNAF)
        Paper
        Poster
      • 244
        Application of the ATLAS DAQ and monitoring system for MDT and RPC commissioning
        The ATLAS DAQ and monitoring software are currently commonly used to test detectors during the commissioning phase. In this paper, their usage in MDT and RPC commissioning is described, both at the surface pre-commissioning and commissioning stations and in the ATLAS pit. Two main components are heavily used for detector tests. The ROD Crate DAQ software is based on the ATLAS ReadOut application. Based on the plug-in mechanism, it provides a complete environment to interface any kind of detector or trigger electronics to the ATLAS DAQ system. All the possible flavors of this application are used to test and run the MDT and RPC detectors at the pre-commissioning and commissioning sites. Ad-hoc plug-ins have been developed to implement data readout via VME, both with ROD prototypes and emulating final electronics to read out data with temporary solutions, and to provide trigger distribution and busy management in a multi-crate environment. Data driven event building functionality is also used to combine data from different detector technologies. Monitoring software provides a framework for on-line analysis during detector test. Monitoring applications have been developed for noise and cosmic tests and for pulse runs. The PERSINT event display has been interfaced to the monitoring system to provide an on-line event display for cosmic runs in the ATLAS pit.
        Speaker: Dr Enrico Pasqualucci (Istituto Nazionale di Fisica Nucleare (INFN), Roma)
        Paper
        Poster
      • 245
        Application of the Classification Tree technology to automatic pattern recognition in DZero and GLAST
        We have developed a package that trains and applies boosted classification trees, a technology long used by the statistics community, but only recently being explored by HEP. We will discuss its design (Object-Oriented C++), and show two examples of its use: to detect single top production in DZERO events, and for background rejection in GLAST.
        Speaker: Toby Burnett (University of Washington)
      • 246
        b-quark identification at DØ
        DØ, one of the collider detectors at Fermilab's Tevatron, depends on efficient and pure b-quark identification for much of its high-pT physics program. DØ currently has two algorithms, one based on impact parameter and the other on explicit reconstruction of the B hadron's decay vertex. A third, combined algorithm is under development. DØ certifies all of its b-quark tagging algorithms before they can be used in an analysis: this involves determining efficiencies, fake rates, and, most difficult of all, systematic errors. Determining these with enough accuracy requires running over millions of events. There is also a ROOT-based infrastructure used by the collaboration to run the various algorithms and correctly compute fake rates and expected efficiencies as well as present basic performance plots. We will present an overview of the algorithms and tools, how the efficiencies are calculated, and the design of some of the more complex parts of the system.
        Speaker: Gordon Watts (DZERO Collaboration)
      • 247
        Behind the Scenes Look at Integrity in a Permanent Storage System
        Fermilab provides a primary and tertiary permanent storage facility for its High Energy Physics program and other world wide scientific endeavors. The lifetime of the files in this facility, which are maintained in automated robotic tape libraries, is typically many years. Currently the amount of data in the Fermilab permanent store facility is 3.3 PB and growing rapidly. The Fermilab "enstore" software provides a file system based interface to the permanent store. While access to files through this interface is simple and straightforward, there is a lot that goes on behind the scenes to provide reliable and fast file access, and to ensure file integrity and high availability. This paper discusses the measures enstore takes and the administrative steps that are taken to assure users' files are kept safe, secure, and readily accessible over their long lifetimes. Techniques such as automated write protection, randomized file and tape integrity audits, tape lifetime strategies, and metadata protection are discussed in detail.
        Speaker: Dr Gene Oleynik (Fermilab)
      • 248
        Cluster distributed dynamic storage
        The HEP department of the University of Manchester has purchased a 1000-node cluster. The cluster will be accessible to various VOs through EGEE/LCG grid middleware. One of the interesting aspects of the equipment bought is that each node has 2x250 GB disks, leading to a total of approximately 4TB of usable disk space. The space is intended to be managed using dCache and its resilience features. The following describes the different dCache configurations and the disk and network layouts adopted to exploit this space. Different configurations can be used to target different use cases. An alternative method based on HTTP technology will also be described.
        Speaker: Dr Alessandra Forti (University of Manchester)
        Paper
        Poster
      • 249
        CMS Grid Computing in the Spanish Tier-1 and Tier-2 Sites
        CMS has chosen to adopt a distributed model for all computing in order to cope with the requirements on computing and storage resources needed for the processing and analysis of the huge amount of data the experiment will be providing from LHC startup. The architecture is based on a tier-organised structure of computing resources, based on a Tier-0 centre at CERN, a small number of Tier-1 centres for mass data processing, and a relatively large number of Tier-2 centres where physics analysis will be performed. The distributed resources are connected using high-speed networks and are operated by means of Grid toolkits and services. We present in this paper, using the Spanish Tier-1 (PIC) and Tier-2 (federated CIEMAT-IFCA) centres as examples, the organization of the computing resources together with the CMS Grid Services, built on top of generic Grid Services, required to operate the resources and carry out the CMS workflows. We also present the current Grid-related computing activities performed at the CMS computing sites, like high-throughput and reliable data distribution, distributed Monte Carlo production and distributed data analysis, where the Spanish sites have traditionally played a leading role in development, integration and testing.
        Speaker: Dr Jose Hernandez (CIEMAT)
        Paper
        Poster
      • 250
        Contribution of Condor and GLOW to LHC Computing
        The University of Wisconsin campus research computing grid is an offshoot of the Condor project, which provides middleware for many worldwide computing grids. The Grid Laboratory of Wisconsin (GLOW) and other UW based computing facilities exploit Condor technologies to provide research computing for a variety of fields including high energy physics projects on the UW campus. The Condor/GLOW project provided the largest amount of opportunistic resources to the LHC Monte Carlo simulations, becoming the leading provider of computing cycles to both the CMS and ATLAS experiments. Together they have provided over 300 years of CPU in 2005 alone, enabling full simulation of over 20 million events each for the CMS and ATLAS experiments at the LHC. The CMS Tier-2 center in the physics department is being built up to exploit the UW campus grid to further enhance the use of opportunistic resources for CMS computing. GLOW also serves chemists, chemical engineers, biologists, medical physicists and astrophysicists. In addition to building and using the UW campus grid, we have also developed inter-campus grid job flocking technologies, which we are using to rapidly aggregate large resources to handle larger than normal peak loads. We have tested these technologies with our connection to the Open Science Grid (OSG) and the Harvard University facility called the Crimson-grid. In this paper we will describe the UW campus grid model, the facilities, and its performance. We also discuss the role of the CMS Tier-2 computing center at UW.
        Speaker: Prof. Sridhara Dasu (UNIVERSITY OF WISCONSIN)
      • 251
        COOL Performance and Distribution Tests
        In April 2005, the LCG Conditions Database Project delivered the first production release of the COOL software, providing basic functionalities for the handling of conditions data. Since that time, several new production releases have extended the functionalities of the software. As the project is now moving into the deployment phase in Atlas and LHCb, its priorities are the optimization and validation of the software performance and deployment configuration, both locally at Tier0 and in a distributed environment. This poster presentation will review the most important tests which have been performed in this context.
        Speaker: Dr Andrea Valassi (CERN)
        Poster
      • 252
        CRAB Usage and jobs-flow Monitoring
        CMS is one of the four experiments expected to take data at the LHC. Of the order of a few petabytes of data per year will be stored at several computing sites all over the world. The collaboration has to provide tools for accessing and processing the data in a distributed environment, using the grid infrastructure. CRAB (CMS Remote Analysis Builder) is a user-friendly tool developed by INFN within CMS to help the user submit jobs to the grid, covering input data discovery, job creation, job submission, monitoring of the job status, output retrieval and, finally, handling of the user output. In this paper we describe how we monitor the use of CRAB by means of an application which collects job information from submission time to output retrieval time and sends it to a database. We also present some analysis of the data collected during the second half of 2005, aimed at understanding the performance of the whole system and at finding possible bottlenecks, in order to understand how to improve the system.
        Speaker: Dr Daniele Spiga (INFN & Università degli Studi di Perugia)
        Paper
        Poster
      • 253
        Deploying an LCG-2 Grid Infrastructure at DESY
        DESY is one of the world's leading centers for research with particle accelerators and a center for research with synchrotron light. The hadron-electron collider HERA houses four experiments which are taking data and will be operated until mid 2007. DESY has been operating an LCG-based Grid infrastructure since 2004, which was set up in the context of the EU e-science project EGEE. The HERA experiments H1 and ZEUS as well as the International Linear Collider (ILC) community have started to massively use the Grid for Monte Carlo production. For these groups, Virtual Organizations (VO) are hosted at DESY, including all necessary Grid services. These global VOs are meanwhile supported by many international LCG sites. DESY currently plans for participation in one of the LHC experiments, ATLAS or CMS. It has already been decided that DESY will become an LCG Tier-2 center for both experiments. The Tier-2 activities will be merged with the Grid infrastructure in operation. In the contribution to CHEP2006 we will give an overview of the Grid infrastructure and discuss deployment and operational aspects. Main emphasis will be put on the details of the set-up of a complete production-grade Grid with all services and its integration into a computer center infrastructure with regard to the Tier-2 plans at DESY.
        Speaker: Dr Andreas Gellrich (for the Grid team at DESY)
      • 254
        Developing a ROOT interface for gLite
        The D-Grid initiative, following similar programs in the USA and the UK, aims to set up a nationwide German Grid infrastructure. Within work package 3 of the HEP Community Grid, distributed analysis tools that make use of grid resources are being developed. A starting point is the analysis framework ROOT. A set of abstract ROOT classes (TGrid ...) provides the user interface to enable Grid access directly from within ROOT. A concrete implementation already exists for the ALICE Grid environment AliEn. We are developing an interface to the common HEP Grid middleware gLite. This includes querying the gLite File Catalogue, access to individual files, job submission, getting the job status, and retrieving job output. Extensive tests will be done within the gLite testbed of the DECH VO (EGEE). First interactive analysis environments have been set up at single sites using the Parallel ROOT Facility, PROOF. This will be extended to several sites using existing Grid middleware, with the aim of dynamically generating Grid Analysis Clusters. The current status of the work is presented in the poster.
        Speaker: Dr Kilian Schwarz (GSI)
        Poster
      • 255
        Distributed Analysis Jobs with the ATLAS Production System
        The ATLAS production system provides access to resources across several grid flavors. The system has evolved based on the experience from the last data challenge. While key aspects of the old system are kept (supervisor and executors), new implementations of the components aim for more stable and scalable operation. An important aspect is also the integration with the new data management system, Don Quijote 2. A graphical user interface supports the interaction of users with the system. The system provides direct support for user analysis jobs, including user job definitions and authenticated access based on Globus proxy certificates. An initial version of this system is currently under test on the LCG infrastructure and first user experiences have been collected. Currently the turnaround time for user jobs is under study. Special consideration is given to the operation of the system in parallel to standard ATLAS production. An issue of concern is also the overall scalability of the system. It is planned to provide it as a service to all ATLAS users on a short time scale.
        Speaker: Santiago Gonzalez De La Hoz (European Organization for Nuclear Research (CERN))
        Paper
        Poster
      • 256
        Distributing software applications based on runtime environment
        Packaging and distribution of experiment-specific software becomes a complicated task when the number of versions and external dependencies increases. In order to run a single application, it is often enough to create an appropriate runtime environment that ensures the availability of the required shared objects and data files. The idea of distributing software applications based on the runtime environment is employed by the Distribution After Release (DAR) tool. DAR automatically replicates an application's runtime environment based on a reference software installation. Assuming that the software is relocatable, applications can be packaged into a completely self-consistent "darball" and executed on any computing node which is binary compatible with the reference software installation. Such a lightweight distribution can be used on opportunistic Grid resources to avoid the excessive effort of a complete installation of experiment-specific software. For over three years, the DAR tool has been successfully used by CMS for Monte Carlo mass production, helping physicists to get results earlier. In version 2, DAR was completely redesigned, optimized, and enriched with new features, ready to meet future challenges. The paper presents the general concept of the tool and the new features available in DAR 2.
        Speaker: Natalia Ratnikova (FERMILAB)
        Paper
        Poster
      • 257
        Effect of dynamic ACL (access control list) loading on performance of Cisco routers.
        An ACL (access control list) is one of the few tools that network administrators often use to limit access to various network objects, e.g. to restrict access to certain network areas for specific traffic patterns. ACLs are also used to control traffic forwarding, e.g. for implementing so-called policy-based routing. Nowadays the demand is to update ACLs dynamically with programmable tools at the lowest possible latency. At Fermilab we have about four years of experience in dynamically reconfiguring the network infrastructure. However, dynamic updates also introduce a significant challenge for the performance of networking devices. This article introduces the results of our research and practical experience in dynamically configuring the network infrastructure using various types of ACLs. The questions we try to answer are: what is the maximum size of an ACL; how frequently can it be downloaded without impact on the router's CPU utilization and forwarding capabilities; how do updates of active versus passive ACLs compare; and how do updates of multiple ACLs behave.
        Speaker: Mr Andrey Bobyshev (FERMILAB)
        Paper
        Poster
      • 258
        Enhancing SSL Performance
        The most commonly deployed library for handling Secure Sockets Layer (SSL) and Transport Layer Security (TLS) is OpenSSL. The library is used by the client to negotiate connections to the server. It also offers features for caching parts of the required information, thus speeding up the process and reducing the cost of renegotiation. These features are generally not fully exploited. This paper presents results of performance tests, comparing the effects of caching information on the client side. Since OpenSSL and libraries built on OpenSSL (e.g. Globus GT2) are ubiquitous, this work is relevant to any Grid deployment that uses them. (An illustration of client-side session reuse follows this entry.)
        Speaker: Dr Jens Jensen (Rutherford Appleton Laboratory)
        Paper
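        As an illustration of client-side session caching, the sketch below uses Python's ssl module (which wraps OpenSSL) to reuse a negotiated session on reconnection and so avoid a full handshake; the host name is a placeholder and the example is not taken from the paper.

        # Illustration with Python's ssl module (not the OpenSSL C API used in the paper):
        # reusing the negotiated session on reconnection avoids a full handshake.
        import socket, ssl

        HOST = "www.example.org"                     # placeholder host
        ctx = ssl.create_default_context()
        ctx.maximum_version = ssl.TLSVersion.TLSv1_2 # keep session handling simple for the demo

        def connect(session=None):
            sock = socket.create_connection((HOST, 443))
            return ctx.wrap_socket(sock, server_hostname=HOST, session=session)

        with connect() as first:
            cached_session = first.session           # keep the session on the client side

        with connect(session=cached_session) as second:
            print("session reused:", second.session_reused)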
      • 259
        Exploiting Switched Optical Lightpaths for the ATLAS Experiment
        The ESLEA (Exploitation of Switched Lightpaths for E-science Applications) project has been working to put switched optical lightpath technology at the service of key large scientific projects. Central to the activity is the provision of services to the ATLAS experiment. The project is facing the practical problems of finding the best way of interfacing the power (but also the restrictions) of the technology to the ATLAS production and analysis service. The work has been using the UKLight lightpath infrastructure and the international links from UKLight via Netherlight to CERN. The focus is not only on the Tier-0 to Tier-1 traffic but also on the considerable traffic between the Tier-1 and Tier-2 sites, and is moving to address the issues of a distributed Tier-2.
        Speakers: Mr Brian Davies (LANCASTER UNIVERSITY), Dr Roger JONES (LANCAS)
      • 260
        Failure Management in the London Distributed Tier 2
        The LCG [1] has adopted a hierarchical Grid computing model which has a Tier 0 centre at CERN, national Tier 1 centres and regional Tier 2 centres. The roles of the different Tier centres are described in the LCG Technical Design Report [2] and the levels of service required from each level of Tier centre are described in the LCG Memorandum of Understanding [3]. Many of the Tier 2 centres are formed by federating the resources belonging to geographically distributed institutes in a given region. The institutes within such a federation are able to provide different levels of resources and typically will have different levels of expertise. Providing a good level of service in such situations is challenging. In this context, the London Tier2 (LT2) [4] is one of the four federated Tier 2 centres within the GridPP [5] collaboration in the UK. The LT2 is distributed between five institutes in the London area and currently totals around 1 Mega Spec Int 2000 [6]. In this paper we analyze how we can minimize the time to solve LT2 failures within the constraints of the available human resources and their mobility. The analysis takes into account the time to travel between institutes, the type of problems each support person can solve, and their availability. We demonstrate how to create a hierarchy of support staff to solve an identified problem. We also provide an estimate of the time to solve future LT2 failures. This is based on failure rates extracted from the monitoring information and known response times. We suggest this failure management method as a model for any distributed Tier2. [1] LCG http://lcg.web.cern.ch/LCG/ [2] LHC Computing Grid, Technical Design Report, LCG-TDR-001, CERN-LHCC-2005-024. [3] http://lcg.web.cern.ch/LCG/C-RRB/MoU/LCG_T0-2_draft_final_051012.pdf [4] LT2, http://www.gridpp.ac.uk/tier2/london/ [5] GridPP, UK computing for particle physics http://www.gridpp.ac.uk/ [6] Spec Int 2000 http://www.spec.org/cpu2000/
        Speakers: Dr David Colling (Imperial College London), Dr Olivier van der Aa (Imperial College London)
      • 261
        Flexible notification service for Grid monitoring events.
        Monitoring activity plays an essential role in Grid computing: it deals with the dynamics, variety and geographical distribution of Grid resources in order to measure important parameters and provide relevant information about a Grid system related to aspects such as usage, behaviour and performance. One of the basic requirements for a monitoring service is the capability of detecting and notifying fault situations and user-defined events. With regard to this aspect, we describe the architecture and implementation of a flexible notification service designed to be incorporated, in a modular way, into a Grid monitoring tool. A Grid notification service should be able to receive data from several resources, filter them against a set of user specifications, aggregate and customize the filtered results and, finally, deliver them only to interested users. A suitable model is represented by the publish/subscribe system, based on an event-driven mechanism and useful for distributed data, regardless of the recipients' identity or location. In such a model, the involved entities are publishers and subscribers that exchange messages through a broker; messages from publishers are named events, while messages from subscribers are named subscriptions. The broker implements a filter algorithm in order to perform matching between events and subscriptions. Today, the increasing success of XML as a standard for data representation and exchange over the Internet has led to increasing interest in filtering and content-based routing of XML data; events are formalized as XML documents, while subscriptions are expressed through a language able to specify constraints on both event structure and content. After the description of both requirements and architecture, we present a multithreaded implementation of our Notification Service (a toy publish/subscribe sketch follows this entry). We also report on experimental results in the context of the integration of this system with the GridICE Monitoring System, a distributed monitoring tool designed for Grid systems.
        Speaker: Ms Natascia De Bortoli (INFN - Naples)
        Paper
        Poster
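        The toy sketch below illustrates the publish/subscribe idea described above: events are XML documents, subscriptions filter them by structure and content, and matching events are delivered to callbacks. It is purely illustrative and unrelated to the GridICE implementation; all element and attribute names are invented.

        # Toy publish/subscribe broker: events are XML documents, subscriptions are
        # ElementTree path expressions plus a user-defined predicate.
        import xml.etree.ElementTree as ET

        class Broker:
            def __init__(self):
                self.subscriptions = []            # (path, predicate, callback)

            def subscribe(self, path, predicate, callback):
                self.subscriptions.append((path, predicate, callback))

            def publish(self, event_xml: str):
                event = ET.fromstring(event_xml)
                for path, predicate, callback in self.subscriptions:
                    for element in event.findall(path):
                        if predicate(element):
                            callback(event, element)

        broker = Broker()
        broker.subscribe(
            "resource/metric[@name='cpu_load']",
            lambda e: float(e.get("value")) > 0.9,                  # user-defined filter
            lambda event, e: print("ALERT: high CPU load on", event.get("site")),
        )

        broker.publish(
            '<event site="ce01.example.org">'
            '<resource><metric name="cpu_load" value="0.95"/></resource>'
            '</event>'
        )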
      • 262
        GRIDCC-Bringing instrumentation (back) onto the Grid
        While remote control of, and data collection from, instrumentation was part of the initial Grid concept, most recent Grid developments have concentrated on the sharing of distributed computational and storage resources. The GRIDCC project is working to bring instrumentation back to the Grid alongside compute and storage resources. To this end we have defined an Instrument Element (IE) as a virtualization of a physical instrument. The IE can be regarded as analogous to the Compute Element (CE) and Storage Element (SE) defined in other projects. Much of the power of using instrumentation over a Grid is only realized when it is used in conjunction with existing compute and storage resources. GRIDCC achieves this by building on the gLite infrastructure being developed by the EGEE project. Many features that can be regarded as desirable within other Grid projects become essential when instrumentation is introduced and so must be addressed by the GRIDCC project. These include the ability to work collaboratively with colleagues in remote locations, guaranteed quality of service and advance reservation of resources. The GRIDCC project will produce a generic solution for instrumentation on a Grid. We have a number of exemplar instrumentation use cases selected to cover the full range of system requirements. We plan to implement our solution on a limited number of these instrument use cases. One such "instrument" is the CMS detector, where it is planned to implement the run controller, which is responsible for configuring, controlling and monitoring the experiment, as an IE. A second example is the Synchrotron Radiation Storage Ring Elettra, where the accelerator will be operated remotely using GRIDCC technology. In this paper we describe the GRIDCC architecture and the progress towards its implementation as the project approaches the midpoint of its three years of funding.
        Speaker: Dr David Colling (Imperial College London)
        Paper
        Poster
      • 263
        High Level Trigger tracking at CMS - b and tau identification
        The CMS detector is a general purpose experiment for the LHC. At the designed maximum luminosity more than 10**9 events/second will be produced, while the data acquisition system will be able to manage a bandwidth of 100 Hz. The trigger strategy for CMS is organised in two steps: a first-level hardware trigger is implemented taking advantage of the fast-response detectors, such as the muon chambers and the calorimeter systems; the event rate in this step will be reduced by a factor of 10**4. The second-level step, called the High Level Trigger (HLT), is based only on software algorithms; the aim is to reduce the event rate to 100 Hz. In the HLT process all the CMS subdetectors participate with a very fast reconstruction of the relevant physics objects, thus allowing a large number of specific final states to be selected. In this talk we will concentrate on the performance reached for tau-lepton and b-quark identification, which are very important for several future physics studies.
        Speaker: Dr Livio Fano' (INFN - Universita' degli Studi di Perugia)
      • 264
        IGUANA Graphical User Analysis Project: New Developments
        IGUANA is a well-established generic interactive visualisation framework based on a C++ component model and open-source graphics products. We describe developments since the last CHEP, including: the event display toolkit, with examples from CMS and D0; the generic IGUANA visualisation system for GEANT4; integration of ROOT and Hippoplot with IGUANA; and a new lightweight and portable IGUANA Web browser client. Items covered include: the IGUANA design, API and scripting services; the Qt-based graphical user interfaces; OpenInventor/OpenGL 3D and 2D graphics; HEP-specific extensions for tracks, vertices, jets, etc.; vector graphics output; textual, tabular and hierarchical data views; the application control centre; and the novel Asynchronous Javascript/XML (AJAX) Iguana Web client. We demonstrate the use of IGUANA with several applications built for D0 and CMS, including displays of the first real data from the CMS Cosmic Challenge, using the recently re-engineered framework and Event Data Model.
        Speaker: Dr Lucas Taylor (Northeastern University, Boston)
      • 265
        Investigating the behavior of network aware applications with flow-based path selection.
        To satisfy the requirements of US-CMS, D0, CDF, SDSS and other experiments, Fermilab has established an optical path to the StarLight exchange point in Chicago. It gives access to multiple experimental networks, such as UltraScience Net, UltraLight, UKLight, and others, with very high bandwidth capacity but generally sub-production-level service. The ongoing LambdaStation project is developing an admission control system for interfacing production mass storage clusters with these experimental networks to enable bulk data movement. The goal is to design a system capable of doing per-flow based forwarding. One of the important sidelights of this project is investigation of the behavior of end-node operating systems and applications in the presence of per-flow rerouting. This article will introduce our findings and the current status of the research in this area. Our focus is on Linux as the operating system, and SRM (Storage Resource Manager), GridFTP, and dCache as network-aware applications.
        Speaker: Mr Andrey Bobyshev (FERMILAB)
        Paper
        Poster
      • 266
        Job Efficiencies on the RAL Tier-1 Batch Farm
        In preparing the Grid for LHC start-up, and as part of the early production service (under the UK GridPP project), we calculate efficiencies for jobs submitted to the RAL Tier-1 Batch Farm. Early usage of the Farm was characterised by high occupancy but low efficiency of Grid jobs; an improvement has been observed over the last six months. This behaviour has been examined by calculating overall efficiencies, defined as the ratio of total CPU time to total elapsed wall time. This is done on a monthly basis for each virtual organisation (VO) and for the Farm as a whole (a minimal sketch of this calculation follows this entry). The generation of the statistics is fully automatic and is based on querying job parameters stored in a MySQL database. The data give an overview of how efficiently the Farm is being used, and identify VOs whose efficiency is low. Further information is gained from per-VO scatter plots of CPU time against efficiency for each job. In particular, these plots can identify classes of jobs that terminate because CPU time or elapsed wall time limits are hit in the batch system. Many factors can lead to low job efficiencies, including local execution problems (e.g., high rates of disk I/O) and Grid-related problems (e.g., transferring remote data). As the efficiency data provide information about job execution on the Farm, they are of use to both site administrators and end users.
        Speaker: Dr Matthew Hodges (RAL - CCLRC)
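        A minimal sketch of the efficiency definition used in the abstract above: per-VO, per-month efficiency as the ratio of summed CPU time to summed wall time. The job records and field names here are hypothetical; the real tool queries a MySQL database of job parameters instead.

        # Illustrative only: per-VO, per-month efficiency = sum(CPU time) / sum(wall time).
        from collections import defaultdict

        def monthly_vo_efficiency(jobs):
            """jobs: iterable of dicts with 'vo', 'month' (YYYY-MM), 'cpu_s', 'wall_s' (assumed fields)."""
            cpu = defaultdict(float)
            wall = defaultdict(float)
            for job in jobs:
                key = (job["vo"], job["month"])
                cpu[key] += job["cpu_s"]
                wall[key] += job["wall_s"]
            return {key: cpu[key] / wall[key] for key in wall if wall[key] > 0}

        jobs = [
            {"vo": "atlas", "month": "2005-11", "cpu_s": 3000, "wall_s": 10000},
            {"vo": "atlas", "month": "2005-11", "cpu_s": 9000, "wall_s": 10000},
            {"vo": "lhcb",  "month": "2005-11", "cpu_s": 9500, "wall_s": 10000},
        ]
        print(monthly_vo_efficiency(jobs))  # {('atlas', '2005-11'): 0.6, ('lhcb', '2005-11'): 0.95}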
      • 267
        Large cluster management at a Tier2.
        The HEP department of the University of Manchester has purchased a 1000-node cluster. The cluster will be accessible to various VOs through EGEE/LCG grid middleware. In this talk we will describe the management, security and monitoring setup we have chosen for the administration of the cluster with minimum effort and mostly remotely: from remote power-up to centralised installation and updates, and from temperature and current monitoring to network monitoring.
        Speaker: Mr Colin Morey (University of Manchester)
        Paper
        Poster
      • 268
        LCG and ARC middleware interoperability
        LCG and ARC are two of the major production-ready Grid middleware solutions, used by hundreds of HEP researchers every day. Even though the middlewares are based on the same technology, there are substantial architectural and implementation differences. An ordinary user faces difficulties when trying to cross the boundaries of the two systems: ARC clients have so far not been capable of accessing LCG resources, and vice versa. After presenting the similarities and differences of the LCG and ARC middlewares, we will focus on strategies for implementing interoperability layers over the two middlewares. The most important areas are the job submission and management and the information system components. The basic requirement for the interoperability layer implementation is the capability of transparent cross-Grid job submission from both ARC and LCG.
        Speaker: Dr Michael Gronager (Copenhagen University)
        Paper
        Poster
      • 269
        LCG-RUS: An Implementation of GGF RUS Enabling Aggregative Accounting for LCG
        The LCG-RUS project implemented the Global Grid Forum's Resource Usage Service standard and made grid resources for the LHC accountable in a common schema (GGF-URWG). This project is part of the UK e-Science programme, with the purpose of taking grid computing from e-Research towards a computational market. LCG-RUS is complementary work to its predecessor, the MCS (Market for Computational Services) RUS project, which implements the RUS specification as a plain Web service. Considering the international character of LCG, LCG-RUS addresses the requirements of the grid project as a whole, of funding bodies, experiments and users, by combining usage records from all three peer infrastructures (OSG, NorduGrid, LCG/EGEE) and presenting an aggregative view of resource usage for the LHC VOs. The current record sources of LCG are mainly DGAS, SGAS and APEL, which provide real-time accounting and accounting after the event, respectively. A Role-Based Access Control (RBAC) mechanism ensures authorisation of user agents at different levels to access different operations/portTypes. Finally, statistical tools in the client tier provide aggregative analysis and present the results in graphical views.
        Speaker: Akram Khan (Brunel University)
      • 270
        LHC-OPN Network at GridKa -- incl. 10Gbit LAN/WAN evaluations
        Besides a brief overview of the GridKa private and public LAN, the integration into the LHC-OPN network as well as the links to the T2 sites will be presented, covering the physical network layout as well as their higher-protocol-layer implementations. Results of the feasibility discussion of dynamic routing for all FZK connections, including all the different types of LHC network links (lightpath, MPLS tunnels, routed IP), will be part of the presentation. An evaluation will show the quality and quantity of the current 10GE link from GridKa to CERN traversing a multi-NREN backbone structure via an MPLS tunnel. The evaluation will be contrasted with the results of first tests via the LHC-OPN point-to-point lightpath GridKa - CERN. The equipment of the first 10GE test setup is based on IBM/Intel hardware at GridKa and HP/Chelsio at CERN. The second testbed is more or less symmetric, with 64-bit HP Itanium nodes with Chelsio 10GE NICs at both sites. A study of the capabilities of the nodes at GridKa will be presented, revealing their limitations, and the benefit of TCP offload engines (TOE) will be discussed.
        Speaker: Bruno Hoeft (Forschungszentrum Karlsruhe)
        Paper
        Poster
      • 271
        Managing in a Grid Environment – A GridPP perspective
        Based on experiences from the last 18 months of UK Particle Physics Grid (GridPP) operation, this paper examines several key areas for the success of the LHC Computing Grid. Among these are the necessity of establishing useful metrics (from the job level to overall operations), accurate monitoring at both the grid and local fabric levels, and mechanisms to rapidly address potential or actual failures to meet agreed service levels. The paper explains how GridPP is approaching the area of national resource management and usage in the context of an Enabling Grids for E-sciencE (EGEE) Regional Operations Centre (ROC). Operations data are used to indicate how the deployment model and utilisation of resources are changing as experience is gained with the grid middleware and applications. The final part of the paper reviews future deployment planning and explores some of the consequences of the LHC experiment computing models.
        Speaker: Dr Jeremy Coles (GridPP)
        Paper
        Poster
      • 272
        Managing Workflows with ShREEK
        The Shahkar Runtime Execution Environment Kit (ShREEK) is a threaded workflow execution tool designed to run and intelligently manage arbitrary task workflows within a batch job. The kit consists of three main components: an executor that runs tasks, a control-point system that allows reordering of the workflow during execution, and a thread-based pluggable monitoring framework that offers both event-driven and periodic monitoring (an illustrative sketch of this split follows this entry). Developed specifically to address the challenges of running High Energy Physics processing jobs in complex workflow arrangements, with highly varied monitoring needs, the ShREEK toolkit is in use at multiple HEP experiments, and can be adapted for a variety of other uses, such as wrapping batch jobs to provide detailed interactive monitoring for administrators and users alike. In this presentation we will discuss the architecture of the ShREEK system and the experience of using it in several experiment workflows.
        Speaker: Dr David Evans (FERMILAB)
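        An illustrative sketch (not ShREEK code) of the executor/monitor split described in the abstract above: an ordered list of tasks is run inside a batch-job wrapper while a background thread performs periodic monitoring. All function and task names are invented.

        # Illustrative sketch only: threaded workflow execution with a periodic monitor.
        import threading
        import time

        def run_workflow(tasks, monitor, poll_interval=1.0):
            """tasks: list of (name, callable); monitor: callable receiving a state snapshot."""
            state = {"current": None, "done": []}
            stop = threading.Event()

            def monitor_loop():
                while not stop.is_set():
                    monitor(dict(state))          # periodic monitoring hook
                    stop.wait(poll_interval)

            watcher = threading.Thread(target=monitor_loop, daemon=True)
            watcher.start()
            try:
                for name, task in tasks:
                    state["current"] = name       # event-style update before each task
                    task()
                    state["done"].append(name)
            finally:
                stop.set()
                watcher.join()
            return state["done"]

        if __name__ == "__main__":
            tasks = [("stage-in", lambda: time.sleep(0.2)),
                     ("reconstruct", lambda: time.sleep(0.5)),
                     ("stage-out", lambda: time.sleep(0.2))]
            run_workflow(tasks, monitor=lambda s: print("monitor:", s), poll_interval=0.3)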
      • 273
        Midrange computing cluster architectures for data analysis in High Energy Physics
        High Energy Physics analysis is often performed on midrange computing clusters (10-50 machines) in relatively small physics groups (3-10 physicists). Such clusters are usually built from commodity equipment and run under one of several Linux flavors. In an environment of limited resources, it is important to choose the "right" cluster architecture to achieve maximum performance. We will describe several cluster architectures, show possible drawbacks and how to avoid them.
        Speaker: Mr Andrey Shevel (Petersburg Nuclear Physics Institute (Russia))
        Paper
      • 274
        MonALISA : A Distributed Service for Monitoring, Control and Global Optimization
        The MonaLISA (Monitoring Agents in A Large Integrated Services Architecture) system provides a distributed service for monitoring, control and global optimization of complex grid systems and networks for high energy physics, and many other fields of data-intensive science. It is based on an ensemble of autonomous multi-threaded, agent-based subsystems which are registered as dynamic services and are able to collaborate and cooperate in performing a wide range of monitoring and decision tasks in large scale distributed applications. MonALISA’s services also are able to be discovered and used by other services or clients that require such information. An essential part of managing global-scale systems such as grids is a monitoring system that is able to monitor and track the many site facilities, networks, and tasks in progress, all in real time. The monitoring information gathered also is essential for developing the required higher level services, and components of the Grid system that provide decision support, and some degree of automated decisions, to help maintain and optimize workflow through the Grid. The MonALISA software architecture simplifies the construction, operation and administration of complex systems by: (1) allowing registered services to interact in a dynamic and robust way; (2) allowing the system to adapt when devices or services are added or removed, with no user intervention; (3) providing mechanisms for services to register and describe themselves, so that services can intercommunicate and use other services without prior knowledge of the services' detailed implementation. MonALISA’s flexible access to any monitoring information, and its support for alarm triggers and agents able to take immediate actions in response to abnormal system behavior, are being used to help manage and improve the working efficiency of the site facilities, and the overall Grid system being monitored. These management and global optimization functions are performed by higher level agent-based services. Current applications of MonALISA’s higher level services include optimized dynamic routing of different types of applications, and distributed job scheduling, among a large set of grid facilities. MonALISA is currently used around the clock in several major grid projects, and has proven to be both highly scalable and reliable. More than 250 services are running at sites around the world, collecting information about computing facilities (more than 15 000 nodes), local and wide area network traffic, and the state and progress of the many thousands of grid jobs executing at any one time. It is also used to collect accounting information in grid systems.
        Speaker: Dr Iosif Legrand (CALTECH)
      • 275
        Moving applications in a multi-language environment to 64-bit architectures
        In building a software repository of simulation and reconstruction tools for a future International Linear Collider (ILC) detector, we started with applications based on code used in the LEP experiments, with Fortran and C as programming languages. All future software development for the ILC is done using modern OO languages, mainly C++ and Java, but for comparisons and to provide a smooth transition the old tools are still in use. This report will give an overview of the problems and solutions encountered in adapting the software to 64-bit architectures. Two packages, Brahms (a GEANT3-based simulation and reconstruction package) and LCIO (a multi-language interface to a generic data model of basic I/O classes used in the ILC software), are considered in some detail.
        Speaker: Harald Vogt (DESY Zeuthen)
      • 276
        Muon detector calibration in the ATLAS experiment: online data extraction and data distribution
        In the ATLAS experiment, fast calibration of the detector is vital to feed prompt data reconstruction with fresh calibration constants. We present the use case of the muon detector, where a high rate of muon tracks (of small data size) is needed to fulfil the calibration requirements. The ideal place to obtain data suitable for muon detector calibration is the second-level trigger, where the pre-selection of data by the first-level trigger allows the selection of all and only the hits from isolated muon tracks, and useful information can be added to seed the calibration procedures. The online data collection model for calibration data is designed to minimize the use of additional resources, without affecting the behaviour of the trigger/DAQ system. Collected data are then streamed to remote Tier-2 farms dedicated to detector calibration. Measurements on the pre-series of the ATLAS TDAQ infrastructure and on the standard LHC data distribution path are shown, proving the feasibility of the system.
        Speaker: Dr Enrico Pasqualucci (Istituto Nazionale di Fisica Nucleare (INFN), Roma)
        Paper
        Poster
      • 277
        New Geant4 physics processes for simulation at the electronvolt scale
        The extension of Geant4 simulation capabilities down to the electronvolt scale is required for precision studies of radiation effects on electronics and detector components, and for micro-/nano-dosimetry studies in various experimental environments. A project is in progress to extend the coverage of Geant4 physics to this energy range. The complexity of the problem domain is discussed - such as, for instance, the fact that in this energy range process models are material-dependent. The physics models of new processes implemented in the Geant4 Toolkit are presented: elastic scattering, charge increase/decrease, excitation, ionisation. A new approach to policy-based process design in Geant4 is described. Results concerning various processes at the electronvolt scale are presented.
        Speakers: Dr Barbara Mascialino (INFN Genova), Prof. Gerard Montarou (Univ. Blaise Pascal Clermont-Ferrand), Dr Maria Grazia Pia (INFN GENOVA), Dr Petteri Nieminen (ESA), Prof. Philippe Moretto (CENBG), Dr Riccardo Capra (INFN Genova), Dr Sebastien Incerti (CENBG), Ziad Francis (Univ. Blaise Pascal Clermont-Ferrand)
      • 278
        On Demand, Policy Based Monte Carlo Production and Tracking, Leveraging Clarens, MonALISA and RunJob
        We describe a set of Web Services created to support scientists in performing distributed production tasks (e.g. Monte Carlo). The Web Services described in this paper provide a portal for scientists to execute different production workflows, which can consist of many consecutive steps. The main design goal of the Web Services discussed is to provide controlled access for (multiple) sets of users in different roles (e.g. scientists, administrators, grid operators, …) to complex production workflows, without the added trouble of updating, configuring, and patching these ever-evolving applications, keeping the users focused on their core tasks (running production) while experts at the Tier-2 centers keep the software up to date. Once users execute a workflow they receive a tracking number that is used to track the job status, which is propagated through MonALISA. Job anomalies can be further investigated using the JobMon service. The Web Services have been implemented inside the Clarens Web Service framework. This Python (and Java) based framework provides, amongst others, x509 authorization, access control and VO management for its services. The Web Services discussed in this paper re-use several of these Clarens components in providing access control and usage quotas. Initially the services described in this paper were developed to support users in Monte Carlo production activities; however, due to their generic design, they can be used to expose other (potentially complex) applications to users, as will be shown in this paper.
        Speaker: Dr Frank van Lingen (CALIFORNIA INSTITUTE OF TECHNOLOGY)
        Paper
        Poster
      • 279
        Optimized access to distributed relational database system
        Efficient and friendly access to large amounts of data distributed over the wide area network is a challenge for the upcoming LCG experiments. The problem can be solved using current standard open technologies and tools. A JDBC-based solution has been chosen as the foundation of a comprehensive system for relational data access and management. Widely available open tools have been reused and extended to satisfy HEP needs. - An SQL backend has been implemented for the Abstract Interface for Data Analysis (AIDA), making relational data available via a standard analysis API. Interfaces to several languages, plugins for Java Analysis Studio as well as Web Service access are available and interfaced with the ATLAS Event Metadata (Tag) database. - Clustered JDBC (C-JDBC) from the ObjectWeb Consortium has been reused to enable transparent and optimized access to distributed relational databases. Two extensions have been developed: a query splitter, which sends a user query to a set of complementary databases and delivers the merged result back to the user (a minimal sketch of this idea follows this entry), and a predictive query engine, which uses monitoring and caching information to deliver fast approximate answers to time-consuming SQL queries. - The Octopus replication tool from the ObjectWeb Consortium has been extended with the specific requirements of the ATLAS experiment and used to replicate ATLAS data over a heterogeneous database network.
        Speaker: Dr Julius Hrivnac (LAL)
        Paper
        Poster
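        A minimal sketch, under assumed table layouts, of the query-splitter idea mentioned in the abstract above: the same SQL statement is sent to a set of complementary databases and the partial results are merged before being returned. In-memory SQLite databases stand in for the distributed back-ends; table and column names are invented.

        # Illustrative only: fan the same query out to several back-ends and merge the rows.
        import sqlite3

        def make_backend(rows):
            db = sqlite3.connect(":memory:")
            db.execute("CREATE TABLE event_tag (run INTEGER, event INTEGER, nmuon INTEGER)")
            db.executemany("INSERT INTO event_tag VALUES (?, ?, ?)", rows)
            return db

        def split_query(backends, sql, params=()):
            """Run the same SQL on every complementary backend and merge the results."""
            merged = []
            for db in backends:
                merged.extend(db.execute(sql, params).fetchall())
            return merged

        backends = [make_backend([(1, 10, 2), (1, 11, 0)]),
                    make_backend([(2, 20, 1), (2, 21, 3)])]
        rows = split_query(backends, "SELECT run, event FROM event_tag WHERE nmuon >= ?", (1,))
        print(rows)   # rows gathered from both back-ends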
      • 280
        PLAT – PBS Log Analysis Tools
        Many computing farms use PBSPro or its free version OpenPBS, or the Torque and Maui products, for local batch system management. These packages are delivered with graphical tools for a status overview, but summary and detailed reports from the accounting log files are not available. This poster describes a set of tools we are using for an overview of resource consumption over the last few hours and days (a simplified log-scanning sketch follows this entry). The tools can be run regularly to monitor finished jobs. They are able to send an alarm if some condition appears – typically, a large number of very short jobs on a worker node is a sign of a misconfigured node. No database is needed to run these tools. This simplifies the installation and usage of PLAT, but limits its use to statistics from tens of thousands of jobs.
        Speaker: Dr Jiri Chudoba (Institute of Physics, Prague)
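        A simplified sketch, not the PLAT code, of scanning accounting records (semicolon-delimited "timestamp;type;jobid;key=value ..." lines in the spirit of PBS accounting logs) and alarming when a worker node accumulates many very short jobs. Thresholds, field names and the record layout are illustrative assumptions.

        # Illustrative only: flag worker nodes with many very short finished jobs.
        from collections import Counter

        SHORT_JOB_SECONDS = 60    # illustrative threshold
        ALARM_THRESHOLD = 3       # short jobs per node before alarming (illustrative)

        def hms_to_seconds(hms):
            h, m, s = (int(x) for x in hms.split(":"))
            return 3600 * h + 60 * m + s

        def scan(lines):
            short_jobs = Counter()
            for line in lines:
                timestamp, rectype, jobid, attrs = line.strip().split(";", 3)
                if rectype != "E":                       # only job-end records
                    continue
                fields = dict(kv.split("=", 1) for kv in attrs.split())
                node = fields.get("exec_host", "unknown").split("/")[0]
                walltime = hms_to_seconds(fields.get("resources_used.walltime", "0:0:0"))
                if walltime < SHORT_JOB_SECONDS:
                    short_jobs[node] += 1
            return [node for node, n in short_jobs.items() if n >= ALARM_THRESHOLD]

        sample = [
            "02/13/2006 10:00:00;E;101.ce;user=alice exec_host=wn042/0 resources_used.walltime=00:00:05",
            "02/13/2006 10:00:10;E;102.ce;user=alice exec_host=wn042/1 resources_used.walltime=00:00:04",
            "02/13/2006 10:00:20;E;103.ce;user=alice exec_host=wn042/0 resources_used.walltime=00:00:06",
            "02/13/2006 11:00:00;E;104.ce;user=bob exec_host=wn007/0 resources_used.walltime=02:13:00",
        ]
        print(scan(sample))   # ['wn042']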
      • 281
        POOL Developments for Object Persistency into Relational Databases
        The LCG POOL project has recently moved its focus to the development of storage back-ends based on relational databases. Following the requirements of the LHC experiments, POOL has developed a framework for object persistency into relational schemas. This presentation will describe the main functionality of the package, explaining how the mechanism provided by POOL allows arbitrary user objects to be stored and retrieved efficiently. We will then present a few examples of the integration of the software in the data production systems of the LHC experiments, discussing in particular the aspects related to deployment.
        Speaker: Dr Giacomo Govi (CERN)
        Paper
        Poster
      • 282
        Prototyping production and analysis frameworks for LHC experiments based on LCG/EGEE/INFN-Grid middleware
        The production and analysis frameworks of the LHC experiments demand advanced features in the middleware functionality and a complete integration with the experiment-specific software environment. They also require an effective and distributed test platform where the integrated middleware functionality is verified and certified. The deployment of such solutions in a production infrastructure likewise requires measured performance, documentation and support for correct configuration and operation. In this paper we describe the solutions, tests and results achieved in collaboration with the ALICE, ATLAS, CDF, CMS and LHCb experiments to fulfil their production and analysis requirements. The middleware components considered have been the workload management system with the additional bulk-submission feature, the replica catalog interface, the Virtual Organization Management System (VOMS) role definition and its integration with the policy system, and job and application monitoring.
        Speaker: Maria Cristina Vistoli (Istituto Nazionale di Fisica Nucleare (INFN))
        Paper
        Poster
      • 283
        Role of Offline Software in ATLAS Detector Commissioning
        Commissioning of the ATLAS detector at the CERN Large Hadron Collider (LHC) includes, as partially overlapping phases, subsystem standalone work, integration of systems into the full detector, cosmics data taking, single beam running and finally first collisions. These tasks require services like DAQ with data recording to Tier0 and distributed data management, databases, histogramming and event displays, data reconstruction and analysis both online and offline. An important aspect is the early interplay of the offline, high-level trigger, and online software versions involved. This paper describes how the various ATLAS components are utilized from now until LHC operation.
        Speakers: Hans von der Schmitt (MPI for Physics, Munich), Rob McPherson (University of Victoria, TRIUMF)
        Poster
      • 284
        Schema Independent Application Server Development Paradigm
        The idea of an application database server is not new; it is a key element in multi-tiered architectures and business application frameworks. We present here a paradigm for developing such an application server in a completely schema-independent way. We introduce a generic Query Object Layer (QOL) and a set of Database/Query Objects (D/QO) as the key components of the multi-layer application server, along with a set of tools for generating such objects. In the Query Object Layer each database table is represented as a C++ object (Database Object), and structured complex queries spanning multiple tables are written as object representations, called Query Objects. All database operations (select/insert etc.) are performed via these objects. In general, developments of such servers tend to pre-identify interesting join conditions and hard-wire the queries for such Query Objects, for ease of development. We have tried to enhance this concept by generalizing the creation of such Query Objects based on existing or defined relations among the tables involved in the join, such as foreign-key relations and any other user-defined join condition, and by delaying and generalizing the creation of the actual SQL query until execution time (a minimal sketch of this idea follows this entry). This is an enormously complex task; joins with cyclic conditions and multiple relations to the same table are hard to convert into Query Objects. The task is divided into three major components: an SQL parser that reads in table definitions and creates C++ objects (Database Objects); a Query Object view creator that generates Query Objects according to existing and user-defined join conditions for multiple tables; and Object Layer algorithms that are generic enough to deal with any dataset or Query Object. In addition, the whole fabric of the application server is tied together by exchanging self-describing objects that do not need any changes in case of a schema change. The business logic layer can be built quickly for a known set of operations, written as "Managers", and the client interface is done through data structures that can also be semi-generated through the SQL parser. The process of adapting the system to a new schema is very fast, and the maintenance overhead is also very low.
        Speaker: Anzar Afaq (FERMILAB)
        Paper
        Poster
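        A hedged sketch, in Python rather than the C++ of the abstract above, of the core idea: tables are represented as objects, join relations are declared once, and the concrete SQL text is generated only at execution time. All class, table and column names are invented for illustration.

        # Illustrative only: lazy SQL generation from declared table objects and joins.
        class DatabaseObject:
            def __init__(self, table, columns):
                self.table, self.columns = table, columns

        class QueryObject:
            def __init__(self, objects, joins):
                self.objects = objects          # list of DatabaseObject
                self.joins = joins              # list of "a.col = b.col" join conditions

            def to_sql(self, selected, where=None):
                """Generate the SQL text only at execution time, from the declared relations."""
                sql = "SELECT {} FROM {}".format(", ".join(selected),
                                                 ", ".join(o.table for o in self.objects))
                conditions = list(self.joins) + ([where] if where else [])
                if conditions:
                    sql += " WHERE " + " AND ".join(conditions)
                return sql

        runs = DatabaseObject("runs", ["run_id", "start_time"])
        files = DatabaseObject("files", ["file_id", "run_id", "size"])
        qo = QueryObject([runs, files], joins=["runs.run_id = files.run_id"])
        print(qo.to_sql(["files.file_id", "runs.start_time"], where="files.size > 1000000"))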
      • 285
        ScratchFS: A File System To Manage Scratch Disk Space For Grid Jobs.
        Managing the temporary disk space used by jobs in a farm can be an operational issue. Efforts have been put into controlling this space through the batch scheduler, to make sure a job will use at most the requested amount of space and that this space is cleaned up after the end of the job. ScratchFS is a virtual file system that addresses this problem for grid as well as conventional jobs at the file system level. Designed to sit on top of a real file system, the virtualization layer is responsible for controlling access to the underlying file system. For conventional (i.e. local) jobs, access control to the disk space is based on standard Unix file permissions. In the case of grid jobs, ScratchFS uses only the grid credentials of the job to grant or deny access to the scratch disk space, in a way that is fully transparent from the application point of view. Therefore, several grid jobs submitted by different individuals can be executed on the same worker node, even if their grid identities are all mapped to the same local user identifier, while preserving the confidentiality of each job's scratch space. Designed to be extensible, it can be configured to collect information about file system usage and to enforce quotas, among other things. In addition, a simplified interface is provided to ease integration with a batch scheduler. In this paper we present how this virtualization layer is used by our local batch scheduler and we explain the implementation of the system in detail, along with some comments on its security and benchmark results compared to a real file system.
        Speaker: Mr Leandro Franco (IN2P3/CNRS Computing Centre)
        Paper
        Poster
      • 286
        Self-organized maps for tagging b jets associated with heavy neutral MSSM Higgs bosons
        B tagging is an important tool for separating LHC Higgs events with associated b jets from the Drell-Yan background. We extend the standard neural network (NN) approach using a multilayer perceptron for b tagging [1] to include self-organizing feature maps. We demonstrate the use of self-organizing maps (the SOM_PAK program package) and learning vector quantization (the LVQ_PAK package); a minimal sketch of the self-organizing-map update rule follows this entry. The background-discriminating power of these NN tools is compared with standard tagging algorithms. [1] A. Heikkinen and S. Lehti, Tagging b jets associated with heavy neutral MSSM Higgs bosons. Proceedings of ACAT 2005, May 22 - 27, DESY, Zeuthen, Germany.
        Speaker: Mr Aatos Heikkinen (Helsinki Institute of Physics)
        image
        image
        image
        image
        image
        LaTeX file
        Paper
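        Not the SOM_PAK/LVQ_PAK code: a minimal sketch of the self-organizing-map update rule (find the best-matching unit, then pull it and its neighbours towards the input vector), applied to made-up two-component jet feature vectors. All parameters and data are invented for illustration.

        # Illustrative only: 1-D self-organizing map training loop in plain Python.
        import math
        import random

        def train_som(data, n_units=8, epochs=50, lr0=0.5, radius0=2.0):
            dim = len(data[0])
            units = [[random.random() for _ in range(dim)] for _ in range(n_units)]
            for epoch in range(epochs):
                lr = lr0 * (1.0 - epoch / epochs)                    # shrinking learning rate
                radius = max(radius0 * (1.0 - epoch / epochs), 0.5)  # shrinking neighbourhood
                for x in data:
                    # best-matching unit = closest codebook vector
                    bmu = min(range(n_units),
                              key=lambda i: sum((units[i][d] - x[d]) ** 2 for d in range(dim)))
                    for i in range(n_units):
                        h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))   # neighbourhood weight
                        for d in range(dim):
                            units[i][d] += lr * h * (x[d] - units[i][d])
            return units

        random.seed(1)
        # toy "b-like" and "light-like" feature vectors (invented, not real tagging variables)
        data = [(random.gauss(0.8, 0.1), random.gauss(0.7, 0.1)) for _ in range(100)] + \
               [(random.gauss(0.2, 0.1), random.gauss(0.3, 0.1)) for _ in range(100)]
        for unit in train_som(data):
            print([round(v, 2) for v in unit])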
      • 287
        SPHINX: Experimental Evaluation on Open Science Grid
        Grid computing is becoming a popular way of providing high performance computing for many data intensive, scientific applications. The execution of user applications must simultaneously satisfy both job execution constraints and system usage policies. The SPHINX middleware addresses both these issues. In this paper, we present performance results of SPHINX on Open Science Grid. The simulation and execution results show that we can reduce the completion time of workflows.
        Speaker: Sanjay Ranka (University of Florida)
      • 288
        SQLBuilder, a metadata language to SQL translator, an overview of its input language and internal structure
        (For the SAMGrid Team) SQLBuilder's purpose is to translate selection criteria from a high-level form into SQL query statements (a minimal sketch of such a translation follows this entry). The internal design is intended to permit easy changes to the selection criteria available and to permit retargeting of the specific dialect of SQL generated. The initial target language will be Oracle 9i SQL. The input language will be defined in a formal grammar and the internal structure will use compiler-construction technology. The initial queries will return sets of file identifiers. In the high-level format, selection criteria are named, parameterized (possibly with multiple parameters) tests on the metadata, combined by boolean operators; intermediate sets may be combined by set operators. There will also be the capability to include previously named criteria, with a check against infinite inclusion recursion.
        Speaker: Mr Randolph J. Herber (FNAL)
        Paper
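        A hedged sketch, not SAMGrid's SQLBuilder, of translating a small high-level selection-criteria form into an SQL statement returning file identifiers. The criteria encoding (nested tuples) and the table and column names are invented for illustration.

        # Illustrative only: criteria are ("and"/"or", sub, sub...) or ("param", name, op, value).
        def to_sql(criteria):
            kind = criteria[0]
            if kind in ("and", "or"):
                parts = (to_sql(c) for c in criteria[1:])
                return "(" + (" {} ".format(kind.upper())).join(parts) + ")"
            if kind == "param":
                _, name, op, value = criteria
                literal = "'{}'".format(value) if isinstance(value, str) else str(value)
                return "{} {} {}".format(name, op, literal)
            raise ValueError("unknown criteria node: {}".format(kind))

        def build_query(criteria):
            return "SELECT file_id FROM file_metadata WHERE " + to_sql(criteria)

        criteria = ("and",
                    ("param", "run_type", "=", "physics"),
                    ("or",
                     ("param", "energy_gev", ">=", 800),
                     ("param", "stream", "=", "minbias")))
        print(build_query(criteria))
        # SELECT file_id FROM file_metadata WHERE (run_type = 'physics' AND (energy_gev >= 800 OR stream = 'minbias'))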
      • 289
        StoRM, an SRM implementation for LHC analysis farms
        LHC analysis farms - present at sites collaborating with LHC experiments - have been used in the past for analyzing data coming from an experiment’s production center. With time such facilities were provided with high performance storage solutions in order to respond to the demand for big capacity and fast processing capabilities. Today, Storage Area Network solutions are commonly deployed at LHC centers, and parallel file systems such as IBM/GPFS and HP/Lustre allow for reliable, high-speed native POSIX I/O operations. With the advent of Grid technologies, existing LHC analysis facilities have to face the problem of adapting current installations with Grid requirements to allow users to run their applications both locally and from the Grid in order to provide efficient usage of the resources. The Storage Resource Manager (SRM) protocol has been designed to provide a standard uniform interface to storage resources for both disk and tape based storage systems. As of today SRM implementations exist for storage managers such as Castor, d-Cache and LCG DPM. However, such solutions manage the entire storage space allocated to them and force applications to use custom file access protocols such as rfio and d-cap, sometimes penalizing performance and requiring changes in the application. StoRM is a disk-based storage resource manager that implements SRM v.2.1.1. It is designed to work over native parallel filesystems, provides for space reservation capabilities and uses native high performing POSIX I/O calls for file access. StoRM takes advantage of special features provided by the underlying filesystem like ACL support and file system block pre-allocation. In this article, we describe the status of the StoRM project and the features provided by the current release. Permission management functions are based on the Virtual Organization Management System and on the Grid Policy Service. StoRM caters for the interests of the economics and finance sectors since security is an important driving feature. We report on the tests performed on a dedicated test bed to prove basic functionality and scalability of the system together with interoperability with other existing SRM implementations.
        Speakers: Luca Magnoni (INFN - CNAF), Riccardo Zappi (INFN - CNAF)
        Paper
        Poster
      • 290
        Techniques for high-throughput, reliable transfer systems: break-down of PhEDEx design
        Distributed data management at LHC scales is a staggering task, accompanied by equally challenging practical management issues with storage systems and wide-area networks. The CMS data transfer management system, PhEDEx, is designed to handle this task with minimum operator effort, automating the workflows from large-scale distribution of HEP experiment datasets down to reliable and scalable transfers of individual files over frequently unreliable infrastructure. PhEDEx has been designed, and proven, to scale beyond the current CMS needs. Few of the techniques we have used are novel, but they are rarely documented in HEP, and we describe many of the techniques we have used to make the system robust and able to deliver high performance. On schema and data organisation, we describe our use of hierarchical data organisation, the separation of active and inactive data, and tuning of the database for the data and access patterns. Regarding monitoring, we describe our use of optimised queries, moving queries away from hot tables, and using multi-level performance histograms to precalculate partially aggregated results. Robustness applies both to detecting and recovering from local errors and to robustness in the distributed environment. We describe the coding patterns we use for error-resilient and self-healing agents for the former (a minimal retry-loop sketch follows this entry), and the breakdown of handshakes in file transfer, routing files to destinations, and managing site presence for the latter.
        Speaker: Timothy Adam Barrass (University of Bristol)
        Paper
        Poster
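        Illustrative only, not PhEDEx code: the error-resilient agent pattern referred to in the abstract above, reduced to a transfer attempt retried with exponential backoff so that a single failure never kills the agent loop. The transfer function is a stand-in for an unreliable wide-area transfer.

        # Illustrative only: a self-healing agent loop that retries failed transfers.
        import random
        import time

        def transfer(filename):
            """Stand-in for an unreliable wide-area transfer."""
            if random.random() < 0.5:
                raise IOError("transfer of {} failed".format(filename))

        def resilient_agent(queue, max_attempts=4, base_delay=0.1):
            failed = []
            for filename in queue:
                for attempt in range(1, max_attempts + 1):
                    try:
                        transfer(filename)
                        print("done:", filename)
                        break
                    except IOError as err:
                        print("attempt {} failed: {}".format(attempt, err))
                        time.sleep(base_delay * 2 ** (attempt - 1))   # exponential backoff
                else:
                    failed.append(filename)   # give up for now; retried in the next agent cycle
            return failed

        random.seed(7)
        print("left for next cycle:", resilient_agent(["fileA.root", "fileB.root", "fileC.root"]))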
      • 291
        Teraport: A Grid Enabled Platform with Optical Connectivity
        The purpose of the Teraport project is to provide computing and network infrastructure for a university-based, multi-disciplinary, Grid-enabled analysis platform with superior network connectivity to both domestic and international networks. The facility is configured and managed as part of larger Grid infrastructures, with specific focus on integration and interoperability with the TeraGrid and Open Science Grid (OSG) fabrics. The cluster consists of 122 compute nodes with dual 2.2 GHz 64-bit AMD/Opteron processors (244 total), 15 infrastructure nodes, and 11 TB of fiber channel RAID, all connected with Gigabit Ethernet. The software environment, SuSE Linux Enterprise Server with the high performance Global Parallel File System (GPFS), was chosen for cost and proven benchmarks of the Opteron platform with IBM software components. As part of the project, the optical path (dense wavelength division multiplexing equipment and routers) was upgraded, providing 10 Gbps connectivity to Starlight in Chicago.
        Speaker: Robert GARDNER (UNIVERSITY OF CHICAGO)
      • 292
        Testing against DAG RB
        A Directed Acyclic Graph (DAG) can be used to represent a set of programs where the input, output or execution of one or more programs is dependent on one or more other programs. We have developed a basic test suite for DAG jobs. It consists of two main parts. a) Functionality tests using the CLI (in Perl): the generation of DAGs with arbitrary structure and different JDL attributes for the DAG sub-jobs. Tools have been developed that allow the following: creating a DAG with a regular tree-like structure, based on a template that defines the Executable and JDL attributes for the three parts of the DAG (pre-jobs, main part and post-jobs) with a given number of levels and nodes (a simplified generation sketch follows this entry); modifying the created main JDL file by adding new dependencies or deleting existing ones, based on an appropriate template; and adding new JDL attributes for the DAG as a whole, for one level, or for a definite set of nodes, according to another template. b) Stress tests (reliability and stability) using the API (in Java), with the number of levels varying from one to four. The test contains job chains of real CMS tasks, including OSCAR simulation jobs, ORCA reconstruction jobs and special analysis jobs to analyse the final data produced through this system. This test system has been implemented as a standalone kit to check the LCG environment and queue management system. The CMS software framework should be pre-installed to be able to run this test system in the most efficient way.
        Speaker: Elena Slabospitskaya (State Res.Center of Russian Feder. Inst.f.High Energy Phys. (IFVE))
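        A minimal sketch, not the test suite itself, of generating a regular tree-shaped DAG with a given number of levels and a given fan-out, and emitting a simplified JDL-like description. The attribute syntax shown is illustrative only, not necessarily valid gLite JDL.

        # Illustrative only: build a regular tree DAG and print a JDL-like description.
        def make_tree_dag(levels, fanout, executable="test.sh"):
            nodes, deps = {}, []
            def add(level, index, parent):
                name = "node_L{}_{}".format(level, index)
                nodes[name] = {"Executable": executable, "Arguments": name}
                if parent:
                    deps.append((parent, name))
                if level < levels:
                    for child in range(fanout):
                        add(level + 1, index * fanout + child, name)
            add(1, 0, None)
            return nodes, deps

        def to_jdl(nodes, deps):
            lines = ['Type = "dag";', "nodes = ["]
            for name, attrs in nodes.items():
                body = "; ".join('{} = "{}"'.format(k, v) for k, v in attrs.items())
                lines.append("  {} = [ {} ];".format(name, body))
            lines.append("];")
            lines.append("dependencies = { " +
                         ", ".join("{{ {}, {} }}".format(a, b) for a, b in deps) + " };")
            return "\n".join(lines)

        nodes, deps = make_tree_dag(levels=3, fanout=2)
        print(to_jdl(nodes, deps))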
      • 293
        The ALICE distributed computing framework
        The ALICE Computing Team has developed since 2001 a distributed computing environment implementing a Grid paradigm under the name of AliEn. With the evolution of the middleware provided by various large grid projects in Europe and in the US (EGEE, OSG, ARC), a number of services provided by AliEn are now provided and maintained by the corresponding Grid infrastructures. AliEn has therefore evolved from a vertically integrated Grid solution to a set of interfaces to common services offering to the ALICE users a seamless interface to the available Grid services, and a set of high-level services and functions not yet available from standard middleware. This hybrid setup has been thoroughly tested during the so-called data challenges and will be used for the processing and analysis of ALICE data. This talk will describe the present architecture of the AliEn system, the experience derived from its usage during the ALICE Data Challenges, and the plans and perspectives for its evolution.
        Speaker: Predrag Buncic (CERN)
      • 294
        The ATLAS Trigger Muon "Vertical Slice"
        The ATLAS experiment at the LHC proton-proton collider at CERN will be faced with several technological challenges. A three-level trigger and data acquisition system has been designed to reduce the 40 MHz bunch-crossing frequency, corresponding to an interaction rate of 1 GHz at the design instantaneous luminosity, to the ~100 Hz allowed by the permanent storage system. The capability to select events with muons at an early stage of the trigger system is therefore crucial to cope with the expected rates. In this paper we describe the whole trigger and data acquisition chain of the muon system (the Muon Trigger "Vertical Slice"). The first trigger level (LVL1) is implemented in hardware; it selects high-pt muons with transverse momentum above programmable thresholds, with a coarse evaluation of the eta and phi coordinates (the so-called RoI, or "Region of Interest"), using hits coming from the trigger chambers of the Muon Spectrometer (MS); the rate is reduced to ~75-100 kHz. The RoIs are then passed to the second trigger level (LVL2), implemented in an online software architecture. The muFast algorithm reconstructs muons with transverse momentum larger than ~6 GeV, combining full-granularity information inside RoIs from the trigger and precision chambers of the MS. Other algorithms then combine outputs coming from different ATLAS sub-detectors to further select muons with different topologies. The rate is reduced to 1 kHz with a mean processing time of 10 ms. A third trigger level, the Event Filter (EF), will access the full event to reduce the rate further. Different algorithms will be implemented, reconstructing events inside the MS and combining the measurements of all ATLAS sub-detectors in order to provide the best estimate of the muon momentum at the production vertex. Along with the algorithm implementation and description we will also present the expected performance in terms of signal efficiencies, background rejection and execution time.
        Speaker: Dr Antonio Sidoti (INFN Roma1 and University "La Sapienza")
        Paper
      • 295
        The CMS electromagnetic calorimeter reconstruction software: requirements from physics and design aspects
        The design goal of the CMS electromagnetic calorimeter is to reach an excellent energy resolution; several aspects contribute to the fulfillment of this ambitious goal. An enormous quantity of hardware monitoring data will be available, together with a laser monitoring system that will be able to follow quasi-online the change of transparency of the crystals due to radiation damage. This results in a large amount of data that needs to be stored in a transparent way in the online conditions database by online monitoring services; part of it also needs to be replicated in the offline database. Stringent requirements are thus placed on the database system, in particular concerning scalability, fast and flexible access, and replication of data. Another crucial aspect is represented by the in-situ calibration techniques: selected and reduced data need to be transferred to the calibration farm soon after data taking, while, afterwards, the possibility of a fast reprocessing of the data, when new calibrations become available, should be envisaged. All these aspects are taken into account in the current design of both the conditions databases and the object-oriented reconstruction software: the design schema will be described, together with the expected flow of information, from raw to reprocessed data.
        Speaker: Dr Paolo Meridiani (INFN Sezione di Roma 1)
        Poster
      • 296
        The Goodness-of-Fit Statistical Toolkit
        Statistical methods play a significant role throughout the life-cycle of high energy physics experiments. Only a few basic tools for statistical analysis were available in the public-domain FORTRAN libraries for high energy physics, and the situation has hardly changed even among the libraries of the new generation. The project presented here develops an object-oriented software toolkit for statistical data analysis. The Goodness-of-Fit (GoF) Statistical Comparison component of the toolkit provides algorithms for the comparison of data distributions in a variety of use cases typical of physics experiments. The GoF Statistical Toolkit is an easy-to-use, up-to-date and versatile tool for data comparison in physics analysis. It is the first statistical software system providing such a variety of sophisticated and powerful algorithms in high energy physics. The component-based design uses object-oriented techniques together with generic programming. The adoption of AIDA for the user layer decouples the usage of the GoF Toolkit from any concrete analysis system the user may have adopted in his/her analysis. A layer for user input from ROOT objects has recently been added easily, thanks to the component-based architecture. The system contains a variety of two-sample GoF tests, from chi-squared to tests based on the maximum distance between the two empirical distribution functions (Kolmogorov-Smirnov, Kuiper, Goodman), to tests based on the weighted quadratic distance between the two empirical distribution functions (Cramer-von Mises, Anderson-Darling); a minimal two-sample Kolmogorov-Smirnov sketch follows this entry. Thanks to its flexible design, the GoF Statistical Toolkit has recently been extended with other less well known GoF tests (weighted formulations of the Kolmogorov-Smirnov and Cramer-von Mises tests, and the Watson and Tiku tests). The GoF Statistical Toolkit now represents the most complete system available for two-sample GoF hypothesis testing, not only in the domain of physics but also among professional statistics tools. The toolkit is open source and can be downloaded from the web together with user and software-process documentation. It is also distributed together with the LCG Mathematical Libraries. We present the recent improvements and extensions of the GoF Statistical Toolkit; we describe the architecture of the extended system, the new statistical methods implemented, some results of its application, and an outlook towards future developments.
        Speakers: Dr Alberto Ribon (CERN), Dr Andreas Pfeiffer (CERN), Dr Barbara Mascialino (INFN Genova), Dr Maria Grazia Pia (INFN GENOVA), Dr Paolo Viarengo (IST Genova)
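        Not the GoF Statistical Toolkit itself: a small sketch of the two-sample Kolmogorov-Smirnov statistic, the maximum distance between the two empirical distribution functions, which is the simplest of the tests listed in the abstract above. The sample data are invented.

        # Illustrative only: two-sample Kolmogorov-Smirnov statistic from the empirical CDFs.
        import bisect
        import random

        def ks_two_sample(sample1, sample2):
            s1, s2 = sorted(sample1), sorted(sample2)
            n1, n2 = len(s1), len(s2)
            def edf(sorted_sample, n, x):
                return bisect.bisect_right(sorted_sample, x) / n   # empirical CDF at x
            # maximum distance between the two empirical distribution functions
            return max(abs(edf(s1, n1, x) - edf(s2, n2, x)) for x in s1 + s2)

        random.seed(3)
        reference = [random.gauss(0.0, 1.0) for _ in range(500)]
        shifted   = [random.gauss(0.3, 1.0) for _ in range(500)]
        print("D(reference, subset of reference) =", round(ks_two_sample(reference, reference[:250]), 3))
        print("D(reference, shifted sample)      =", round(ks_two_sample(reference, shifted), 3))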
      • 297
        The GridPP metadata group: metadata in a Grid era.
        Continuing the UK's strong involvement with Grid computing, the GridPP2 project (2004-2007) has established a group to investigate the use of metadata within HEP Grid computing. Three posts (based at Glasgow) are dedicated to metadata, but the group includes others working for CERN, various LHC experiments, EGEE and further afield. An important aspect of the group's work is to provide a forum in which all those working on metadata services can be informed of each other's work within this area. To this end, the group has various collaborative resources, monthly meetings and annual workshops. One of the first tasks undertaken by the group was reviewing a disparate collection of general HEP use-case documents, identifying the corresponding requirements on any metadata service and consolidating these as metadata use-cases. This work is available as the paper "Unlucky for some - the thirteen core use cases of HEP metadata". Storing metadata is a common issue amongst projects. Projects have proposed different solutions, based on various relational database schemata. The metadata group reviewed these schemata and a paper summarising the results was published. With any metadata service, one must monitor available servers, detect clients that cause surges in demand and allow remote experts to diagnose problems. The group has produced a requirements document for such monitoring and is helping to implement a metadata monitoring system. In collaboration with LCG and EGEE, the metadata group helped establish a common interface for querying a metadata service. This interface has been adopted by several metadata servers. Work is ongoing in "Gridifying" existing tools, for example by adding an authentication and authorisation framework to the AMI metadata server, integrating work from the VOMS project. We will present a summary of the work achieved to date and detail our plans for the future.
        Speaker: Dr Paul Millar (GridPP)
      • 298
        The initialization and control system of the ATLAS Level-1 Muon Barrel Trigger System
        The ATLAS Level-1 Barrel system is devoted to identifying muons crossing the two outer Resistive Plate Chamber stations of the Barrel spectrometer and passing a set of programmable pT thresholds, to finding their position with a granularity of Delta Eta x Delta Phi = 0.1 x 0.1, and to associating them with a specific bunch-crossing number. The system sends this trigger information to the Central Trigger Processor within a fixed latency of about one microsecond. The system is also responsible for recording Resistive Plate Chamber hits, for monitoring purposes and to provide muon track positions with a spatial resolution of about 1 cm to the higher-level triggers and to refine the precision chambers' data. The system is hardware based, but its high level of programmability and its deployment in the cavern pose many requirements on its control data path. The control system has to be reliable, it has to survive an amount of radiation comparable to space equipment, and it has to connect to about 800 on-detector processor destination boxes spread over a large area. The implemented solution uses a CANbus-based system, with microcontroller-based destination nodes taking care of initializing and controlling local devices accessible via JTAG, I2C and SPI local buses. Sixty-four CAN chains of up to 16 nodes make up the LVL1 Barrel control system. Configuration data are stored locally for fast initialization, running in parallel over all nodes. Linux-based PCs, integrated into the ATLAS TDAQ system, run initialization and control applications reading configuration data from an Oracle database. A detailed description of the hardware and software organization of the system is given, with results from the setups used for chamber commissioning.
        Speaker: Stefano Veneziano (Istituto Nazionale di Fisica Nucleare Sezione di Roma 1)
        Minutes
        Paper
      • 299
        The Midwest U.S. ATLAS Tier2 Facility
        The Midwest U.S. ATLAS Tier2 facility being deployed jointly by the University of Chicago and Indiana University is described in terms of a set of functional capabilities and operational provisions in support of ATLAS managed Monte Carlo production and distributed analysis of datasets by individual physicist-users. We describe a two-site shared systems-administration model as well as the architectural details of the computing, storage, network and grid services infrastructure.
        Speaker: Robert Gardner (University of Chicago)
      • 300
        The Simulation of Polarized Baryons in the LHC ATLAS using EvtGen.
        We will report on a set of studies we have conducted to assess the feasibility of measuring the polarization of lambda_b hyperons in the CERN ATLAS experiment, making the first successful adaptation of the EvtGen generation package for polarized spin-1/2 particles. The simulations were based on the ATLAS version of EvtGen, a product of the ATLAS EvtGen project, reported in another ATLAS abstract at this conference. The lambda_b simulations have permitted us to develop an efficient method to generate polarized hyperons in the ATLAS detector and to carry out a number of precision tests which replicate the real challenges that will be faced in the actual experiment. We will report on the details of the techniques we used to implement EvtGen for our physics application and on the various tests we conducted to validate its performance.
        Speaker: Prof. Homer Alfred Neal (University of Michigan)
        Paper
        Poster
      • 301
        The STAR Trigger Data Pusher
        We describe a new, high-speed trigger network for the STAR detector at RHIC to be used during the upcoming 2006 run and thereafter. The STAR Trigger Data Pusher (STP) replaces the off-the-shelf Myrinet network used in the STAR trigger system during the first five RHIC runs. The STP will lower latencies and increase bandwidth through the trigger system. Custom electronics provide flexibility in implementing flow control, buffering, and debugging. The STP network consists of PCI Mezzanine Cards for each detector, a central data concentrator, a PCI receiver card in the Level-2 processor, and all associated software. The new network connects the `fast detectors' within STAR using optical fibers allowing faster trigger decisions on greater data volume. Event data from each detector flows through a central concentrator to the Level-2 processor, also over optical fiber. The STP network enables new Level-2 triggers that utilize more data from each collision as well as higher event rates. In this paper we describe the design, implementation and use of the new STP network as well as report latencies, bandwidth, and newly available trigger rates.
        Speaker: Mr Chris Perkins (STAR)
      • 302
        The structure of the new ROOT Mathematical Software Libraries
        Aiming to provide and support a coherent set of libraries, the mathematical functionality of the ROOT project has been reorganized following a merge of the ROOT and SEAL activities. Two new libraries, coded in C++, have been released in ROOT version 5: MathCore (basic functionality) and MathMore (functionality for advanced users). We present the structure and design of these new libraries, including a detailed description of their components.
        Speaker: Dr Lorenzo Moneta (CERN)
      • 303
        The ZEUS Grid-Toolkit - an experiment independent layer to access Grid services
        The HERA luminosity upgrade and enhancements of the detector have led to considerably increased demands on computing resources for the ZEUS experiment. In order to meet these higher requirements, the ZEUS computing model has been extended to support computations in the Grid environment. We show how to use Grid services in the production system of a real experiment and point out the main issues which must be addressed in order to use Grid resources routinely and efficiently. We present the ZEUS Grid-toolkit, designed as an additional layer between the Grid and the experiment-specific software (a minimal sketch of such a layer follows this entry). It provides a general interface for job management and data handling, which makes our application software independent of the actual Grid middleware version. Different Grid middleware implementations such as LCG or Grid2003 may be used simultaneously, and smooth migration is possible as new middleware implementations appear (gLite). The job efficiency is significantly improved by introducing fault-tolerant methods. The toolkit uses extensible Perl classes for job management and implements additional features such as dynamic creation of job descriptions, automatic job resubmission and validation of job results. The toolkit has been successfully used in the integrated ZEUS Monte Carlo production system for more than a year.
        Speaker: Mr Krzysztof Wrona (Deutsches Elektronen-Synchrotron (DESY),Germany)
        Paper
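        A hedged Python sketch (the real toolkit is written in Perl) of the abstraction-layer idea in the abstract above: experiment code talks to a single job-management interface, and concrete middleware back-ends are swapped in behind it. The class names and the fake back-end are invented; a real back-end would wrap the middleware client commands.

        # Illustrative only: a middleware-independent job-management interface.
        class JobBackend:
            def submit(self, jdl):
                raise NotImplementedError
            def status(self, job_id):
                raise NotImplementedError

        class FakeLCGBackend(JobBackend):
            """Stand-in back-end; a real one would call the middleware client tools."""
            def __init__(self):
                self._jobs = {}
            def submit(self, jdl):
                job_id = "lcg-{}".format(len(self._jobs) + 1)
                self._jobs[job_id] = "Scheduled"
                return job_id
            def status(self, job_id):
                return self._jobs[job_id]

        class JobManager:
            """Experiment code only ever sees this interface, never the middleware."""
            def __init__(self, backend):
                self.backend = backend
            def run(self, jdl):
                job_id = self.backend.submit(jdl)
                print("submitted", job_id, "->", self.backend.status(job_id))
                return job_id

        JobManager(FakeLCGBackend()).run('Executable = "zeus_mc.sh";')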
      • 304
        An Inter-Regional Grid Enabled Center for High Energy Physics Research and Educational Outreach (CHEPREO) at Florida International University in collaboration with California Institute of Technology, Florida State University and the University of Florida
        Florida International University (FIU), in collaboration with partners at Florida State University (FSU), the University of Florida (UF), and the California Institute of Technology (Caltech), and in cooperation with the National Science Foundation, is creating and operating an interregional Grid-enabled Center for High-Energy Physics Research and Educational Outreach (CHEPREO) at FIU, encompassing an integrated program of research, network infrastructure development, and education and outreach at one of the largest minority schools in the US. CHEPREO will extend FIU's existing research activities at Jefferson National Laboratory to the long-term high-energy physics research program at the Compact Muon Solenoid (CMS) experiment at CERN, create a robust outreach activity based on CMS research, develop an advanced networking and Grid computing infrastructure that will draw in new collaborators from South America, and enhance science and math education in South Florida for underserved minority students through pedagogic enhancements and teacher training led by a Physics Learning Center (PLC).
        Speakers: Heidi Alvarez (Florida International University), Dr Paul Avery (University of Florida)
      • 305
        Using ROOT, Windows, Linux at DØ. To do physics
        DØ is a traditional High Energy Physics collider experiment located at the Tevatron at Fermilab. As in recent past and most future experiments, almost all computing work is done on Linux using standard open-source tools such as the gcc compiler, the make utility, and ROOT. I have been using the Microsoft platform for quite some time to develop physics tools and algorithms. Once developed, code is uploaded to CVS, and production running is done on Linux farms and batch systems. The potential advantages of this approach are the tools available on Windows, primarily the development environment and the debugger. However, translation between the two worlds is not easy. This poster will describe the tools and processes used to accomplish this work, along with some discussion of the state of development on both Linux and Windows.
        Speaker: Gordon Watts (University of Washington)
        Poster
      • 306
        VALIDATION OF GEANT4 BERTINI CASCADE NUCLIDE PRODUCTION USING PARALLEL ROOT FACILITY
        We present an investigation to validate Geant4 [1] Bertini cascade nuclide production by proton- and neutron-induced reactions on various target elements [2]. The production of residual nuclides is calculated in the framework of an intra-nuclear cascade, pre-equilibrium, fission, and evaporation model [3]. A 132 CPU Opteron Linux cluster running the NPACI Rocks Cluster Distribution [4, 5] based on Red Hat Enterprise Linux has been used to compute cross-section results for the Bertini cascade. We have used the new features of the Parallel ROOT Facility (PROOF), distributed with ROOT version 5, to analyse the cross-section data. Automatic class generation for PROOF event data analysis has been used on the Rocks cluster [6, 7]. Performance results for the cluster as measured with the ProofBench package are also presented. (For illustration, a schematic PROOF selector skeleton is sketched at the end of this entry.) [1] Geant4 Collaboration, "GEANT4: A Simulation Toolkit", Nuclear Instruments and Methods in Physics Research, NIM A 506 (2003), 250-303. [2] A. Heikkinen, "Validation of Geant4 Bertini cascade nuclide production", Proceedings of FrontierScience 2005, Milan, Italy, September 12-17, 2005. [3] A. Heikkinen, N. Stepanov, and J.P. Wellisch, "Bertini intra-nuclear cascade implementation in Geant4", arXiv: nucl-th/0306008. [4] P. Papadopoulos, M. Katz, and G. Bruno, "NPACI Rocks: Tools and Techniques for Easily Deploying Manageable Linux Clusters", Concurrency Computat: Pract. Exper. 2002; 00:1-20. [5] A. Heikkinen and T. Linden, "Validation of the GEANT4 Bertini Cascade model and data analysis using the Parallel ROOT Facility on a Linux cluster", Proceedings of Computing in High Energy Physics, CHEP'04, Interlaken, Switzerland, 27.9.-1.10.2004. [6] F. Rademakers, M. Goto, P. Canal, R. Brun, "ROOT Status and Future Developments", arXiv: cs.SE/0306078. [7] M. Balli et al., "Parallel Interactive and Batch HEP-Data Analysis with PROOF", Proceedings of ACAT05, May 22-27, 2005, DESY, Zeuthen, Germany.
        Speaker: Aatos Heikkinen (Helsinki Institute of Physics)
        LaTeX file
        Paper
        picture
        picture
        picture
        picture
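        As a purely illustrative aid (not code from the paper), the following is a minimal sketch of a ROOT 5-era TSelector that could be processed with PROOF on such a cluster; the class name, tree name, branch name and PROOF master are hypothetical.

        // CrossSectionSelector.C -- minimal TSelector sketch (illustrative names only)
        #include "TSelector.h"
        #include "TTree.h"
        #include "TH1F.h"

        class CrossSectionSelector : public TSelector {
        public:
           TTree   *fChain;          // the tree/chain being processed
           TH1F    *fHist;           // example output histogram
           Double_t fCrossSection;   // value read from a (hypothetical) branch

           CrossSectionSelector() : fChain(0), fHist(0), fCrossSection(0) {}

           virtual void   Init(TTree *tree) {
              fChain = tree;
              fChain->SetBranchAddress("crossSection", &fCrossSection);
           }
           virtual void   SlaveBegin(TTree *) {
              fHist = new TH1F("xsec", "nuclide production cross-section", 100, 0., 1000.);
              fOutput->Add(fHist);   // output objects are merged by PROOF
           }
           virtual Bool_t Process(Long64_t entry) {
              fChain->GetEntry(entry);
              fHist->Fill(fCrossSection);
              return kTRUE;
           }
           virtual void   Terminate() { /* runs on the client after merging */ }
           virtual Int_t  Version() const { return 2; }

           ClassDef(CrossSectionSelector, 0);
        };

        // Possible interactive usage (hypothetical file and master names):
        //   TChain ch("nuclides"); ch.Add("results_*.root");
        //   TProof::Open("rocks-master"); ch.SetProof();
        //   ch.Process("CrossSectionSelector.C+");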
      • 307
        VOMS deployment for small national VOs and local groups.
        The development of the grid and the acquisition of large clusters to support the major HEP experiments on the grid have triggered two different requests. The first comes from local physicists in the major VOs, who want privileged access to their own resources; the second is to support smaller groups that will never have access to resources on this scale. Unfortunately, neither category of users has the resources to maintain the VO servers needed to access the grid. The use of a centralised VOMS server, at least at the national level, as part of the grid infrastructure can solve this problem. In the following we describe the deployment of such a VOMS server and the administration of the VOs hosted on it.
        Speaker: Dr Alessandra Forti (University of Manchester)
        Paper
        Poster
      • 308
        Xen and OpenVirtuozzo: two different approaches to server and services virtualization
        Virtualization is a methodology of dividing the resources of a computer into multiple execution environments by applying one or more concepts or technologies such as hardware and software partitioning, time-sharing, partial or complete machine simulation, emulation, and quality of service, among others. These techniques can be used to consolidate the workloads of several under-utilized servers onto fewer machines, to run legacy applications which might simply not run on newer hardware, to provide secure and isolated sandboxes for running untrusted or potentially insecure applications, and to provide powerful debugging environments and test scenarios. Xen is a hypervisor which runs multiple guest operating systems with kernels ported to a special architecture very close to normal x86, with strong isolation between virtual machines and execution performance close to that of the native processor. OpenVirtuozzo (OpenVZ) is an operating-system-level virtualization solution based on Linux. OpenVZ virtual servers behave like regular Linux systems, isolated from each other (file system, processes, IPC), but share a single OS image, ensuring that applications do not conflict. In this paper we describe our experience and test results with these powerful tools, used in our IT infrastructure to provide a wide set of core services, like DNS, mail, network printing and client deployment.
        Speaker: Mr Francesco Maria Taurino (CNR/INFM - INFN - Dip. di Fisica Univ. di Napoli "Federico II")
        Paper
        Poster
      • 309
        XrdSec - A high-level C++ interface for security services in client-server applications
        XrdSec is the security framework developed in the context of the XROOTD project. It provides a high-level abstract security interface for client-server applications. Concrete implementations of the interface can be written for any security protocol as plugin libraries, where all technical details about the protocol are confined. Clients and server administrators can configure the system behaviour using environment variables and/or configuration files. The framework naturally provides server access control and simple client/server negotiation. The result of a successful handshake is a security context object containing the session key and providing an API for encryption/decryption over the open channel. XrdSec is written in C++ and can be easily integrated in any client-server application. In this paper we will describe the underlying architecture, the protocol plugins currently available (password-based, Kerberos, GSI) and a few examples of usage, like a simple client-server application and the integration in ROOT. (A simplified, purely illustrative interface sketch follows this entry.)
        Speaker: Gerardo GANIS (CERN)
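        For illustration only, a much-simplified C++ interface of the kind described above might look as follows; the class and function names are hypothetical and do not reproduce the actual XrdSec API.

        #include <string>
        #include <vector>

        // Security context returned after a successful handshake: it holds the
        // session key internally and exposes encryption/decryption on the channel.
        class SecurityContext {
        public:
           virtual ~SecurityContext() {}
           virtual std::vector<unsigned char> Encrypt(const std::vector<unsigned char> &plain) = 0;
           virtual std::vector<unsigned char> Decrypt(const std::vector<unsigned char> &cipher) = 0;
        };

        // Abstract protocol interface; concrete password, Kerberos or GSI
        // implementations would live in plugin libraries loaded at run time.
        class SecurityProtocol {
        public:
           virtual ~SecurityProtocol() {}
           // Consume the peer's token and produce the next one; returns true
           // when the handshake is complete.
           virtual bool NextHandshakeToken(const std::string &in, std::string &out) = 0;
           // Valid only after a successful handshake.
           virtual SecurityContext *GetContext() = 0;
        };

        // Plugin entry point, resolved with dlopen()/dlsym() by the framework;
        // the configuration string would come from environment variables or files.
        extern "C" SecurityProtocol *CreateSecurityProtocol(const char *configuration);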
      • 310
        xrootd Server Clustering
        Server clustering is an effective method of increasing the pool of resources available to applications. Many clustering mechanisms exist, each with its own strengths as well as weaknesses. This paper describes the mechanism used by xrootd to provide a uniform data access space consisting of an unbounded number of independent distributed servers. We show how the mechanism is especially effective in reducing data request routing latency as well as eliminating most of the cluster definition details, allowing very large clusters to be constructed on the fly with a minimum amount of administration.
        Speaker: Andrew Hanushevsky (Stanford Linear Accelerator Center)
        Poster
    • 10:30
      Tea Break
    • Plenary: Plenary 6 Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India

      Plenary Session

      Convener: Don Petravick (Fermilab)
      • 311
        Distributed Data Management
        Speaker: Dr Peter Elmer (PRINCETON UNIVERSITY)
        Slides
      • 312
        ROOT in the era of multi-core CPUs
        Speaker: Rene Brun (CERN)
        Slides
    • 12:15
      Lunch Break
    • Computing Facilities and Networking: CFN-5 D405

      D405

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 313
        Measuring Quality of Service on Nodes in a Cluster
        It is important, both for users and for load-balancing systems such as LSF, PBS and CONDOR, to know the Quality of Service (QoS) offered by the nodes in a cluster before submitting a job to a given node; this helps achieve optimal utilization of the cluster. Simple metrics like load average or memory utilization do not adequately describe the load on a node or the QoS experienced by user jobs running on it. We have undertaken a project to predict the QoS seen by a user job on a cluster node by correlating simple metrics such as load average, memory utilization and I/O on the node. This paper presents our efforts and the methodology we have followed for predicting the QoS of nodes in a cluster. In brief, user jobs are divided into CPU-intensive, memory-intensive and I/O-intensive classes. We created probe programs to represent each type of job, and load programs to generate different types of load on the system. We used the EDG Fabric Monitoring system to monitor system metrics on the cluster nodes. The execution times of the probe programs and the system metric values were measured under different load conditions, and we correlated the probe execution times with the metric values; this correlation gives a better measure of the QoS experienced by user programs (a sketch of such a correlation calculation follows this entry). Based on this experience we added a metric called 'VmstatR' to the monitoring system. We have derived the QoS metric in three different ways: (i) from the Unix load average, (ii) from the VmstatR metric, and (iii) from the CPU utilization and load on the node. We will discuss the variations between the measured execution times of the probe programs and the execution times predicted by the QoS metric derived in each of these ways. We have also studied the behaviour of CMSIM (simulation) and ORCA (reconstruction) programs under various load conditions and tried to find a correlation metric to predict the QoS for these jobs. Finally, we will present the difficulties experienced in predicting the Quality of Service of nodes in a cluster.
        Speaker: Mr Rohitashva Sharma (BARC)
        Paper
        Slides
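        The correlation step mentioned in the abstract can be illustrated with a small, self-contained example (the numbers are invented, not the authors' data): compute the linear correlation between probe execution times and a monitored metric such as the load average.

        #include <cmath>
        #include <cstdio>
        #include <vector>

        // Pearson correlation coefficient between two equally sized samples.
        double pearson(const std::vector<double> &x, const std::vector<double> &y) {
           const double n = x.size();
           double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
           for (size_t i = 0; i < x.size(); ++i) {
              sx += x[i]; sy += y[i];
              sxx += x[i] * x[i]; syy += y[i] * y[i]; sxy += x[i] * y[i];
           }
           return (sxy - sx * sy / n) /
                  std::sqrt((sxx - sx * sx / n) * (syy - sy * sy / n));
        }

        int main() {
           // Hypothetical measurements: CPU-probe wall-clock time [s] versus the
           // one-minute load average sampled on the same node.
           std::vector<double> probeTime;
           std::vector<double> loadAvg;
           const double t[] = {12.1, 13.0, 18.4, 25.2, 31.7};
           const double l[] = { 0.5,  0.9,  2.1,  3.8,  5.0};
           for (int i = 0; i < 5; ++i) { probeTime.push_back(t[i]); loadAvg.push_back(l[i]); }

           std::printf("correlation(probe time, load average) = %.3f\n",
                       pearson(probeTime, loadAvg));
           return 0;
        }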
      • 314
        Lambda Station: Production Applications Exploiting Advanced Networks in Data Intensive High Energy Physics.
        High Energy Physics collaborations consist of hundreds to thousands of physicists and are world-wide in scope. Experiments and applications now running, or starting soon, need the data movement capabilities currently available only on advanced and/or experimental networks. The Lambda Station project steers selectable traffic through site infrastructure and onto these "high-impact" wide-area networks. Lambda Station also controls ingress and egress filters between the site and the high-impact network and takes responsibility for negotiating with the reservation or provisioning systems that regulate the WAN control plane, be it based on SONET channels, demand tunnels, or dynamic optical links. This article will discuss design principles, the current status of the project, the results achieved to date, and challenges surmounted in building Lambda Station-aware applications over DOE's UltraScience Net, ESnet, and UltraLight between Fermilab, Caltech, and other sites.
        Speaker: Mr Andrey Bobyshev (FERMILAB)
        Paper
        Slides
      • 315
        Managing small files in Mass Storage systems using Virtual Volumes
        Efficient hierarchical storage management of small files continues to be a challenge. Storing such files directly on tape-based tertiary storage leads to extremely low operational efficiencies. Commercial tape virtualization products are few, expensive and only proven in mainframe environments. Asking the users to deal with the problem by "bundling" their files leads to a plethora of solutions with high maintenance costs. Part of the problem is that data processing environments have evolved towards the illusion of an infinite file store with a subdirectory structure, eliminating the concept of a volume, be it physical or logical. Research has been undertaken to deal with these issues at the data center level, but the outcome is quite simple and can be used in general. Results are presented of prototype implementations of a paradigm termed "Virtual Volumes", which combines standard operating system tools such as symbolic links and auto-mounters with techniques to represent a volume as a file, such as the ISO 9660 specification. Virtual Volumes allow a large number of files to be handled as a single item in tertiary tape storage systems, whilst maintaining the infinite file store illusion towards the user by mounting these single items as branches within a file system. Whereas a totally general implementation of Virtual Volumes would require quite complex coding, the prototypes presented are optimized for the Write-Once-Read-Many (WORM) environment often found in scientific data applications. The choice to base the Virtual Volume implementation on standard operating system tools and techniques means that it can be easily combined with already existing or future tools used in HEP experiments and Grid infrastructures. Examples are given of handling HEP data using Virtual Volumes integrated into the existing data frameworks of several experiments. (A schematic pack-and-link sketch follows this entry.)
        Speaker: Prof. Manuel Delfino Reznicek (Port d'Informació Científica)
        Paper
        Slides
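        A schematic (and deliberately simplified) sketch of the pack-and-link idea, under the assumption that mkisofs builds the ISO 9660 image and an automounter later mounts it under a well-known path; all paths and file names are hypothetical and this is not the implementation described in the paper.

        #include <cstdlib>
        #include <string>
        #include <unistd.h>

        int main() {
           const std::string dataDir   = "/data/run01234";            // many small files
           const std::string volume    = "/tape/vv/run01234.iso";     // one tape-resident object
           const std::string mountBase = "/virtualvolumes/run01234";  // automounter mount point

           // 1. Pack the whole directory into a single ISO 9660 image
           //    (Rock Ridge extensions keep POSIX names and attributes).
           const std::string pack = "mkisofs -quiet -R -o " + volume + " " + dataDir;
           if (std::system(pack.c_str()) != 0) return 1;

           // 2. Preserve the "infinite file store" illusion: the original file name
           //    becomes a symbolic link pointing inside the (auto)mounted image.
           if (symlink((mountBase + "/event_000001.dat").c_str(),
                       "/data/run01234_view/event_000001.dat") != 0) return 1;
           return 0;
        }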
      • 316
        Development of the Tier-1 Facility at Fermilab
        CMS is preparing seven remote Tier-1 computing facilities to archive and serve experiment data. These centers represent the bulk of CMS's data serving capacity, a significant resource for reprocessing data, all of the simulation archiving capacity, and operational support for Tier-2 centers and analysis facilities. In this paper we present the progress on deploying the largest remote Tier-1 facility for CMS, located at Fermilab. We will present the development, procurement and operations experiences during the final two years of preparation. We will discuss the development and deployment to support grid interfaces for the Worldwide LHC Computing Grid and the Open Science Grid on the same physical resources. We will outline the hardware selection and procurement and plans for the future to meet the needs of the experiment and the constraints of the physical facility. We will also discuss the successes and challenges associated with enabling a mass storage system to meet the various experimental needs at a significant increase in scale over what is currently achievable. Finally we will discuss the model to support US Tier-2 centers from the Tier-1 facility.
        Speaker: Dr Ian Fisk (FERMILAB)
        Slides
    • Distributed Data Analysis: DDA-5 D406

      D406

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 317
        Distributed CMS Analysis on the Open Science Grid
        The CMS computing model provides reconstruction of and access to recorded data of the CMS detector as well as to Monte Carlo (MC) generated data. Due to the increased complexity, these functionalities will be provided by a tier structure of globally located computing centers using Grid technologies. In the CMS baseline, user access to data is provided by the CMS Remote Analysis Builder (CRAB) analysis tool, which enables the user to execute analysis applications on locally resident data using Grid tools, independent of the geographical location. Currently, two different toolkits provide most of the needed functionality: the Worldwide LHC Computing Grid (LCG) and the Open Science Grid (OSG). Due to infrastructure and service differences between the two toolkits, analysis tools developed for one are frequently not immediately compatible with the other. In this paper, we will describe the development of additions to the CRAB tool to run user analysis on OSG sites. We will discuss the approach of using the Grid submission mode of the CONDOR batch system (CONDOR-G) to provide a sandbox functionality for the user's analysis job; for LCG sites, this is provided, amongst other things, by the resource broker. We will discuss the differences of user analysis on LCG and OSG sites and present first experiences running CMS user jobs at OSG sites.
        Speaker: Oliver Gutsche (FERMILAB)
        Paper
        Slides
      • 318
        DIANA Scheduler
        Results from and progress on the development of a Data Intensive and Network Aware (DIANA) scheduling engine, primarily for data-intensive sciences such as physics analysis, are described. Scientific analysis tasks can involve thousands of computing, data handling, and network resources, and the size of the input and output files and the amount of overall storage space allotted to a user necessarily have a significant bearing on the scheduling of data-intensive applications. If the input or output files must be retrieved from a remote location, then the time required to transfer the files must be taken into consideration when scheduling compute resources for the given application. The central problem in this study is the coordinated management of computation and data at multiple locations, and not simply data movement. However, this can be a very costly operation, and efficient scheduling can be a challenge if compute and data resources are mapped without taking network cost into account. This can result in performance degradation if the advantages of recent advances in networking technologies and bandwidth abundance based on optical backbones are not made available to the scheduling engine. To incorporate these features, we have implemented an adaptive algorithm within the DIANA scheduler which takes into account data location and size, network performance and computation capability to make efficient global scheduling decisions. (A schematic cost model of this kind is sketched after this entry.) DIANA is a performance-aware as well as an economy-guided meta-scheduler. It iteratively allocates each job to the site that is likely to produce the best performance, as well as optimizing the global queue for any remaining pending jobs. It is therefore equally suitable whether a single job is being submitted or bulk scheduling is being performed. Results suggest that considerable performance improvements are to be gained by adopting the DIANA approach, which makes it a very suitable meta-scheduler for physics analysis.
        Speaker: Mr Ashiq Anjum (University of the West of England)
        Paper
        Slides
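        A toy version of such a data- and network-aware ranking (not the DIANA code itself; all numbers, weights and site names are invented for illustration) could combine queue wait, compute time and data transfer time into a single estimated completion time per site:

        #include <cstdio>
        #include <string>
        #include <vector>

        struct Site {
           std::string name;
           double queueWait;    // expected queuing delay [s]
           double cpuPower;     // relative compute capability (higher is faster)
           double bandwidthMB;  // achievable bandwidth to the input data [MB/s]
           double localDataMB;  // part of the input already resident at the site [MB]
        };

        // Estimated completion time: queue wait + scaled CPU time + staging time
        // for whatever part of the input is not already local.
        double estimatedCost(const Site &s, double jobCpuSeconds, double inputMB) {
           const double remoteMB = inputMB > s.localDataMB ? inputMB - s.localDataMB : 0.0;
           return s.queueWait + jobCpuSeconds / s.cpuPower + remoteMB / s.bandwidthMB;
        }

        int main() {
           std::vector<Site> sites;
           Site a = {"siteA", 300, 1.0, 10.0,    0}; sites.push_back(a); // fast network, no local data
           Site b = {"siteB", 900, 2.0,  1.0, 4000}; sites.push_back(b); // most data already local

           const double cpuSeconds = 3600, inputMB = 5000;
           size_t best = 0;
           for (size_t i = 0; i < sites.size(); ++i) {
              const double c = estimatedCost(sites[i], cpuSeconds, inputMB);
              std::printf("%s -> %.0f s\n", sites[i].name.c_str(), c);
              if (c < estimatedCost(sites[best], cpuSeconds, inputMB)) best = i;
           }
           std::printf("schedule the job on %s\n", sites[best].name.c_str());
           return 0;
        }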
      • 319
        Experience with distributed analysis in LHCb
        Physics analysis of large amounts of data by many users requires the usage of Grid resources. It is however important that users see a single environment for developing and testing algorithms locally and for running on large data samples on the Grid. The Ganga job wizard, developed by LHCb and ATLAS, provides physicists with such an integrated environment for job preparation, bookkeeping and archiving, job splitting and merging, and allows job submission to a large variety of back-ends. Ganga can be used from a python Command Line Interface, from python scripts or from a GUI. The LHCb baseline back-end for accessing Grid resources is the DIRAC Workload Management System. The DIRAC Workload Management System implements many advanced methods to minimize the response time for user analysis jobs as well as to maximize the success rate of user tasks. It provides an easy way for clients to securely submit and monitor their jobs and to retrieve the results. We present here the experience of using Ganga for submitting analysis jobs to the Grid through the DIRAC WMS, directly to LCG, and to a local batch system. The performance of the various back-ends is compared.
        Speaker: Dr Ulrik Egede (IMPERIAL COLLEGE LONDON)
        Slides
      • 320
        Panda: Production and Distributed Analysis System for ATLAS
        A new offline processing system for production and analysis, Panda, has been developed for the ATLAS experiment and deployed in OSG. ATLAS will accrue tens of petabytes of data per year, and the Panda design is accordingly optimized for data-intensive processing. Its development followed three years of production experience, the lessons from which drove a markedly different design for the new system. Key design features include: a data-driven workflow with managed placement of datasets (file collections) at processing sites, and job placement that assures jobs are dispatched to sites holding their input data; tight integration with the ATLAS DDM system Don Quijote 2, which provides all data management services; late binding of jobs to worker nodes for dynamic and flexible prioritization and scheduling, and to isolate workloads from latencies and failure modes in acquiring processing resources; a service-oriented architecture with a fast and lightweight communication layer based on REST-style web services; and queue management, data placement, job dispatching, and processor acquisition operating as asynchronous services, with support for multiple instances, for fast throughput and maximum scalability. In this paper we motivate and describe the design and implementation of the system, the current state of its deployment for production and analysis operations, and the work remaining to achieve readiness for ATLAS data taking.
        Speaker: Prof. Kaushik De (UNIVERSITY OF TEXAS AT ARLINGTON)
        Slides
    • Distributed Event production and Processing: DEPP-5 AG 80

      AG 80

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 321
        DIRAC, the LHCb Data Production and Distributed Analysis system
        DIRAC is the LHCb Workload and Data Management system used for Monte Carlo production, data processing and distributed user analysis. It is designed to be light and easy to deploy, which allows different kinds of computing resources, including stand-alone PCs, computing clusters and Grid systems, to be integrated in a single system. DIRAC uses the paradigm of an overlay network of “Pilot Agents”, which makes it very resilient with respect to various sources of instability in the underlying computing resources. DIRAC is used routinely in LHCb for MC data production and has powerful Production Manager tools to easily formulate and automatically steer tasks with complex workflows. The recent extensions for distributed analysis allow LHCb users to run their jobs on LCG reliably while benefiting from the DIRAC job monitoring and data management facilities. In this paper we present an overview of the system, its main components and their interaction with LCG services and resources. The functionality with different types of workload is described. The experience of using DIRAC in the recent Data and Service Challenges will be highlighted, together with the outlook for future development necessary to comply with the production service requirements.
        Speaker: Dr Andrei TSAREGORODTSEV (CNRS-IN2P3-CPPM, MARSEILLE)
        Slides
      • 322
        Geant4 simulation in a distributed computing environment
        The quantitative results of a study concerning Geant4 simulation in a distributed computing environment (local farm and LCG Grid) are presented. The architecture of the system, based on DIANE, is presented; it allows a Geant4 application to be configured transparently for sequential execution (on a single PC) and for parallel execution on a local PC farm or on the Grid. Quantitative results concerning the efficiency of the system, the overheads introduced by the DIANE system, the latency of job execution and the optimisation of the job configuration (number of nodes and tasks) are presented. The quantitative results concern two typical experimental use cases studied in the project: 1) time-consuming simulations requiring "quasi-online" response of the order of a few minutes (e.g. studies required for detector design optimisation), and 2) high-statistics, high-precision, computing-intensive simulations (e.g. simulation productions for physics studies). To our knowledge, this study represents the first quantitative evaluation of Geant4 simulation applications in real-life distributed computing environments, and the first comparison of such applications running on a PC farm and on the Grid.
        Speakers: Mr Jakub Moscicki (CERN), Dr Maria Grazia Pia (INFN GENOVA), Dr Patricia Mendez Lorenzo (CERN), Dr Susanna Guatelli (INFN Genova)
        Slides
      • 323
        GANGA - A GRID User Interface
        Ganga is a lightweight, end-user tool for job submission and monitoring and provides an open framework for multiple applications and submission backends. It is developed in a joint effort by LHCb and ATLAS. The main goal of Ganga is to effectively enable large-scale distributed data analysis for physicists working in the LHC experiments. Ganga offers a simple, pleasant and consistent user experience in a variety of heterogeneous environments: from local clusters to global Grid systems. Ganga helps end-users organize their analysis activities on the Grid by providing automatic persistency of the job's metadata. A user has full access to the jobs submitted in the past, including their configuration and input/output. Automatic status monitoring and output retrieval simplify the usage of the tool. Job splitting allows a very efficient handling of large numbers of similar jobs using different datasets. Job templates provide a convenient mechanism to support repetitive tasks. Ganga is an open development framework and has a clear internal architecture. The Ganga Public Interface (GPI) is a python-based, user-centric API that is a key component of the system. GPI combines the consistency and flexibility of a programming interface with intuitive and concise usage. GPI may be used for writing complex, user-specific scripts or in the interactive python shell. A Qt-based graphical user interface is a GPI overlay which integrates scripting and graphical capabilities into a single environment. GPI may also be embedded as a library in a third-party framework and used as a convenient abstraction layer for job submission and monitoring. Release 4 of Ganga contains optimized handlers for ATLAS/Athena and LHCb/Gaudi applications, which are interfaced to a number of generic execution backends (LSF, LCG, gLite) as well as experiment-specific workload management systems (LHCb's DIRAC and the ATLAS production system). Other applications, such as Geant4 simulation in medical physics or the BLAST protein alignment algorithm in biotechnology, have been successfully run with Ganga. Ganga fully exploits a plug-in architecture that makes the integration of new applications and backends very easy.
        Speaker: Karl Harrison (High Energy Physics Group, Cavendish Laboratory)
        Paper
        Slides
      • 324
        BaBar simulation production - changes in CM2
        For the BaBar computing group: Two years ago BaBar changed from a database event storage technology to the use of ROOT files. This change drastically affected the simulation production within the experiment, as well as the bookkeeping and the distribution of the data. Despite these large changes to production, events were produced as needed and on time for analysis. In fact the changes made the production the most efficient in BaBar's history, and the easiest for production managers. The first production cycle within this new computing model is now complete, and resulted in 2.9B events produced in less than a year, at 22 different computing sites. The changes to the production, bookkeeping, and distribution systems will be discussed. The size and scope of the computing resources needed for the last finished production cycle, and for the current cycle, will also be shown.
        Speaker: Dr Douglas Smith (STANFORD LINEAR ACCELERATOR CENTER)
        Slides
    • Event Processing Applications: EPA-5 AG 76

      AG 76

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 325
        Track reconstruction with the ATLAS Detector
        This talk presents new methods to address the problem of muon track identification in the monitored drift tube chambers (MDT) of the ATLAS Muon Spectrometer. Pattern recognition techniques employed by the current reconstruction software suffer when exposed to the high background rates expected at the LHC. We propose new techniques, exploiting existing knowledge of the detector performance in the presence of background, in order to improve tracking efficiency. The efficiency of the MDT tubes is very high. However, in a high-background environment, there are two possible cases in which a signal might not be registered when a particle has passed through an active tube: the existence of a previous background hit, giving rise to electronic dead time, and insufficient ionization in cases where the track crosses (very close to) the tube wall. Taking this into account, we derive a mathematical expression for the effective muon hit probability (a generic form of such an expression is sketched after this entry). We then model the track identification problem as a two-hypothesis problem, and base the decision on a generalized likelihood ratio test (GLRT). Since the effective muon hit probability is very high, we can choose a higher likelihood threshold, reducing the probability of finding false tracks without reducing the probability of track detection. In order to solve the track detection problem, we employ a novel modification of the Hough transform, with several values in each cell, different for each potential case. These values are then used for calculating the muon track likelihood. Examining data from beam tests with realistic background levels, we show that the use of this technique results in a significant improvement of the muon track detection performance of the MDT.
        Speaker: Mr David Primor (Tel Aviv University, ISRAEL (CERN))
        Paper
        Slides
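        Purely as an illustration of the ingredients listed above (the authors' actual expression and test statistic may differ), one generic way to write an effective hit probability and the likelihood-ratio decision, assuming a Poisson background of rate r per tube, electronic dead time tau_d, and a geometric/ionisation efficiency that depends on the track's distance d from the wire, is:

        % effective probability that a traversing muon yields a usable hit:
        % (no background hit within the preceding dead time) x (ionisation/geometry term)
        P_{\mathrm{hit}}(d) \;=\; e^{-r\,\tau_d}\,\epsilon_{\mathrm{geo}}(d)

        % generalized likelihood ratio test between track (H_1) and background-only (H_0)
        \Lambda(x) \;=\; \frac{\displaystyle\max_{\theta_1} L(x \mid H_1, \theta_1)}
                              {\displaystyle\max_{\theta_0} L(x \mid H_0, \theta_0)}
        \;\gtrless\; \eta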
      • 326
        Adaptive on-the-fly calibration of TPC distortions
        The Solenoidal Tracker At RHIC (STAR) experiment has observed luminosity fluctuations on time scales much shorter than anticipated during its design and construction. These operating conditions lead to rapid, luminosity-dependent variations in the distortions of data from the STAR TPC, and the planned techniques for calibrating these distortions became insufficient to provide high-quality physics data. We present a novel method we developed to perform such calibrations event-by-event, which adapts to the varying conditions on the fly, eliminating the need for a pre-calibration pass. We discuss its strengths, weaknesses and alternatives.
        Speaker: Dr GENE VAN BUREN (BROOKHAVEN NATIONAL LABORATORY)
        Slides
      • 327
        RecPack, a general reconstruction toolkit
        RecPack is a general reconstruction toolkit which can be used as a base for any reconstruction program for a HEP detector. Its main functionalities are track finding, fitting, propagation and matching. Track fitting can be done either via conventional least squares methods or via Kalman filter techniques. The latter, in conjunction with the matching package, allows simultaneous track finding and fitting (the standard Kalman filter recursion is recalled after this entry). The navigation package permits the propagation of the fitted trajectories to any surface within the detector, taking into account effects such as multiple scattering, energy loss and inhomogeneous magnetic fields. In addition, a simple simulation package for debugging reconstruction algorithms is provided. All the algorithms of RecPack are independent of the setup, which makes the toolkit completely general. The geometry package has all the necessary methods to build complicated detectors from simple individual blocks: box, tube, sphere, rectangle, ring, etc. In addition, any new propagation model, measurement type, volume type, etc. can be added to the system very easily. RecPack was born in the HARP experiment at CERN, but it is used at the moment by other experiments: MICE, MuScat, K2K, T2K and LHCb (trigger studies). Several developments are ongoing: generalization of the pattern recognition algorithms, a GUI for step-by-step visualization of the reconstruction process, etc.
        Speaker: Dr Anselmo Cervera Villanueva (University of Geneva)
        Paper
        paper cls file
        paper figure
        paper figure
        paper figure
        paper latex file
        Slides
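        For reference, the standard Kalman filter recursion used in such track fits (textbook notation, not RecPack-specific code), with state vector x, covariance C, propagation matrix F, process noise Q (e.g. multiple scattering), and measurement m with model H and noise V, reads:

        \begin{aligned}
        \text{prediction:} \quad & x_k^{\,k-1} = F_{k-1}\,x_{k-1}, &
                                 & C_k^{\,k-1} = F_{k-1}\,C_{k-1}\,F_{k-1}^{T} + Q_{k-1},\\
        \text{gain:}       \quad & K_k = C_k^{\,k-1} H_k^{T}\,\bigl(V_k + H_k\,C_k^{\,k-1} H_k^{T}\bigr)^{-1}, & &\\
        \text{update:}     \quad & x_k = x_k^{\,k-1} + K_k\,\bigl(m_k - H_k\,x_k^{\,k-1}\bigr), &
                                 & C_k = \bigl(1 - K_k H_k\bigr)\,C_k^{\,k-1}.
        \end{aligned}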
      • 328
        Track based alignment of composite detector structures
        Modern tracking detectors are composed of a large number of modules assembled in a hierarchy of support structures. The sensor modules are assembled in ladders or petals; ladders and petals in turn are assembled in cylindrical or disk-like layers, and layers are assembled to make a complete tracking device. Sophisticated geometrical calibration is essential in this kind of detector system in order to fully exploit the high resolution of the sensors. The position and orientation of individual sensors in the detector have to be calibrated with an accuracy better than the intrinsic resolution, which in the CMS Silicon Tracker ranges from about 10 um to 50 um. Especially if no hardware alignment system is available (which is the case for the CMS Pixel detector), the fine tuning of the sensor positions needs to be carried out with particle tracks. There are about 20000 independent sensors in the CMS tracker, and of the order of 10^5 calibration constants are needed for the alignment. The alignment algorithm needs to be computationally practical, especially if it is to provide almost on-line feedback. We present an effective algorithm to perform fine calibration of individual sensor positions as well as alignment of composite structures consisting of a number of pixel or strip sensors. The alignment correction of a composite structure moves the individual sensors like a rigid body under a rotation and translation of the structure (a schematic example is sketched after this entry). Up to six geometric parameters, three for location and three for orientation, can be computed for each sensor on the basis of particle trajectories traversing the detector system. The performance of the method is demonstrated with both simulated tracks and tracks reconstructed from experimental data taken with a cosmic rack.
        Speaker: Mr Tapio Lampen (HELSINKI INSTITUTE OF PHYSICS)
        Paper
        Slides
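        The rigid-body correction mentioned above can be illustrated with a small, self-contained sketch (all names and numbers are invented): the fitted rotation R and translation t of a support structure are applied to the nominal position of each sensor it carries.

        #include <cstdio>

        // Apply corrected = R * nominal + t to a 3-vector.
        void applyCorrection(const double R[3][3], const double t[3],
                             const double nominal[3], double corrected[3]) {
           for (int i = 0; i < 3; ++i) {
              corrected[i] = t[i];
              for (int j = 0; j < 3; ++j) corrected[i] += R[i][j] * nominal[j];
           }
        }

        int main() {
           const double a = 50e-6;                      // 50 microrad rotation about z
           const double R[3][3] = { { 1.0,  -a, 0.0 },  // small-angle approximation
                                    {   a, 1.0, 0.0 },
                                    { 0.0, 0.0, 1.0 } };
           const double t[3] = { 0.0012, -0.0007, 0.0 };  // translation [cm]
           const double sensor[3] = { 5.0, 2.0, 10.0 };   // nominal sensor position [cm]

           double moved[3];
           applyCorrection(R, t, sensor, moved);
           std::printf("corrected sensor position: (%.4f, %.4f, %.4f) cm\n",
                       moved[0], moved[1], moved[2]);
           return 0;
        }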
      • 329
        Software Solutions for a Variable ATLAS Detector Description
        This talk addresses two issues related to the implementation of a variable software description of the ATLAS detector. The first topic is how we implement an evolving description of an evolving ATLAS detector, including special configurations at varying levels of realism, in a way which plugs into the simulation and reconstruction software. The second topic is how time-dependent alignment information is incorporated into the detector model. Both types of functionality use dedicated databases. The primary source of information for the detector configuration is a relational database using a hierarchical versioning system. The primary source of alignment information is the ATLAS conditions database, which is organized primarily by validity interval and is accessed by various ATLAS applications through the COOL API. The two types of information are merged seamlessly into a single, time-dependent transient detector store.
        Speaker: Vakhtang Tsulaia (UNIVERSITY OF PITTSBURGH)
        Slides
    • Grid Middleware and e-Infrastructure Operation: GMEO-5 Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 330
        A Grid of Grids using Condor-G
        The Condor-G meta-scheduling system has been used to create a single Grid from the GT2 resources of LCG and GridX1 and the ARC resources of NorduGrid. Condor-G provides the submission interfaces to GT2 and ARC gatekeepers, enabling transparent submission via the scheduler. Resource status from the native information systems is converted to the Condor ClassAd format and used for matchmaking against job Requirements and Rank expressions by the Condor Negotiator. The use of custom external functions by the Negotiator during matchmaking provides the versatility to develop job placement strategies. For example, a function exists to use a matrix of CE-to-SE bandwidths, together with data location information, to make 'network closeness' available to both Requirements and Rank expressions. Other examples where such flexibility can be applied are in implementing a feedback loop to dynamically prefer successful or fast resources, and in blocking matches to black-hole resources. The Condor-G Grid of LCG resources has produced 180,000 jobs during recent ATLAS productions, which matches the number produced by the LCG Workload Management System in the same period. GridX1 resources have been used for ATLAS production in this way starting in Autumn 2005. Simple jobs have been matched and run on the full Grid federation, including NorduGrid resources, and work is underway to make use of advanced ARC features allowing Condor-G submission of ATLAS production on all resource flavours.
        Speaker: Dr Rodney Walker (SFU)
        Paper
        Slides
      • 331
        Flexible job submission using Web Services: the gLite WMProxy experience
        Contemporary Grids are characterized by a middleware that provides the necessary virtualization of computation and data resources for the shared working environment of the Grid. In a large-scale view, different middleware technologies and implementations have to coexist. The SOA approach provides the needed architectural backbone for interoperable environments, where different providers can offer their solutions without restricting users to just one specific implementation. The WMProxy (Workload Manager Proxy) is a new service providing access to the gLite Workload Management System (WMS) functionality through a simple Web Services based interface. WMProxy was designed to efficiently handle a large number of requests for job submission and control to the WMS, and the service interface adheres to Web Services and SOA architecture standards, in particular the WS-Interoperability basic profile. In this paper we describe the WMProxy service: from its architecture, independent of the Web Services container used, up to the provided functionality, together with the rationale behind the decisions made during both the design and implementation phases. In particular, we describe how WMProxy is integrated with the gLite Workload Management System; the technologies used, focusing on the Web Services features; the mechanisms adopted to improve performance while keeping high reliability and fault tolerance; the changes in the job submission operation chain with respect to the previous generation of the Workload Management System; and the new operations provided in order to support bulk submission and improve client-server interaction capabilities.
        Speaker: Giuseppe AVELLINO (Datamat S.p.A.)
        Paper
        Slides
      • 332
        An Edge Services Framework (ESF) for EGEE, LCG, and OSG
        We report on first experiences with building and operating an Edge Services Framework (ESF) based on Xen virtual machines instantiated via the Workspace Service available in the Globus Toolkit, developed as a joint project between EGEE, LCG, and OSG. Many computing facilities are architected with their compute and storage clusters behind firewalls. Edge Services are instantiated on a small set of gateways to provide access to these clusters via standard grid interfaces. Experience on EGEE, LCG, and OSG has shown that at least two issues are of critical importance when designing an infrastructure in support of Edge Services. The first concerns Edge Service configuration. It is impractical to assume that each virtual organization (VO) using a facility will employ the same Edge Service configuration, or that different configurations will coexist easily. Even within a VO, it should be possible to run different versions of the same Edge Service simultaneously. The second issue concerns resource usage: since Edge Services may become a bottleneck to a site, it is essential that an ESF be able to effectively arbitrate resource usage (e.g., memory, CPU, and networking) among different VOs. By providing virtualization at the level of the instruction set architecture, virtual machines allow the configuration of independent software stacks for each VM executing on a resource. Modern implementations of this abstraction are extremely efficient and have outstanding fine-grained enforcement capabilities. To securely deploy virtual machines, we use the Workspace Service from the Globus Toolkit, which allows a VO administrator to dynamically launch appropriately configured system images. In addition, we are developing a library of such images, reflecting the needs of the presently participating communities: ATLAS, CMS, and CDF. We will report on first experiences building and operating this Edge Services Framework.
        Speaker: Abhishek Singh RANA (University of California, San Diego, CA, USA)
        Paper
        Slides
      • 333
        Virtualisation: Performance and Use Cases
        One problem in distributed computing is bringing together application developers and resource providers to ensure that applications work well on the resources provided. A layer of abstraction between resources and applications provides new possibilities in designing Grid solutions. This paper compares different virtualisation environments, among which are Xen (developed at the University of Cambridge, UK), User-Mode Linux (a Linux community effort) and MS Virtual Server (a commercial solution). The differences in architecture and features will be presented. The results of our intensive performance measurements, which have been carried out on all of these virtualisation environments, will be discussed. Furthermore, use cases that highlight solutions to typical problems in distributed computing, with particular emphasis on Grid computing, will be presented.
        Speaker: Mr Marcus Hardt (Unknown)
        Paper
        Slides
    • Online Computing: OC-5 B333

      B333

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 334
        Control and monitoring of on-line trigger algorithms using a SCADA system
        LHCb has an integrated Experiment Control System (ECS), based on the commercial SCADA system PVSS. The novelty of this control system is that, in addition to the usual control and monitoring of all experimental equipment, it also provides control and monitoring for software processes, namely the on-line trigger algorithms. The trigger decisions are computed by algorithms on an event filter farm of around 2000 PCs. They are prepared using Gaudi, the LHCb software framework. Gaucho, the GAUdi Component Helping Online, was developed to allow the control and monitoring of Gaudi algorithms. Using Gaucho, algorithms can be monitored from the run control system provided by the ECS. To achieve this, Gaucho implements a hierarchical control system using Finite State Machines. Gaucho consists of three parts: a C++ package integrated with Gaudi, the communications package DIM, and a PVSS back-end providing the user interface. Using the PVSS user interface (the run control), algorithms can be started and stopped, and counters and histograms can be followed in real time (a minimal example of publishing a counter through DIM is sketched after this entry). The results are combined at the level of nodes, subfarms and the full farm, so that it is easy to verify the correct functioning of the trigger. In this article we describe the Gaucho architecture, the experience of monitoring a large number of software processes and some requirements for future extensions.
        Speaker: Dr Eric van Herwijnen (CERN)
        Paper
        Slides
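        As a minimal illustration of the underlying publishing mechanism (assuming the standard DIM C++ server API from dis.hxx; the service and server names are hypothetical, and in Gaucho this is wrapped inside the Gaudi monitoring components):

        #include <unistd.h>
        #include "dis.hxx"

        int main() {
           int eventsProcessed = 0;

           // Publish the counter; any DIM client (e.g. the PVSS back-end) that
           // subscribes to this service name receives every update.
           DimService counterSvc("LHCb/HLT/Node01/MyAlg/EventsProcessed", eventsProcessed);
           DimServer::start("MyAlgMonitor");

           // Stand-in for the Gaudi event loop: update the published value as work proceeds.
           for (;;) {
              ++eventsProcessed;
              counterSvc.updateService();
              sleep(1);
           }
           return 0;
        }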
      • 335
        Strategies and Tools for ATLAS Online Monitoring
        ATLAS is one of the four experiments under construction along the Large Hadron Collider (LHC) ring at CERN. The LHC will produce interactions at a centre-of-mass energy of 14 TeV at a 40 MHz rate. The detector consists of more than 140 million electronic channels. The challenging experimental environment and the extreme detector complexity impose the necessity of a common scalable distributed monitoring framework, which can be tuned for optimal use by the different ATLAS sub-detectors at the various levels of the ATLAS dataflow. This note presents the architecture of this monitoring software framework and describes its current implementation, which has already been used during the ATLAS beam test activity in 2004. Preliminary performance results, obtained on a computer cluster consisting of 700 nodes, will also be presented, showing that the performance of the current implementation is in the range of the final ATLAS requirements.
        Speaker: Dr Wainer Vandelli (Università and INFN Pavia)
        Paper
        Slides
      • 336
        Testing on a large scale: Running the Atlas Data Acquisition and High Level Trigger software on 700 pc nodes
        The ATLAS Data Acquisition (DAQ) and High Level Trigger (HLT) software system will initially comprise 2000 PC nodes which take part in the control, event readout, second level trigger and event filter operations. This large number of PCs will only be purchased close to the start of data taking in 2007. The large CERN IT lxbatch facility provided the opportunity to run online functionality tests in July 2005 over a period of 5 weeks on a stepwise increasing farm size, from 100 up to 700 dual-CPU PC nodes. The interplay between the control and monitoring software and the event readout, event building and trigger software has been exercised for the first time as an integrated system on this large scale. It was also the first time that selection algorithms were run at larger scale in the online environment, both in the trigger and in the event filter processing tasks. A mechanism has been developed to package the offline software together with the DAQ/HLT software and to distribute it efficiently to this large PC cluster via peer-to-peer software. The findings obtained during the tests led to many immediate improvements in the software, and trend analysis allowed critical areas to be identified. Successfully running an online system on a cluster of 700 nodes was found to be especially sensitive to the reliability of the farm as well as of the DAQ/HLT system itself, and future development will concentrate on fault tolerance and stability.
        Speaker: Mrs Doris Burckhart (CERN)
        Paper
        Slides
      • 337
        Physics and Data Quality Monitoring at CMS
        The Physics and Data Quality Monitoring framework (DQM) aims at providing a homogeneous monitoring environment across the various applications related to data taking at the CMS experiment. Initially developed as a monitoring application for the 1000 dual-CPU box (High-Level) Trigger Farm, it quickly expanded its scope to accommodate different groups across the experiment. The DQM organizes the information received from a number of monitoring producers and redirects it to monitoring-consuming clients according to their subscription requests, in the classic publish-subscribe paradigm. Special care has been given to the modularity and stability of the system, with a clear separation of the production of the monitoring information from its distribution and processing. We will describe the features of the DQM system and report on first measurements of its performance on a small subfarm prototype.
        Speaker: Dr Christos Leonidopoulos (CERN)
        Paper
        Slides
        Source files
    • Software Components and Libraries: SCL-5 AG 69

      AG 69

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 338
        New features in ROOT geometry modeller for representing non-ideal geometries
        HEP experiments generally have complex geometries that have to be represented and modelled for several purposes. The most important are simulation and reconstruction, which generally rely on some "ideal" geometry representation that is modelled within the simulation framework. The problem is that the "real" experiment geometry contains perturbations to this "perfectly aligned" model that need to be taken into account. We will present the ongoing efforts within the ALICE and ROOT frameworks to provide support for dealing with misalignment information at the level of the ROOT geometry modeller (a short usage sketch follows this entry).
        Speaker: Rene Brun (CERN)
        Slides
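        A short usage sketch of the physical-node approach available in the ROOT geometry package (the volume path and shift values are invented; the ideal geometry is assumed to be already loaded in gGeoManager):

        #include "TGeoManager.h"
        #include "TGeoPhysicalNode.h"
        #include "TGeoMatrix.h"

        void misalign() {
           // Address one concrete positioned volume by its path in the hierarchy.
           TGeoPhysicalNode *pn =
              gGeoManager->MakePhysicalNode("/TOP_1/BARREL_1/LAYER_2/MODULE_15");

           // Replace its local matrix by the measured (mis-aligned) one;
           // here a small translation stands in for the survey/alignment result.
           pn->Align(new TGeoTranslation(0.012, -0.007, 0.0));   // cm
        }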
      • 339
        Geometry Description Markup Language and its application-specific bindings
        The Geometry Description Markup Language (GDML) is a specialised XML-based language designed as an application-independent persistent format for describing detector geometries. It serves to implement 'geometry trees' which correspond to the hierarchy of volumes a detector geometry can be composed of, to identify the position of individual solids, and to describe the materials they are made of. Being pure XML, GDML can be used universally, and in particular it can be considered as a format for interchanging geometries among different applications. GDML files can be either written by hand (meaning that GDML is used as the primary geometry description) or, in the case of already existing geometry implementations (in Geant4, Root, etc.), generated automatically using dedicated 'GDML writers'. In order to use GDML geometry files in specific applications, 'GDML processors' (based on a SAX parser) have been implemented for Geant4 and Root. In this paper we will present the current status of the development of GDML. After discussing the contents of the latest GDML Schema, which is the basic definition of the format, we will concentrate on the GDML processors. We will present the latest implementation of the GDML 'writers' as well as 'readers' for both Geant4 and Root. Finally, we will also briefly present plans for the future development of GDML.
        Speaker: Dr Witold Pokorski (CERN)
        Paper
        Slides
      • 340
        Using XML for Detector Geometry Description in the Virtual Monte Carlo Framework
        The STAR Collaboration is currently migrating its simulation software, based on Geant3, to the Root-based Virtual Monte Carlo framework. One critical component of the framework is the mechanism of the geometry description, which comprises both the geometry model as used in the application and the external language that allows the users to define and maintain the detector configuration on an ongoing basis, throughout the lifetime of the experiment. Having chosen the Root geometry library as the platform for the geometry model, we have elected to employ a structured and platform-neutral (i.e. not constrained to Root as a target platform) mechanism of geometry description based on XML. To this end, and as a starting point, we chose to follow a path of re-using available components and work, such as the AGDD schema and the GraXML application. Enhanced from its initial concept, we created a parser based on the GraXML application ideas. In this approach, the structured geometry description written in XML, in compliance with the extended AGDD schema, gets converted into C++ code suitable for input into the Root system and the Virtual Monte Carlo framework based on it. In this talk, we will present the features of the enhanced schema, the technology and technique behind the XML to C++ parser and TGeo geometry creator, as well as our experience in utilizing the system. We will illustrate our work with several examples using independent tools for viewing and validating the final geometry.
        Speaker: Dr Maxim POTEKHIN (BROOKHAVEN NATIONAL LABORATORY)
        Paper
        Slides
      • 341
        Surface contours and shapes of Super Heavy Elements (SHE)
        The sheer volume of data obtained in scientific experiments often necessitates a suitable graphical representation for analysis. The surface contour is one such graphical representation, rendering a pictorial view that aids easy data interpretation; it is essentially a two-dimensional visualization of a three-dimensional surface plot. Very recently, it has been shown that Super Heavy Elements can exist in a variety of shapes - spherical, spheroidal and ellipsoidal, with or without shape co-existence. The shapes of such nuclei, as predicted by us by diagonalizing the triaxial Nilsson Hamiltonian in cylindrical representation and using the Strutinsky-BCS corrections, are graphically displayed by surface contours with the Origin software. The obtained results are highly useful in the analysis of the stability of the Super Heavy Elements. Further, they yield a surprising result: the next doubly magic spherical nucleus after lead (Z=82 and N=126) is the SHE with Z=126 and N=184 in the macroscopic-microscopic method itself.
        Speaker: Ms Niranjani S (Department of Information Technology, Mohamed Sathak A.J. College of Engineering, 43, Old Mahabalipuram Road, Sipcot IT Park, Egatur, Chennai - 603 103, India.)
        Paper
        Slides
    • Software Tools and Information Systems: STIS-5 AG 77

      AG 77

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 342
        HyperNews - managing discussions in HEP
        In the increasingly distributed collaborations of today's experiments, there is a need to bring people together and manage all discussions. The main ways for doing this on-line are the use of e-mail or web forums. HyperNews is a discussion management system which bridges these two, by including the use of e-mail for input, but also archiving the discussions in easy to access web pages. The discussions are divided into separate forums, and within these forums the discussions are divided into threads, allowing for rapidly finding a discussion and seeing how the discussion progressed. A search engine has also been added to the system to provide for fast access of appropriate postings based on content. This system has been in constant use for the past nine years in the BaBar experiment, and has archived discussions during this time in a quarter of a million postings. The experience of BaBar has shown that such a system is needed for current experiments -- to increase communication; archive discussions; and provide information. New deployments of HyperNews continue, including strong interest and deployments by LHC experiments. Experiences on the use of the system will be discussed, and information on the use of the system for any experiment will be provided. For more information and to download code for use see: http://hypernews.slac.stanford.edu.
        Speaker: Dr Douglas Smith (STANFORD LINEAR ACCELERATOR CENTER)
        Slides
      • 343
        Parallel computing studies for the alignment of the ATLAS Silicon tracker
        The silicon system of the ATLAS Inner Detector consists of about 6000 modules in its Semiconductor Tracker and Pixel Detector. The offline global-fit alignment algorithm therefore has to solve a problem with up to 36000 degrees of freedom. 32-bit single-CPU platforms were foreseen to be unable to handle the large-scale operations needed by the algorithm. The proposed solution is to utilize a Beowulf cluster with a 64-bit architecture. We have performed initial studies of the performance of such a system using the SCARF cluster at RAL, compared the results with earlier predictions, and obtained the first promising results on parallel computing for ATLAS tracker alignment. After a brief introduction and motivation, we will describe the hardware and software used and present the results of the studies, using also examples from ATLAS simulated data.
        Speaker: Dr Muge Karagoz Unel (University of Oxford)
        Paper
        Slides
      • 344
        Tools for the Study and Performance Optimization of the ATLAS High Level Trigger.
        In this presentation we will discuss the design and functioning of a new tool that runs the ATLAS High Level Trigger software on Event Summary Data (ESD) files, the format for data analysis in the experiment. An example of how to implement a sequence of algorithms based on the electron selection will be shown.
        Speaker: Dr Cibran Santamarina Rios (European Organization for Nuclear Research (CERN))
        Paper
        Slides
      • 345
        The Web Lecture Archive Project
        The size and geographical diversity of the LHC collaborations present new challenges for communication and training. The Web Lecture Archive Project (WLAP), a joint project between the University of Michigan and CERN Academic and Technical Training, has been involved in recording, archiving and disseminating physics lectures and software tutorials for CERN and the ATLAS Collaboration since 1999, when WLAP first recorded the prestigious CERN Summer Student Lectures and made them available as online Web Lectures. Ongoing demand for the recording of software tutorials, high energy physics workshops and general interest talks has driven our team to automate more and more of the recording, archiving, metadata tagging and publishing processes, in order to make possible the large-scale recording and dissemination of lectures with minimal human intervention. We have developed hardware and software solutions to automate the encoding and compression of audio, video and slides; defined a Lecture Object standard to facilitate the archiving and sharing of multimedia presentations in an open fashion; worked on a robotic camera tracking system to remove the need for a camera operator to track speakers; and developed software to harvest text from captured slides, associating the resulting metadata with relevant sections within a lecture and radically improving search capabilities. Our group regularly records ATLAS and University of Michigan events, hosting a significant archive of hundreds of lectures for these communities, while simultaneously benefiting from each recording as a test bed for newly developed technologies. We present an overview of the project, with an emphasis on technologies currently in development.
        Speakers: Mr Jeremy Herr (University of Michigan), Dr Steven Goldfarb (University of Michigan)
        Paper
        Slides
    • 15:30
      Tea Break
    • Computing Facilities and Networking: CFN-6 D405

      D405

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 346
        Benchmarking AMD64 and EMT64
        We report on the ongoing evaluation of new 64-bit processors as they become available to us. We present the results of benchmarking these systems in various operating modes and also measure their power consumption. To measure the performance we use HEP- and CMS-specific applications, including the analysis tool ROOT (C++), the Monte Carlo generator Pythia (FORTRAN), OSCAR (C++), the GEANT4-based CMS detector simulation program, and ORCA (C++), the CMS event reconstruction program. The processors tested include single- and dual-core AMD Opteron AMD64 processors at various clock speeds, Intel Xeon EMT64 processors and AMD Athlon AMD64 processors.
        Speaker: Dr Hans Wenzel (FERMILAB)
        Paper
        Slides
      • 347
        Introduction of a Content Management System in a HEP environment
        Taking the implementation of ZOPE/ZMS at DESY as an example, we will show and discuss various approaches and procedures for introducing a Content Management System in a HEP institute. We will show how requirements were gathered to make decisions regarding software and hardware; how existing systems and management procedures needed to be taken into consideration; how the project was originally staffed and how that changed over time due to unforeseen requirements; how the approach of iterative (rapid) development paid off in terms of acceptance and requirement management; how acceptance was considered right from the beginning through the organization of the project, information and involvement of target groups, open and closed development cycles, and very early use of the system for its intended tasks; how requirements special to the HEP environment were taken into consideration through all steps of the project; and how ongoing business with testing, user support and further development was planned and organized. We will not present many technical details, but a working solution will be shown, together with a short discussion of the underlying server systems and the inclusion of surrounding DESY applications, like the central registry. There will be practical advice on setting up a CMS project and getting quickly to the desired results.
        Speaker: Mr Carsten Germer (DESY IT)
        Slides
      • 348
        Networks for ATLAS Trigger and Data Acquisition
        The ATLAS experiment will rely on Ethernet networks for several purposes. A control network will provide infrastructure services and will also handle the traffic associated with control and monitoring of trigger and data acquisition (TDAQ) applications. Two independent data networks (dedicated TDAQ networks) will be used exclusively for transferring the event data within the High Level Trigger and Data Acquisition system, all the way from detector read-out to mass storage. This article presents a networking architecture solution for the whole ATLAS TDAQ. While the main requirements for the control network are connectivity and resiliency, the data networks need to provide high-bandwidth, high-quality transfers with minimal packet loss and latency. As the networks are large -- O(1000) end-nodes -- we propose to use a multilayer topology, with an aggregation layer (typically at rack level) and a core layer. In order to achieve high resiliency, we propose to distribute the core of each network over multiple devices interconnected via high-speed links, and to deploy a protocol that efficiently uses redundant traffic paths. In addition, geographical aspects (e.g. distances requiring optical fibre instead of copper) are addressed. The proposed network architecture will be mapped onto typical commercial devices. Sample performance evaluation results are presented, meant to validate the data network of the pre-series system (a 10% slice of the final TDAQ system). Traffic patterns similar to the ones created by real applications have been used to determine the network performance under TDAQ-specific conditions.
        Speaker: Dr Stefan Stancu (University of California, Irvine)
        Paper
        Slides
      • 349
        gPLAZMA (grid-aware PLuggable AuthoriZation MAnagement): Introducing RBAC (Role Based Access Control) Security in dCache
        We introduce the gPLAZMA (grid-aware PLuggable AuthoriZation MAnagement) Architecture. Our work is motivated by a need for fine-grained security (Role Based Access Control or RBAC) in Storage Systems, and utilizes the VOMS extended X.509 certificate specification for defining extra attributes (FQANs), based on RFC 3281. Our implementation, the gPLAZMA module for dCache, introduces Storage Authorization Callouts for SRM and GridFTP. It allows different authorization mechanisms to be used simultaneously, fine-tuned with switches and priorities. Of the four mechanisms currently supported, one is an integration with the RBAC services in the OSG Privilege Project; the others are built in as a lightweight suite of services (the gPLAZMAlite Services Suite), including the legacy dcache.kpwd file as well as the popular grid-mapfile, augmented with a gPLAZMAlite-specific RBAC mechanism. Based on our current work, we also outline a future potential towards authorization for storage quotas. This work was undertaken as a collaboration between PPDG Common, the OSG Privilege project, and the SRM-dCache groups at DESY, FNAL and UCSD.
        Speaker: Abhishek Singh RANA (University of California, San Diego, CA, USA)
        Paper
        Slides
    • Distributed Data Analysis: DDA-6 D406

      D406

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 350
        From rootd to Xrootd, from physical to logical files: experience on accessing and managing distributed data.
        With its increasing data samples, the RHIC/STAR experiment has faced a challenging data management dilemma: solutions using cheap disks attached to processing nodes have rapidly become economically beneficial compared with standard centralized storage. At the cost of additional data management, the STAR experiment moved to a multiple-component, locally distributed data model rendered viable by the introduction of a scalable replica catalog and a home-brewed data replication and data management system. Access to the data was then provided via the rootd-based TNetFile API integrated in ROOT. However, the reliability of this system had its flaws, and STAR has moved to an Xrootd-based infrastructure, initially used in the BaBar experiment for its data management. With its 650-node deployment at BNL, STAR hosts the largest use of Xrootd data servers in the world. We will report in this paper our model and configuration, explain the previous and current approaches and the reasons for the migration, and present our experience in deploying and testing Xrootd. Finally, we will introduce our plan toward a full grid solution and the future incarnation of our approach: the merging of two technologies and the best of two worlds – Xrootd and SRM. This will enable the dynamic management of disk storage at the xrootd nodes, as well as provide transparent access to remote storage nodes, including mass storage systems.
        Speaker: Mr Pavel JAKL (Nuclear Physics Inst., Academy of Sciences - Czech Republic)
        Paper
        Slides
      • 351
        Latencies and data access. Boosting the performance of distributed applications.
        The latencies induced by network communication often play a big role in reducing the performance of systems which access large amounts of data in a distributed environment. The problem is present in Local Area Networks, but in Wide Area Networks it is much more evident. It is generally perceived as a critical problem which makes it very difficult to access remote data. However, a more detailed analysis of the access patterns of the involved applications can be used to understand the characteristics of the stream of data requests, and to develop techniques to optimize it. This work started from an analysis of the access patterns of the BaBar experiment's physics analysis data, but the methods and the results can be applied in other computing environments as well. We show how the exploitation of caching and asynchronous prefetching techniques is able to enhance the performance of such applications in Local Area Networks, and to lower the total latencies for Wide Area Network data access by an order of magnitude. Moreover, the ability to process file open requests in parallel can be a very interesting performance enhancement for applications which need access to many files at once. These general techniques have been implemented in the client side of the xrootd data access system, which, for its performance and its fault-tolerant architecture, has shown itself to be an ideal testbed for such enhancements. (A short sketch of the prefetching idea follows this entry.)
        Speaker: Mr Fabrizio Furano (INFN sez. di Padova)
        Paper
        Slides
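        What follows is an illustrative sketch only, not the xrootd client code: latency is hidden by prefetching the next data blocks on a background thread while the application processes the current one. The file path, block size and queue depth are assumptions chosen for the example.

        import threading, queue

        BLOCK = 1024 * 1024          # 1 MiB read-ahead unit
        DEPTH = 4                    # how many blocks to keep in flight

        def prefetcher(path, out_q):
            """Read ahead sequentially and hand blocks to the consumer."""
            with open(path, "rb") as f:
                while True:
                    block = f.read(BLOCK)
                    out_q.put(block)             # blocks once DEPTH is reached
                    if not block:                # empty bytes marks end of file
                        return

        def process(path):
            q = queue.Queue(maxsize=DEPTH)
            threading.Thread(target=prefetcher, args=(path, q), daemon=True).start()
            total = 0
            while True:
                block = q.get()                  # usually already fetched
                if not block:
                    break
                total += len(block)              # stand-in for real processing
            return total

        if __name__ == "__main__":
            print(process("/tmp/sample.dat"), "bytes read")   # placeholder file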
      • 352
        BaBar Bookkeeping - experience and use.
        For the BaBar Computing Group: Two years ago, the BaBar experiment changed its event store from an object oriented database system to one based on ROOT files. A new bookkeeping system was developed to manage the meta-data of these files. This system has been in constant use since that time, and has successfully provided the needed meta-data information for users' analysis jobs, data management, and data distribution. This meta-data is stored in distributed databases, which can be hosted at any BaBar computing site, using either Oracle or MySQL. The system has performed well with the increasing data volume from all production efforts, and the ever-growing number of analysis and production sites that host their own mirrors of the databases. Meta-data driven data export to computing sites throughout our distributed collaboration will also be discussed. Code developed for this system has also been shown to work well for other tasks within BaBar, and common-use tools will be described. The system still performs well after years of use, and is expected to scale for the life of the experiment. The use and experience of this system within BaBar will be discussed, along with recent developments to improve it.
        Speaker: Dr Douglas Smith (STANFORD LINEAR ACCELERATOR CENTER)
        Slides
      • 353
        Performance and Scalability of xrootd
        When the BaBar experiment transitioned to using the ROOT framework, a new data server architecture, xrootd, was developed to address event analysis needs. This architecture was deployed at SLAC two years ago and has since also been deployed at other BaBar Tier 1 sites: IN2P3, INFN, FZK, and RAL; as well as at other non-BaBar sites: CERN (Alice), BNL (Star), and Cornell (CLEO). As part of the deployment, extensive and rigorous performance and scalability measurements were performed. This paper describes those measurements and shows how the results indicate that xrootd is an ideal platform for low-latency, high-performance data access, as well as its future role in memory-based data access architectures.
        Speaker: Andrew Hanushevsky (Stanford Linear Accelerator Center)
        Slides
    • Distributed Event production and Processing: DEPP-6 AG 80

      AG 80

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 354
        Long-term Experience with Grid-based Monte Carlo Mass Production for the ZEUS Experiment
        The detector and collider upgrades for the HERA-II running at DESY have considerably increased the demand on computing resources for Monte Carlo production for the ZEUS experiment. To close the gap, an automated production system capable of using Grid resources has been developed and commissioned. During its first year of operation, 400 000 Grid jobs were submitted by the production system. Using more than 30 Grid sites (LCG and Grid2003), 350 million events were simulated and reconstructed on the Grid. We will present the production setup and its implementation which is based on the ZEUS Grid-toolkit. Our setup includes an elaborate system to monitor the participating sites and every submitted Grid job. This system enables us to identify sources of failures and bottlenecks quickly and take the appropriate actions. We will describe this monitoring system and analyze the efficiency and typical failure modes of the current Grid infrastructure using the collected data. With the attained expertise the Grid can be used efficiently by the presented Monte Carlo production system and is now the major source of Monte Carlo events for physics analyses within the ZEUS collaboration.
        Speaker: Dr Hartmut Stadie (Deutsches Elektronen-Synchrotron (DESY), Germany)
        Slides
      • 355
        LCG 3D project status and production plans
        The LCG Distributed Deployment of Databases (LCG 3D) project is a joint activity between LHC experiments and LCG tier sites to co-ordinate the set-up of database services and facilities for relational data transfers as part of the LCG infrastructure. The project goal is to provide a consistent way of accessing database services at the CERN tier 0 and collaborating LCG tier sites to achieve more scalable and available access to non-event data (e.g. conditions, geometry, bookkeeping and possibly event-level meta data). Further goals include the co-ordination of the requirement discussions between sites and experiments and facilitating the technical exchange between database service providers at online, tier 0 and tier 1 sites. This contribution will discuss the outcome of the first year of requirement discussions and of technology tests with the proposed distribution technologies (Streams and FroNtier). We will also summarize the definition of the production set-up for the first 6 months of operation as part of the LCG service challenges.
        Speaker: Dr Dirk Duellmann (CERN IT/LCG 3D project)
      • 356
        ATLAS Tier-0 Scaling Test
        To validate its computing model, ATLAS, one of the four LHC experiments, conducted in Q4 of 2005 a Tier-0 scaling test. The Tier-0 is responsible for prompt reconstruction of the data coming from the event filter, and for the distribution of this data and the results of prompt reconstruction to the tier-1s. Handling the unprecedented data rates and volumes will pose a huge challenge to the computing infrastructure. In this paper we report on our experiences in an attempt to scale up to nominal operation over a period of two months.
        Speaker: Miguel Branco (CERN)
        Paper
        Slides
      • 357
        Lessons from ATLAS DC2 and Rome Production on Grid3
        We describe experiences and lessons learned from over a year of nearly continuous running of managed production on Grid3 for the ATLAS data challenges. Two major phases of production were performed: first, large-scale GEANT-based Monte Carlo simulations ("DC2"), followed by extensive production for the ATLAS "Rome" physics workshop incorporating several new job types (digitization, reconstruction, pileup and user analysis). We will describe the systems used to run production on such a massive scale, involving over 20 Grid3 sites, which successfully completed over 250k jobs and produced over 50 TB of physics data. The production system, consisting of a supervisor, executor and data management system, will be described. An analysis of the performance of the various systems will be presented. Several critical points of failure were uncovered, including the scalability of Grid services for job submission and reliable file transfer, and gaining access to remote resources efficiently. These lessons have been incorporated into the design principles for the next-generation production system, Panda.
        Speaker: Dr James Shank (Boston University)
        Slides
      • 358
        Distributed Data Management in CMS
        Within 5 years CMS expects to be managing many tens of petabytes of data in tens of sites around the world. This represents more than an order-of-magnitude increase in data volume over existing HEP experiments. This presentation will describe the underlying concepts and architecture of the CMS model for distributed data management, including connections to the new CMS Event Data Model. Technical descriptions of the main data management components for dataset bookkeeping, data location and file access will be presented. In addition we will present our first experience in using the system in preparation for a CMS data challenge in summer 2006.
        Paper
        Slides
      • 359
        The CMS Computing Model
        (For the CMS Collaboration) Since CHEP04 in Interlaken, the CMS experiment has developed a baseline Computing Model and a Technical Design for the computing system it expects to need in the first years of LHC running. Significant attention was focused on the development of a data model with heavy streaming at the level of the RAW data based on trigger physics selections. We expect that this will allow maximum flexibility in the use of distributed computing resources. The CMS distributed Computing Model includes a Tier-0 centre at CERN, a CMS Analysis Facility at CERN, several Tier-1 centres located at large regional computing centres, as well as many Tier-2 centres. The workflows involving these centres have been identified, along with baseline architectures for the data management. This presentation will describe the computing and data model, give an overview of the technical design and describe the current status of the CMS computing system.
        Speaker: Dr Jose Hernandez (CIEMAT)
        Paper
        Slides
    • Event Processing Applications: EPA-6 AG 76

      AG 76

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 360
        Track reconstruction in high density environment
        Track finding and fitting algorithms based on Kalman filtering for the ALICE barrel detectors -- the Time Projection Chamber (TPC), the Inner Tracking System (ITS) and the Transition Radiation Detector (TRD) -- are presented. The filtering algorithm is able to cope with non-Gaussian noise and ambiguous measurements in high-density environments. The approach has been implemented within the ALICE simulation/reconstruction framework (ALIROOT), and the algorithm's efficiency has been estimated using ALIROOT Monte Carlo data. (A minimal Kalman-filter sketch follows this entry.)
        Speaker: Marian Ivanov (CERN)
        Paper
        Slides
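        Below is a minimal, self-contained Kalman-filter sketch, not the ALIROOT implementation: a straight-line track state (position and slope) is propagated layer by layer and updated with noisy one-dimensional hits. Layer positions, noise level and track parameters are invented for illustration.

        import numpy as np

        def kalman_fit(zs, ys, meas_sigma=0.1):
            x = np.array([ys[0], 0.0])                 # state: (y, dy/dz)
            C = np.diag([1.0, 1.0])                    # state covariance
            H = np.array([[1.0, 0.0]])                 # we measure y only
            R = np.array([[meas_sigma ** 2]])
            for i in range(1, len(zs)):
                dz = zs[i] - zs[i - 1]
                F = np.array([[1.0, dz], [0.0, 1.0]])  # propagate to next layer
                x = F @ x
                C = F @ C @ F.T
                # update with the measurement at this layer
                S = H @ C @ H.T + R
                K = C @ H.T @ np.linalg.inv(S)         # Kalman gain
                x = x + (K @ (ys[i] - H @ x)).ravel()
                C = (np.eye(2) - K @ H) @ C
            return x, C

        if __name__ == "__main__":
            zs = np.arange(10.0)
            ys = 0.5 + 0.2 * zs + np.random.normal(0, 0.1, 10)   # noisy hits
            state, cov = kalman_fit(zs, ys)
            print("fitted intercept, slope:", state)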
      • 361
        High Energy Physics Event Selection with Gene Expression Programming
        Evolutionary Algorithms, with Genetic Algorithms (GA) and Genetic Programming (GP) as the most known versions, have a gradually increasing presence in High Energy Physics. They were proven successful in solving problems such as regression, parameter optimisation and event selection. Gene Expression Programming (GEP) is a new evolutionary algorithm that combines the advantages of both GA and GP, while overcoming some of their individual limitations. An analysis of GEP applicability to High Energy Physics event selection will be presented. The description of the technique, the results of its application to specific physics processes and the algorithm performances will be discussed.
        Speaker: Dr Liliana Teodorescu (Brunel University)
      • 362
        Access to Non-Event Data for CMS
        In order to properly understand the data taken for an HEP Event, information external to the Event must be available. Such information includes geometry descriptions, calibration values, magnetic field readings and much more. CMS has chosen a unified approach to accessing such information via a data model based on the concept of an 'Interval of Validity' (IOV). This data model is organized into Records, which hold data that have the same IOV, and an EventSetup, which holds all Records whose IOV overlaps with the Event that is being studied. The model also allows dependencies between Records and guarantees that child Records have IOVs which are intersections of the parent Records' IOVs. The implementation of this model allows the data of a Record to be created either from a persistent store (such as a database) or from an algorithm, where the choice is made by the physicist at job configuration time. The client code that uses the data from a Record is completely unaffected (relinking is not even necessary) by the mechanism used to create the data. (A small sketch of the IOV lookup idea follows this entry.)
        Speaker: Dr Christopher Jones (CORNELL UNIVERSITY)
        Paper
        Slides
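        The following is a hedged sketch of the interval-of-validity idea described above, not the CMS EventSetup code: a record stores payloads valid from a given run onward, and a lookup returns the payload whose interval contains the requested run. Class, field and payload names are invented.

        import bisect

        class Record:
            def __init__(self):
                self._starts = []       # first run of each IOV, kept sorted
                self._payloads = []     # payload valid from that run onward

            def add(self, first_run, payload):
                i = bisect.bisect_left(self._starts, first_run)
                self._starts.insert(i, first_run)
                self._payloads.insert(i, payload)

            def get(self, run):
                i = bisect.bisect_right(self._starts, run) - 1
                if i < 0:
                    raise KeyError("no payload valid for run %d" % run)
                return self._payloads[i]

        # usage: a pedestal record with two validity intervals
        pedestals = Record()
        pedestals.add(1,   {"channel_0": 2.1})
        pedestals.add(100, {"channel_0": 2.4})   # new calibration from run 100
        print(pedestals.get(57))    # -> first payload
        print(pedestals.get(250))   # -> second payload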
      • 363
        The LHCb Alignment framework
        The LHCb alignment framework allows clients of the LHCb detector description software suite (DetDesc) to modify the position of components of the detector at run-time and see the changes propagated to all users of the detector geometry. DetDesc is used in the simulation, digitization and reconstruction phases of data processing and the alignment framework is available in all these stages. The alignment framework provides a very easy way to implement deviations from the LHCb design alignment, allowing tasks such as reconstruction with a detector with off-nominal alignment and misalignment simulation studies. It also allows for tasks that require a dynamic change in the alignment constants of the detectors, like iterative alignment algorithms. Components of the LHCb detector will be re-aligned after each fill, and the changes in the alignment constants will be required for on-line and off-line reconstruction. The alignment framework is deeply coupled to the LHCb Conditions Database, adding automatic time-dependent updating of alignment information to all LHCb software. The requirements, design and implementation of the Alignment framework shall be presented and illustrated with results from performance studies.
        Speaker: Dr JUAN PALACIOS (CERN)
        Slides
      • 364
        Implementation of a global fit method for the alignment of the Silicon Tracker in ATLAS Athena framework
        The ATLAS Inner Detector is composed of a pixel detector (PIX), a silicon strip detector (SCT) and a Transition Radiation Tracker (TRT). The goal of the algorithm is to align the silicon-based detectors (PIX and SCT) using a global fit of the alignment constants. The total number of PIX and SCT silicon modules is about 35000, leading to many challenges. The current presentation will focus on the infrastructure of the processing part of the algorithm, leaving the technical issues related to solving a system with a large number of degrees of freedom to a separate presentation. The main functionalities of the code will be presented and basic analysis of test beam data and simulation will be shown. An alternative method is also briefly discussed.
        Speaker: Adlene Hicheur (Particle Physics)
        Paper
        Slides
      • 365
        COCOA: General purpose software for simulation and reconstruction of optical alignment systems
        We describe a C++ program that is able to reconstruct the positions, angular orientations and internal optical parameters of any optical system described by a seamless combination of many different types of optical objects. The program also handles the propagation of uncertainties, which makes it very useful for simulating the system in the design phase. The software is currently in use by the four optical alignment systems of CMS and is integrated in the CMS framework, so that it can read the geometry description from simple text files or the CMS XML format, and the input and output data from text files or an Oracle database.
        Speaker: Pedro Arce (Cent.de Investigac.Energeticas Medioambientales y Tecnol. (CIEMAT))
        Paper
        Slides
    • Grid Middleware and e-Infrastructure Operation: GMEO-6 Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 366
        Implementing Finer Grained Authorization on the Open Science Grid
        Securely authorizing incoming users with appropriate privileges on distributed grid computing resources is a difficult problem. In this paper we present the work of the Open Science Grid Privilege Project, a collaboration of developers from universities and national labs to develop an authorization infrastructure that provides finer-grained authorization consistently to all grid services on a site or domain. The project supports the use of extended proxy certificates generated with identity, group and role information from the European Data Grid (EDG) Virtual Organization Management System (VOMS). These proxies are parsed at the grid interface and an authorization request is sent to a central Grid User Mapping Service (GUMS). The GUMS service returns the appropriate mapping based on the identity, role or group. This allows the user to propagate information about affiliation and activity in the credentials, and allows the site to make decisions on authorization, privilege, and priority based on this information. The Privilege components have been packaged and deployed on OSG sites. The infrastructure has been used to support sites with multiple computing elements and storage elements. We will present the motivation and architecture for finer-grained authorization as well as the deployment and operations experience. (A toy mapping sketch follows this entry.)
        Speaker: Abhishek Singh Rana (UCSD)
        Slides
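        The sketch below illustrates the kind of attribute-based mapping described above; it is not the GUMS implementation, and the VOs, roles and local account names are made up. The VO, group and role carried in a VOMS FQAN are translated into a local account, so the site decides privileges from the attributes rather than from the bare identity alone.

        MAPPING = {
            ("/cms", "Role=production"): "cmsprod",
            ("/cms", "Role=NULL"):       "cmsuser",
            ("/atlas", "Role=software"): "atlassgm",
        }

        def map_fqan(fqan):
            """FQAN looks like '/cms/Role=production/Capability=NULL'."""
            parts = fqan.split("/")
            vo = "/" + parts[1]
            role = next((p for p in parts if p.startswith("Role=")), "Role=NULL")
            try:
                return MAPPING[(vo, role)]
            except KeyError:
                raise PermissionError("no local mapping for %s" % fqan)

        print(map_fqan("/cms/Role=production/Capability=NULL"))  # -> cmsprod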
      • 367
        GRATIA, a resource accounting system for OSG
        We will describe the architecture and implementation of the new accounting service for the Open Science Grid. Gratia's main goal is to provide the OSG stakeholders with a reliable and accurate set of views of the usage of resources across the OSG. Gratia implements a service-oriented, secure framework for the necessary collectors and sensors. Gratia also provides repositories and access tools for the reporting and analysis of the grid-wide accounting information.
        Speaker: Mr Philippe Canal (FERMILAB)
        Slides
      • 368
        DIRAC Security Infrastructure
        DIRAC is the LHCb Workload and Data Management System and is based on a service-oriented architecture. It enables generic distributed computing with lightweight Agents and Clients for job execution and data transfers. The DIRAC code base is 99% Python, with all remote requests handled using the XML-RPC protocol. DIRAC is used for the submission of production and analysis jobs by the LHCb collaboration. Current experience has shown peaks of over five thousand concurrent jobs. Originally there was no security layer within DIRAC itself. In order to better conform to the requirements of distributed analysis, a DIRAC security infrastructure has been designed for generic XML-RPC transport over an SSL tunnel. This new security layer is able to handle standard X509 certificates as well as grid proxies to authenticate both sides of the connection. Server and client authentication relies on OpenSSL and pyOpenSSL, but to be able to handle grid proxies it was necessary to introduce some modifications to those libraries. The DIRAC security infrastructure handles all authorization internally, hence the programmer only has to define the authorization level required for accessing each method exposed by the server. (A simplified transport sketch follows this entry.)
        Speaker: Mr Adrian Casajus Ramo (Departamento d' Estructura i Constituents de la Materia)
        Paper
        Slides
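        As a simplified analogue of such a transport (not DIRAC itself, which relies on pyOpenSSL to handle grid proxies), the standard-library sketch below exposes an XML-RPC method over TLS and requires a client certificate. The port number and certificate file names are placeholders.

        import ssl
        from xmlrpc.server import SimpleXMLRPCServer

        def ping():
            return "pong"

        if __name__ == "__main__":
            server = SimpleXMLRPCServer(("0.0.0.0", 9135), logRequests=False)
            server.register_function(ping)

            ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
            ctx.load_cert_chain("hostcert.pem", "hostkey.pem")   # placeholder files
            ctx.load_verify_locations("ca_bundle.pem")           # trusted CAs
            ctx.verify_mode = ssl.CERT_REQUIRED                  # authenticate the client too
            server.socket = ctx.wrap_socket(server.socket, server_side=True)

            server.serve_forever()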
      • 369
        The Virtual Organization Management Registration Service
        Currently, grid development projects require end users to be authenticated under the auspices of a "recognized" organization, called a Virtual Organization (VO). A VO establishes resource-usage agreements with grid resource providers. The VO is responsible for authorizing its members and optionally assigning them to groups and roles within the VO. This enables fine-grained authorization at grid sites as end users can be assigned grid computing privileges according to their VO group/role. The Virtual Organization Management Registration Service (VOMRS), developed at Fermilab, provides a comprehensive set of services that facilitates management of VO membership and privileges. It implements a registration workflow that requires email verification of identity, VO usage policy acceptance, membership approval by designated VO representatives/administrators, and allows for management of multiple grid certificates, and the selection of group and role. VOMRS maintains a VO membership status and a certificate level status for each member who is managed by the VO administrators, allowing for VO-level control of a member's privileges and membership. VOMRS provides a subscription service; email notifications are sent when selected changes are made to information about a member's VO membership status and/or when actions are required by members or administrators. VOMRS is capable of interfacing to local systems with personnel information (e.g., the CERN Human Resource Database), and pulling relevant member information from them. Such an interface can eliminate duplicate maintenance and be made to satisfy local security requirements. VOMRS membership data can be configured to synchronize with the VOMS system (developed jointly for DataTAG by INFN and for DataGrid by CERN) with all approved members' certificates and privileges. The current architecture and state of deployment will be discussed.
        Speaker: Mrs Tanya Levshina (FERMILAB)
        Paper
        Slides
      • 370
        Web servers for bulk file transfer and storage
        GridSite has extended the industry-standard Apache webserver for use within Grid projects, both by adding support for Grid security credentials such as GSI and VOMS, and with the GridHTTP protocol for bulk file transfer via HTTP. We describe how GridHTTP combines the security model of X.509/HTTPS with the performance of Apache in local and wide area bulk transfer applications. GridSite also supports file location within storage farms, and we explain how this has been implemented within Apache using the HTCP protocol, and describe the client-side commands and toolkit we have provided for applications. (A hedged client-side sketch follows this entry.)
        Speaker: Dr Andrew McNab (UNIVERSITY OF MANCHESTER)
        Paper
        Slides
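        The snippet below is only a generic illustration of certificate-authenticated bulk transfer over HTTPS, not the GridSite client tools; the host, path and credential file names are placeholders.

        import ssl, urllib.request

        ctx = ssl.create_default_context(cafile="ca_bundle.pem")
        ctx.load_cert_chain("usercert.pem", "userkey.pem")       # client identity

        url = "https://se.example.org/store/run123/file.root"
        with urllib.request.urlopen(url, context=ctx) as resp, \
                open("file.root", "wb") as out:
            while True:
                chunk = resp.read(1 << 20)      # stream in 1 MiB chunks
                if not chunk:
                    break
                out.write(chunk)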
      • 371
        Meta-configuration for dynamic resource brokering: the SUMS approach
        In the distributed computing world of heterogeneity, sites may offer anything from the bare minimum Globus package to a plethora of advanced services. Moreover, sites may have restrictions and limitations which need to be understood by resource brokers and planners in order to take best advantage of resources and computing cycles. Facing this reality, and to take full advantage of any available site as well as of local resources, we will present an approach implemented within the STAR Unified Meta-Scheduler (SUMS) framework. We will explain how the approach allows for self-consistency, that is, allows proper decision making at two sites using the same Meta-Scheduler configuration and software. We will explain how sites declare their configuration to the SUMS scheduler and how SUMS uses this information, combined with policies, to format jobs tuned to the strengths of a particular site. A specific example, the SandBox (a way by which required software is distributed to the computing elements), will be explained, as SUMS uses an abundant and versatile set of methods for pulling or retrieving files at a site. (A small configuration sketch follows this entry.)
        Speaker: Mr Levente HAJDU (BROOKHAVEN NATIONAL LABORATORY)
        Paper
        Slides
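        The following is a hedged sketch of the general idea, not the SUMS code: each site declares its capabilities and limits, and the scheduler formats the same abstract job differently per site. The site names, configuration keys, URLs and commands are invented.

        SITES = {
            "BNL":  {"batch": "condor", "max_wallclock_min": 2880, "sandbox": "shared_fs"},
            "PDSF": {"batch": "sge",    "max_wallclock_min": 720,  "sandbox": "http_pull"},
        }

        def format_job(site, executable, est_minutes):
            cfg = SITES[site]
            if est_minutes > cfg["max_wallclock_min"]:
                raise ValueError("job too long for %s, must be split" % site)
            job = {"site": site, "executable": executable, "batch": cfg["batch"]}
            # choose how the required software reaches the worker node
            if cfg["sandbox"] == "http_pull":
                job["prolog"] = "wget http://software.example.org/sandbox.tgz && tar xzf sandbox.tgz"
            else:
                job["prolog"] = "source /shared/setup.sh"
            return job

        print(format_job("PDSF", "myanalysis", 600))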
    • Online Computing: OC-6 B333

      B333

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 372
        A rule-based control and verification framework for ATLAS Trigger-DAQ
        In order to meet the requirements of ATLAS data taking, the ATLAS Trigger-DAQ system is composed of O(1000) applications running on more than 2000 networked computers. At such a system size, software and hardware failures are quite frequent. To minimize system downtime, the Trigger-DAQ control system shall include advanced verification and diagnostics facilities. The operator should use tests and the expertise of the TDAQ and detector developers in order to diagnose and recover from errors, automatically if possible. The TDAQ control system is built as a distributed tree of controllers, where the behavior of each controller is defined in a rule-based language allowing easy customization. The control system also includes a verification framework which allows users to develop and configure tests for any component in the system, with different levels of complexity. It can be used as a stand-alone test facility for a small detector installation, as part of the general TDAQ initialization procedure, and for diagnosing problems which may occur during run time. The system is currently being used in TDAQ commissioning at the ATLAS pit and by subdetectors for stand-alone verification of the hardware before it is finally installed. The paper describes the architecture and implementation of the TDAQ control system, with emphasis on the new features developed for the verification framework, features requested by users during its use in a real environment. Results from scalability tests performed in 2005 are also presented.
        Speaker: Andrei Kazarov (Petersburg Nuclear Physics Institute (PNPI))
        Paper
        Slides
      • 373
        The Log Service for the ATLAS Experiment
        This paper introduces the Log Service, developed at CERN within the ATLAS TDAQ/DCS framework. This package remedies the long-standing problem of attempting to direct messages to the standard output and/or error on diskless nodes with no terminal. The Log Service provides a centralized mechanism for archiving and retrieving qualified information (Log Messages) created by TDAQ applications (Log Producers). One or multiple Log Servers form the system’s archival container, based on the MySQL database. A C++ interface is provided to access the Log Servers in a transparent manner. Messages can be inserted, retrieved and/or removed. Furthermore, a user-friendly web-based (PHP/HTML) tool is available to easily browse and/or remove Log Messages. The development of these software components is described in this paper. Performance testing has been conducted within a controllable environment with up to ten Log Producers, two Log Servers and two Log Managers. The outcome has been crucial to identify the bottlenecks and constraints of the software and hardware infrastructure. Especially important has been the need to limit the Web Server connections to just one in order not to disturb the Log Message passing mechanism. (A toy archive sketch follows this entry.)
        Speaker: Dr Benedetto Gorini (CERN)
        Paper
        Slides
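        As a toy sketch of the archive/retrieve interface described above (not the ATLAS Log Service itself; Python and sqlite3 stand in for the C++ interface and the MySQL back end), messages can be inserted and queried by severity. The table and field names are invented.

        import sqlite3, time

        db = sqlite3.connect(":memory:")
        db.execute("""CREATE TABLE log_messages (
                          ts REAL, application TEXT, severity TEXT, text TEXT)""")

        def insert(application, severity, text):
            db.execute("INSERT INTO log_messages VALUES (?, ?, ?, ?)",
                       (time.time(), application, severity, text))

        def retrieve(severity=None):
            q = "SELECT ts, application, severity, text FROM log_messages"
            if severity:
                return db.execute(q + " WHERE severity = ?", (severity,)).fetchall()
            return db.execute(q).fetchall()

        insert("HLT_node_042", "ERROR", "cannot open output stream")
        insert("ROS_007", "INFO", "configured")
        print(retrieve("ERROR"))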
      • 374
        Online monitoring, calibration and reconstruction in the PHENIX experiment
        The PHENIX experiment took 2*10^9 CuCu events and more than 7*10^9 pp events during Run5 of RHIC. The total stored raw data volume was close to 500 TB. Since our DAQ bandwidth allowed us to store all events selected by the low level triggers, we did not filter events with an online processor farm, which we refer to as the level 2 trigger. Instead we ran the level 2 triggers offline in the PHENIX counting house on a local Linux cluster to select events for priority reconstruction. These events were transferred to an offsite computing facility for fast reconstruction and analysis, which also provided important fast feedback in terms of achievable physics goals. In addition, a subset of the minimum bias data was reconstructed immediately in the PHENIX counting house for other physics analyses and for estimating the level 2 trigger bias. This approach requires fast availability of the calibrations which are necessary for the reconstruction. These calibrations are performed in parallel to the level 2 filtering effort under a common framework which provides access to events and database connectivity and keeps track of successes and failures. The resulting calibration constants are stored on a run-by-run basis in a PostgreSQL database which is distributed to the offsite computing facilities. We will present the experiences of the PHENIX online computing for Run5 and the future developments and improvements for the upcoming Runs.
        Speaker: Dr Christopher Pinkenburg (BROOKHAVEN NATIONAL LABORATORY)
        Slides
      • 375
        Commissioning Procedures and Software for the CMS Silicon Strip Tracker
        The CMS silicon strip tracker (SST), comprising a sensitive area of over 200 m2 and 10M readout channels, is unprecedented in its size and complexity. The readout system is based on a 128-channel analogue front-end ASIC, optical readout and an off-detector VME board, using FPGA technology, which performs digitization, zero suppression and data formatting before forwarding the detector data to the CMS online computing farm. Commissioning such a large-scale readout system requires sophisticated procedures that can optimally configure and synchronize the entire readout system and provide calibration parameters that are used by both hardware and the CMS reconstruction software. The software implementation for the commissioning procedures is divided between the CMS online and offline frameworks, known as XDAQ and CMSSW, respectively. Data acquisition loops for each of the commissioning tasks have been implemented within the XDAQ framework. These loops configure and control the readout system hardware and local trigger system, and perform event building using tools provided by the CMS TriDAS group. The data analysis modules, which receive the event data stream and calculate optimized hardware configurations and calibration constants, have been implemented within CMSSW. This design, using both the online and offline frameworks, ensures that the commissioning software is sufficiently flexible for online operation using either the local DAQ resources allocated to each sub-detector or the global resources of the online computing farm. The latter option offers significant improvements in detector readout speeds and CPU processing power, thus reducing turn-around times between physics runs. We present an overview of the SST data acquisition system, focusing on the commissioning procedures and their software implementations within the online and offline frameworks. Results and performance studies will also be presented, based on experiences gained when commissioning a complete slice of the SST readout system and during integration activities of the final silicon strip tracker.
        Speaker: Dr Robert Bainbridge (Imperial College London)
        Paper
        Slides
    • Software Components and Libraries: SCL-6 AG 69

      AG 69

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 376
        ROOT 3D graphics
        We present an overview of the common viewer architecture (the TVirtualViewer3D interface and the TBuffer3D shape hierarchy) used by all 3D viewers. This ensures that clients of the viewers are decoupled from the viewers and free of specific drawing code. We detail progress on the new OpenGL viewer, the primary development focus, including architecture (publish 'on demand' model, caching, native shapes, geometric shapes and particles/tracks), features (rendering styles, camera interactions, manipulators, clipping, embedding in a pad) and performance (memory management, culling, render quality).
        Speaker: Rene Brun (CERN)
        Paper
        Slides
      • 377
        Using Qt to create complex interactive HENP applications at STAR
        This talk presents an overview of the main components of a unique set of tools, in use in the STAR experiment, born from the fusion of two advanced technologies: the ROOT framework and libraries, and the Qt GUI and event handling package. Together, they allow creating software packages that help resolve complex data-analysis or visualization problems, enhance computer simulation or help develop geometry models, demonstrating that it is not only feasible but beneficial to integrate and benefit from the best of each technology. The core library and system have been under development for the last 4 years and have solidified and been put to concrete use through several applications for the STAR experiment at BNL as well as ROOT plugins. As a result, STAR has been empowered with stable interactive applications for online control and offline data analysis built on top of the STAR ROOT-based framework, and this while preserving the initial software components in a non-disruptive manner. In fact, many components are not STAR specific and can be used elsewhere, in the context of “user custom filters”, “detector geometry” components or various 3D views of High Energy and Nuclear Physics events. A portion of the project is already included in the official ROOT distribution and is part of the LCG binary distribution of the ROOT package for the RHIC and LHC experiments. In full it is available from the http://root.bnl.gov Web site.
        Speaker: Dr Valeri FINE (BROOKHAVEN NATIONAL LABORATORY)
        Slides
      • 378
        The V-Atlas Event Visualization Program
        We describe an event visualization package in use in ATLAS. The package is based upon Open Inventor and its HEPVis extensions. It is integrated into ATLAS's analysis framework, is modular and open to user extensions, co-displays the real detector description/simulation (GeoModel/GEANT) geometry together with event data, and renders in real time on regular laptop computers, using their available graphics acceleration. The functionality requires no commercial software. It has been used extensively to debug the geometry of the ATLAS detector and is now being applied to commissioning activities.
        Speaker: Vakhtang Tsulaia (UNIVERSITY OF PITTSBURGH)
        Slides
      • 379
        Using Java Analysis Studio as an interface to the Atlas Offline Framework
        Huge requirements on computing resources have made it difficult to run Frameworks of some new HEP experiments on the users' personal workstations. Fortunately, new software technology allows us to give users back at least a bit of the user-friendliness they were used to in the past. A Java Analysis Studio (JAS) plugin has been developed, which accesses the Python API of the Atlas Offline Framework (Athena) over the XML-RPC layer. This plugin gives a user the full power of JAS over the resources otherwise only available within Athena. A user can access any Athena functionality and handle all results directly in JAS. Graphical adapters to some Athena services have been delivered to ease the access even further.
        Speaker: Dr Julius Hrivnac (LAL)
        Paper
        Slides
      • 380
        ALICE Event Visualization Environment
        ALICE Event Visualization Environment (AEVE) is a general framework for visualization of detector geometry and event-related data being developed for the ALICE experiment. Its design is guided by the large raw event size (80 MBytes) and an even larger footprint of a full simulation--reconstruction pass (1.5 TBytes). An extensible pre-processing mechanism, needed to reduce the data volume, collect cumulative statistics, provide cross-indexing information and allow attachment of user data, is presented. Data selection is described with an emphasis on the usage of the advanced n-tuple management functionality of the ROOT framework (tree-friends and tree indices). Data-flow and data-management are discussed in view of application steering in a multi-threaded, input-limited environment. An overview of the data-visualization and data-interaction layer is given and the techniques used to maximize presentation-layer configurability are described. The article closes with a discussion of AEVE as a base for construction of a wide range of end-user applications ranging from expert debugging tools (read-out electronics, simulation and reconstruction code, detector performance monitoring) to general event-display programs.
        Speaker: Matevz Tadel (CERN)
        Paper
        Slides
      • 381
        Gled -- a ROOT based framework for distributed computing and dynamic visualization
        Gled is an OO research framework for fast prototyping of applications in distributed and multi-threaded environments, with support for direct data interaction and dynamic visualization. It is an extension of the ROOT framework and thus inherits its core features, including object serialization, a versatile I/O infrastructure (files with inner directory structures, trees, rootd), CINT -- the C/C++ interpreter -- and a rich set of data analysis tools. Gled addresses the problems of concurrent resource access, remote data synchronization and remote method invocation (an OO version of RPC) by defining a strict object data-model. This paper describes a set of paradigms governing object habitat, object life-cycle, object aggregation and methods for securing thread-safe object access. The hierarchical server-client model which is used to bind several computing nodes into a tree structure is discussed. Problems of distributed object-space management are presented and the procedures used to achieve their synchronization are explained. An overview of Gled support for multi-threaded execution of user tasks is given. Thread-control mechanisms are discussed in the context of distributed execution. The OpenGL-based 3D rendering infrastructure of Gled is briefly described with an emphasis on its capabilities for dynamic visualization and key-frame animation. Gled is discussed as a development framework. It is argued that only a thin wrapper over user code is needed to instrument it with the presented capabilities of Gled.
        Speaker: Matevz Tadel (CERN)
        Paper
        Slides
    • Software Tools and Information Systems: STIS-6 AG 77

      AG 77

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 382
        A Web Lecture Capture System with Robotic Speaker Tracking
        The major challenges preventing the wide-scale generation of web lecture recordings include the compactness and price of the required hardware, the speed of the compression and posting operations, and the need for a human camera operator. We will report on efforts that have led to major progress in addressing each of these issues. We will describe the design, prototyping and pilot deployment of an affordable web lecture capture device that is portable and robust and which accepts input from a speaker’s laptop without interfering with its projection onto a screen, and rapidly archives and posts the synchronized video, audio and slides onto the web. The system incorporates an infrared camera to provide automatic tracking of the speaker and thereby removes the need for a camera-operator. We will report on our laboratory tests of an array of available tracking technologies, the efficacies of each, and the performance of our current system. We will also report on the development of an automatic metadata extraction system so that date, time, keywords and other information can be harvested from each presentation and associated with the recorded lecture, and entered into a database that is optimally configured for global sharing. In addition, we will discuss a proposed global standard for an entity called the "Lecture Object" that would permit recorded lectures to be accessed and replayed by essentially any user for decades to come, independent of changes in commercial playback applications. Work on this project was supported with a grant from the U.S. National Science Foundation.
        Speakers: Mr Jeremy Herr (University of Michigan), Dr Steven Goldfarb (University of Michigan)
        Paper
        Slides
      • 383
        A COMPUTING RESOURCES ADMINISTRATION SYSTEM FOR CERN (CRA)
        CRA is a multi-layered system with a web-based front end providing centralized management and rules enforcement in a complex, distributed computing environment such as CERN's. Much like an orchestra conductor's, CRA's role is essential and multi-functional. Account management, resource usage and consistency controls for every central computing service at CERN, with about 75,000 active accounts, is one important task of the system. Enforcement of the organization's rules and regulations on the usage of computing resources, including strict security requirements, in an environment with an ever-moving population and changing services is another challenge CRA has addressed. In addition, the CRA application leverages its tight integration with the personnel system to provide extra functionality like name reservation and dynamic email lists, allowing better coordination of the resources throughout the Organization. CRA's lowest layer consists of an Oracle database for data storage and low-level integrity controls. A database abstraction layer is provided by a set of Java classes and PL/SQL procedures. The interface for the end users has been implemented using Java to generate dynamic HTML, and is based on the MVC architecture. Here, the Apache Java Struts framework provides most of the controller functionality while CRA actions implement the organization's business rules and logic. Asynchronous messaging is used for communicating with the client systems.
        Speakers: Mr Bartlomiej Pawlowski (CERN), Mr Nick Ziogas (CERN), Mr Wim Van Leersum (Cern)
      • 384
        CERN Equipment management integrates Safety aspects
        Ensuring personnel and equipment safety under all conditions while operating the complex CERN systems is a vital condition for CERN's success. By applying accurate operating and maintenance procedures as well as executing regular safety inspections, CERN has an excellent safety record. Regular safety inspections also permit the traceability of all important events that have occurred in the life of a piece of equipment or an installation. Such traceability is a requirement of the Host States' safety regulations. The CERN Engineering Data Management System (EDMS) is the technical document and equipment management system for the LHC project. With EDMS it is today possible to have access to the design, manufacturing, testing and installation information of approximately 350’000 different LHC parts or LHC subsystems. The EDMS Service has now added features to this system to allow a complete integration of the technical data and the safety information. This paper presents a summary of the architecture of the system, its main functionalities, the benefits it brings, the experience gained and some planned improvements. The paper also highlights the importance of defining clearly from the beginning the roles and responsibilities of each party and of ensuring the required resources and organization.
        Speaker: Mr Stephan Petit (CERN)
        Paper
        Slides
      • 385
        Dissemination of scientific results in High Energy Physics: the CERN Document Server vision.
        The traditional dissemination channels of research results, via article publishing in scientific journals, are facing a profound metamorphosis driven by the advent of the internet and broader access to electronic resources. This change is naturally leading away from the traditional publishing paradigm towards an archive-based approach in which institutional libraries organize, manage and disseminate the research output. Within this context, CERN has been committed since its early beginnings to the open dissemination of scientific results. The dissemination started with the free paper distribution of preprints by the CERN Library and continued electronically via FTP bulletin boards and the World Wide Web to the current OAI-compliant institutional repository, the CERN Document Server (CDS). By enforcing interoperability with peer repositories, like arXiv and KEK, CDS manages over 500 collections of data, consisting of over 800,000 bibliographic records in the field of particle physics and related areas, covering preprints, articles, books, journals, photographs and more. In this paper we discuss how the CERN Document Server is becoming a solid base for the collection and propagation of research results in high energy physics by implementing a range of innovative library management services. In particular, we focus on metadata extraction to create information-rich library objects and on groupware and collaborative features that allow users to comment on and review records in the repository. Moreover, we explain how the existing document ranking techniques, based on usage and citation statistics, may provide original insights on the impact of selected scholarly output.
        Speaker: Alberto Pepe (CERN)
        Paper
        Slides
    • Plenary: Plenary 7 Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India

      Plenary Session

      Convener: Manuel Delfino (PIC)
      • 386
        Grid Activities in Japan Auditorium

        Auditorium

        Tata Institute of Fundamental Research

        Homi Bhabha Road Mumbai 400005 India
        Speaker: Dr Ken Miura (National Institute of Informatics, Japan)
        Slides
      • 387
        Grid Activities in China Auditorium

        Auditorium

        Tata Institute of Fundamental Research

        Homi Bhabha Road Mumbai 400005 India
        Speaker: Dr Gang Chen (IHEP, Beijing)
        Slides
      • 388
        Grid Activities in Taiwan Auditorium

        Auditorium

        Tata Institute of Fundamental Research

        Homi Bhabha Road Mumbai 400005 India
        Speaker: Dr Simon Lin
        Slides
    • Poster: Poster 2
    • 10:30
      Tea Break
    • Plenary: Plenary 8 Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India

      Plenary Session

      Convener: Guy Wormser (LAL, Orsay)
      • 389
        Grid computing in Medical applications Auditorium

        Auditorium

        Tata Institute of Fundamental Research

        Homi Bhabha Road Mumbai 400005 India
        Speaker: Dr Piergiorgio Cerello (INFN - TORINO)
        Paper
        Slides
      • 390
        Computing challenges for the Square Kilometer Array Auditorium

        Auditorium

        Tata Institute of Fundamental Research

        Homi Bhabha Road Mumbai 400005 India
        Speaker: Mathai Joseph (Tata Research Development and Design Centre)
        Slides
      • 391
        Computing challenges in Lattice gauge QCD Auditorium

        Auditorium

        Tata Institute of Fundamental Research

        Homi Bhabha Road Mumbai 400005 India
        Speaker: Dr Rajiv Gavai (TIFR)
        Slides
    • 12:30
      Lunch Break
    • Computing Facilities and Networking: CFN-7 D405

      D405

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 392
        VINCI : Virtual Intelligent Networks for Computing Infrastructures
        To satisfy the demands of data-intensive grid applications it is necessary to move to far more synergetic relationships between applications and networks. The main objective of the VINCI project is to enable data-intensive applications to efficiently use and coordinate shared, hybrid network resources, to improve the performance and throughput of global-scale grid systems, such as those used in high energy physics and many other fields of science. VINCI uses a set of agent-based services implemented in the MonALISA framework to enable the efficient use of network resources, coordinated with computing and storage resources. VINCI is an integrated network service system that provides client authentication and authorization, discovery of services and of the topology of connections, workflow scheduling, global optimization and monitoring. Our strategy for integrating applications with geographically distributed complex services, where new services may join and existing ones leave, is the use of a multi-agent system. Agents will act on behalf of applications, describing their environment and requirements, locating services, agreeing on information, receiving feedback from services and presenting results. Agents will be required to engage in interactions, to negotiate, and to make pro-active run-time decisions while responding to changes in the environment. In particular, agents will need to self-organize and dynamically collaborate for effective decisions. The distributed agent system can create, on demand, end-to-end optical connections in less than one second, independent of the location and the number of optical switches involved. It monitors and supervises all the created connections and is able to automatically generate an alternative path in case of connectivity errors. The alternative path is set up rapidly enough to avoid a TCP timeout, and thus to allow the transfer to continue uninterrupted. Dedicated agents are used to monitor the client systems and to detect their hardware and software configuration. They can perform end-to-end performance measurements and, if necessary, configure the systems. We are developing agents able to interact with GMPLS controllers and to integrate this functionality into the network services provided by the VINCI framework. (A small re-routing sketch follows this entry.)
        Speaker: Iosif Legrand (CALTECH)
        Slides
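        The following minimal Python sketch illustrates the kind of path-supervision and fast-failover logic described in the abstract above. It is not VINCI or MonALISA code; the class, the probe/provision callables and the timing budget are assumptions made purely for illustration.

          import time

          # Hypothetical path-supervision agent; names and timings are assumptions,
          # not part of the VINCI/MonALISA implementation.
          TCP_TIMEOUT_S = 15.0   # assumed budget: recover well before a TCP timeout

          class PathAgent:
              """Supervises one end-to-end connection and fails over to a backup path."""

              def __init__(self, primary, alternatives, probe, provision):
                  self.active = primary            # currently provisioned path (list of switches)
                  self.alternatives = list(alternatives)
                  self.probe = probe               # callable(path) -> bool, connectivity check
                  self.provision = provision       # callable(path) -> None, sets up cross-connects

              def supervise_once(self):
                  """Probe the active path; on error, provision the first healthy alternative."""
                  if self.probe(self.active):
                      return self.active
                  for candidate in self.alternatives:
                      if self.probe(candidate):
                          start = time.monotonic()
                          self.provision(candidate)    # must complete well within TCP_TIMEOUT_S
                          if time.monotonic() - start >= TCP_TIMEOUT_S:
                              raise RuntimeError("failover too slow to save the transfer")
                          self.active = candidate
                          return candidate
                  raise RuntimeError("no healthy alternative path available")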
      • 393
        Embedding Quattor into the Fabric Management Infrastructure at DESY
        DESY is one of the world's leading centers for research with particle accelerators and synchrotron light. The computer center manages a data volume of the order of 1 PB and houses around 1000 CPUs. During DESY's engagement as a Tier-2 center for the LHC experiments these numbers will at least double. In view of these increasing activities an improved fabric management infrastructure is being established. In this context Quattor is used for automatic installation and configuration management. The DESY grid infrastructure was used for a pilot project, while current efforts integrate dCache systems, workgroup servers, and desktops. A variety of developments have arisen in this process: a standard for configuration template management, which allows a larger number of administrators to work productively on the fabric configuration; a management tool for configuration templates, which helps keep track of the various sources of templates; and an integration tool for collecting information from existing management systems at DESY. We will present these achievements together with the Quattor deployment experiences that motivated them.
        Speaker: Dr Mathias de Riese (DESY)
        Slides
      • 394
        The DESY-Registry – account management for many backend systems
        DESY operates several thousand computers, based on different operating systems. On servers and workstations, not only the operating systems but also many centrally supported software systems are in use. Most of these operating and software systems come with their own user and account management tools. Typically they do not know of each other, which makes life harder for users, who have to remember different passwords for different systems, and which works against the efforts for effective user administration and support. The DESY-Registry is a 3-tier, web-based application which centralizes the user and account management for about 30 centrally supported systems. Accounts and access to resources such as operating systems (UNIX, Windows) and prominent software systems (Oracle, RAS/VPN), as well as “virtual” systems like computing clusters, are managed. To enable de-centralized administration in a central system, the DESY-Registry offers a role-based delegation mechanism, where administrators are able to manage the accounts of “their” users and the central support group is able to manage every account. A workflow mechanism facilitates delegation, and automation takes care of account expiry and the enforcement of regular password policies. The DESY-Registry has been in production since January 2005. We present details of the project objectives and the solution, as well as the experience of one year of operation. (A schematic sketch of the delegation model follows this entry.)
        Speaker: Mr Dirk Jahnke-Zumbusch (DESY)
        Paper
        Slides
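        As a rough illustration of the role-based delegation model sketched in the abstract above, the snippet below models accounts, delegated administrators and automated expiry. The data model and names are assumptions, not the DESY-Registry schema.

          from dataclasses import dataclass, field
          from datetime import date

          @dataclass
          class Account:
              user: str
              system: str                 # e.g. "UNIX", "Windows", "Oracle"
              group: str                  # organisational unit of the user
              expires: date = date.max

          @dataclass
          class Administrator:
              name: str
              groups: set = field(default_factory=set)   # groups this admin may manage
              central: bool = False                      # central support manages everything

              def may_manage(self, account: Account) -> bool:
                  """Role-based delegation: central admins manage all, others only 'their' groups."""
                  return self.central or account.group in self.groups

          def expired_accounts(accounts, today):
              """Automation hook: return accounts whose expiry date has passed."""
              return [a for a in accounts if a.expires < today]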
    • Distributed Event production and Processing: DEPP-7 AG 80

      AG 80

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 395
        PhEDEx high-throughput data transfer management system
        Distributed data management at LHC scales is a staggering task, accompanied by equally challenging practical management issues with storage systems and wide-area networks. The CMS data transfer management system, PhEDEx, is designed to handle this task with minimum operator effort, automating the workflows from large-scale distribution of HEP experiment datasets down to reliable and scalable transfers of individual files over frequently unreliable infrastructure. Over the last year PhEDEx has matured to the point of handling virtually all CMS production data transfers. CMS pushes its own components to perform and invests heavily in peer projects at all levels, from technical details to grid standards to world-wide projects, to ensure the end-to-end service is of sufficient quality. We present the throughput and service quality we have reached in the current daily 24/7 production work, the steps taken in the LCG service challenges towards the next-generation transfer service, and the resulting changes in performance. We also report results from our scalability stress tests on PhEDEx alone. We offer an analysis of the transfer-related problems we have encountered and how they have been affecting CMS data management. (A generic sketch of retry-based file transfers follows this entry.)
        Speaker: Jens Rehn (CERN)
        Paper
        Slides
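        The sketch below shows the generic idea of reliable per-file transfers over unreliable infrastructure, using retries with exponential backoff. It is not PhEDEx code; the transfer_file callable and the retry parameters are assumptions for illustration.

          import random
          import time

          def transfer_with_retries(transfer_file, src, dst, max_attempts=5, base_delay=30.0):
              """Attempt a single file transfer, backing off exponentially on failure."""
              for attempt in range(1, max_attempts + 1):
                  try:
                      transfer_file(src, dst)      # assumed callable that raises on failure
                      return attempt               # number of attempts actually used
                  except Exception:
                      if attempt == max_attempts:
                          raise
                      # exponential backoff with jitter so failing files do not retry in lockstep
                      delay = base_delay * 2 ** (attempt - 1) * random.uniform(0.5, 1.5)
                      time.sleep(delay)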
      • 396
        Belle Monte Carlo Production on the Australian National Grid
        In 2004 the Belle Experimental Collaboration reached a critical stage in its computing requirements. Due to an increased rate of data collection, an extremely large amount of simulated (Monte Carlo) data was required to correctly analyse and understand the experimental data. The resulting simulation effort consumed more CPU power than was readily available to the experiment at the host institution, KEK, Japan. In order to meet requirements, the simulated data production was distributed to remote institutions that were able to contribute CPU power. The Australian Belle collaborators participated in this production, successfully utilising resources at a number of Australian facilities, including APAC (Australian Partnership for Advanced Computing), AC3 (Australian Centre for Advanced Computing and Communication), MARCC (Melbourne Advanced Research Computing Centre), and VPAC (Victorian Partnership for Advanced Computing). This production involved the use of a globally accessible data catalogue and resource management system, SRB (Storage Resource Broker), and tools developed in-house for the central dispatch, monitoring and management of jobs. The production was successfully deployed on the Australian APAC National Grid (APAC NG) infrastructure and currently utilises the LHC Computing Grid middleware layer.
        Speaker: Marco La Rosa (University of Melbourne)
      • 397
        CMS experience in LCG SC3
        The most significant data challenge for CMS in 2005 has been the LCG Service Challenge 3 (SC3). For CMS the main purpose of the challenge was to exercise a realistic LHC startup scenario using the complete experiment system for transferring and serving data, submitting jobs and collecting their output, employing the next-generation world-wide LHC computing service. A number of significant new components were put to the test by LCG. Compared to past data challenges a number of important parameters were changed for CMS, and a number of improvements to software and systems were tested. We describe our benchmark goals and how the tests were performed. We report on the results achieved, their consequences, and the conclusions drawn.
        Speaker: Lassi Tuura (NORTHEASTERN UNIVERSITY, BOSTON, MA, USA)
        Slides
    • Event Processing Applications: EPA-7 AG 76

      AG 76

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 398
        OHP: An Online Histogram Presenter for the ATLAS experiment
        ATLAS is one of the four experiments under construction along the Large Hadron Collider ring at CERN. During the last few years much effort has gone into test beam sessions that allowed the performance of the ATLAS sub-detectors to be assessed. During the data taking we started the development of a histogram display application designed to satisfy the needs of all ATLAS sub-detector groups. The requirements that have driven the design of this application are: the ability to display histograms produced by many different producers, to be usable both as a configurable presenter and as a browser, to manage user actions on histograms, to allow comparison with reference histograms, a high degree of integration with the ATLAS DAQ software, and minimization of the network traffic between presenter and producers. A first prototype of this application was implemented and extensively used at the 2004 ATLAS Combined Test Beam and, building on this experience, an upgraded application has been developed to be used in the ATLAS commissioning in 2007. The presentation will describe the program architecture and its interactions with the ATLAS DAQ software, including the first results obtained from the performance tests.
        Speaker: Dr Andrea Dotti (Università and INFN Pisa)
        Paper
        Slides
      • 399
        Software for the CMS Cosmic Challenge
        At the end of 2004 CMS decided to redesign the software framework used for simulation and reconstruction. The new design includes a completely revisited event data model. This new software will be used in the first months of 2006 for the so-called Magnet Test Cosmic Challenge (MTCC). The MTCC is a slice test in which a small fraction of all the CMS detection equipment is expected to be operated in the 4 T solenoid of the experiment. Cosmic rays detected in the muon chambers will be used to trigger the readout of all detectors. Prior to data taking, the detectors and their readout electronics must be tuned and synchronized with dedicated software procedures. Local reconstruction must be carried out online and offline in all subdetectors to monitor and measure their performance. Global reconstruction, linking different subdetectors, is also expected to be attempted to study relative timings and positions. CMS Visualization, which allows reconstruction products attached to every detector element to be promptly accessed, is also expected to be used for validation purposes and monitoring.
        Speaker: Dr Giacomo Bruno (UCL, Louvain-la-Neuve, Belgium)
      • 400
        Event visualisation for the ATLAS experiment - the technologies involved
        We describe the design of Atlantis, an event visualisation program for the ATLAS experiment at CERN, and of the other supporting applications within the visualisation project, mainly focusing on the technologies employed. The ATLAS visualisation effort consists of several parts, with Atlantis being the central application. The main purpose of Atlantis is to help visually investigate and intuitively understand complete ATLAS events. Atlantis is a stand-alone graphical application written entirely in Java, using the Java/Swing 2D API, XML parsers and Apache/XMLRPC for network communication with Athena, the ATLAS software framework. The event data, in XML format, are produced by a dedicated interface called JiveXML running within the Athena framework. Atlantis reads the data either from files (offline mode) or via a network connection in the online mode of JiveXML. In the online mode, the data are transferred on request from a C++ XMLRPC server running within JiveXML to Atlantis acting as an XMLRPC client. The Atlantis user is also able to steer the Athena framework over a network connection directly from Atlantis: Atlantis makes remote calls to an XMLRPC Python server started at the interactive Athena Python prompt. This server receives the Athena commands and executes them as if typed locally. (An illustrative XML-RPC client/server sketch follows this entry.)
        Speaker: Zdenek Maxa (University College London)
        Paper
        Slides
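        For illustration, the snippet below shows a minimal Python XML-RPC server and the corresponding client call, in the spirit of the Atlantis/JiveXML setup described above. The method name, port and command handling are assumptions, not the actual JiveXML interface.

          from xmlrpc.server import SimpleXMLRPCServer

          def execute_command(command: str) -> str:
              """Stand-in handler: in the real setup the command would go to the Athena prompt."""
              return f"received: {command}"

          def serve(port: int = 48965):
              server = SimpleXMLRPCServer(("localhost", port), allow_none=True)
              server.register_function(execute_command, "execute_command")
              server.serve_forever()

          # Corresponding client call (what a remote display could do):
          #   import xmlrpc.client
          #   proxy = xmlrpc.client.ServerProxy("http://localhost:48965")
          #   print(proxy.execute_command("theApp.run(1)"))

          if __name__ == "__main__":
              serve()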
      • 401
        Application of data visualisation techniques in particle physics
        Visualisation of data in particle physics currently involves event displays, histograms and scatter plots. Since 1975 there has been an explosion of techniques for data visualisation, driven by highly interactive computer systems and ideas from statistical graphics. The field has also been driven by the demands of data mining of large databases and of genomics. Two key areas are direct manipulation of visual data and new methods for visualising high-dimensional data. The first area has seen the use of linked views, brushing and pruning. The second area has seen the introduction of methods such as parallel coordinates and the grand tour. In this paper, these ideas are applied to particle physics data to evaluate their ability to reduce data analysis time and improve pattern recognition. In particular, parallel coordinates are used to analyse a sample of K-short Monte Carlo events. It will be shown that this graphical technique significantly reduces the time taken to determine the key variables for event selection. This paper will also evaluate various publicly available software tools that include many of the new statistical graphics techniques. The paper concludes that no single tool includes all the most powerful new techniques and argues that urgent work is required to integrate these ideas into data analysis tools for particle physics. (An illustrative parallel-coordinates example on synthetic data follows this entry.)
        Speaker: Prof. Stephen Watts (Brunel University)
        Slides
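        The snippet below produces a parallel-coordinates plot on synthetic two-class data, to show the kind of display discussed above; it does not use the K-short Monte Carlo sample, and the variable names and numbers are invented for illustration.

          import numpy as np
          import pandas as pd
          import matplotlib.pyplot as plt
          from pandas.plotting import parallel_coordinates

          rng = np.random.default_rng(0)
          n = 200
          df = pd.DataFrame({
              "decay_length": np.concatenate([rng.normal(5.0, 1.0, n), rng.normal(1.0, 1.0, n)]),
              "pt":           np.concatenate([rng.normal(2.0, 0.5, n), rng.normal(1.0, 0.5, n)]),
              "mass":         np.concatenate([rng.normal(0.50, 0.01, n), rng.normal(0.48, 0.05, n)]),
              "label":        ["signal"] * n + ["background"] * n,
          })

          # Each event is one line crossing all variable axes; classes are coloured separately.
          parallel_coordinates(df, class_column="label", alpha=0.3)
          plt.title("Signal vs. background in parallel coordinates (synthetic data)")
          plt.show()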
      • 402
        Data Quality Monitoring for the CMS Silicon Tracker
        The CMS silicon tracker, consisting of about 17,000 detector modules divided into micro-strip and pixel sensors, will be the largest silicon tracker ever built for a high energy physics experiment. The detector performance will be monitored using applications based on the CMS Data Quality Monitoring (DQM) framework, running on the High-Level Trigger farm as well as on local DAQ systems. The monitorable quantities of this large number of modules are organized into hierarchical structures reflecting the detector sections, and additionally into structures corresponding to the levels of data processing. The produced information is delivered to client applications according to their subscription requests; these applications summarize and visualize the received quantities. We describe here the functionality of the CMS tracker DQM applications and report preliminary performance tests. (A toy sketch of subscription-based monitoring follows this entry.)
        Speakers: Dr Suchandra Dutta (Scuola Normale Superiore, INFN, Pisa), Dr Vincenzo Chiochia (University of Zurich)
        Paper
        Slides
        source latex file and figures
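        A toy version of hierarchically named monitor elements with subscription-based delivery is sketched below; the class, method and path names are assumptions and do not reflect the CMS DQM API.

          from collections import defaultdict
          from fnmatch import fnmatch

          class MonitorRegistry:
              def __init__(self):
                  self.elements = {}                      # "Tracker/TIB/Layer1/Noise" -> value
                  self.subscriptions = defaultdict(list)  # glob pattern -> list of callbacks

              def subscribe(self, pattern, callback):
                  """Register interest in all elements whose path matches the glob pattern."""
                  self.subscriptions[pattern].append(callback)

              def update(self, path, value):
                  """Store a new value and deliver it to matching subscribers."""
                  self.elements[path] = value
                  for pattern, callbacks in self.subscriptions.items():
                      if fnmatch(path, pattern):
                          for cb in callbacks:
                              cb(path, value)

          # Example: a client only receives updates for the TIB section of the tracker.
          registry = MonitorRegistry()
          registry.subscribe("Tracker/TIB/*", lambda p, v: print(f"{p} -> {v}"))
          registry.update("Tracker/TIB/Layer1/NoisyStrips", 12)
          registry.update("Tracker/TOB/Layer1/NoisyStrips", 7)   # no subscriber, only stored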
    • Grid Middleware and e-Infrastructure Operation: GMEO-7 Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 403
        DØ Data Reprocessing with SAM-Grid
        Periodically an experiment will reprocess data taken previously to take advantage of advances in its reconstruction code and an improved understanding of the detector. Within a period of roughly six months the DØ experiment has reprocessed, on the grid, a large fraction (0.5 fb-1) of the Run II data. This corresponds to some 1 billion events or 250 TB of data, used raw data as input, and required remote database access. This is the largest HEP grid activity to date and has been a great success. SAM (Sequential Access to Metadata) has been in operation at DØ since the start of Run II and provides the data grid (also enabling remote analysis). Job submission and management is provided by JIM; together they form the SAM-Grid middleware used for this activity. This massive task led to extensive developments in SAM-Grid, in a joint effort between the core developers and those carrying out the reprocessing at the remote sites. The resources used, corresponding to some 3500 GHz equivalent, were shared and include LCG and OSG facilities. This activity, including the development of SAM-Grid and the operational tools and procedures, will be presented. Lessons learnt from carrying out such a task on the grid will be discussed.
        Speaker: Dr Joel Snow (Langston University)
        Paper
        Slides
      • 404
        A Statistical Analysis of Job Performance within LCG Grid.
        The LCG is an operational Grid currently running at 136 sites in 36 countries, offering its users access to nearly 14,000 CPUs and approximately 8 PB of storage [1]. Monitoring the state and performance of such a system is challenging but vital to successful operation. In this context, the primary motivation for this research is to analyze LCG performance through a statistical analysis of the lifecycles of all jobs submitted to it. In this paper we define metrics that describe typical job lifecycles. The statistical analysis of these metrics enables us to gain insight into the workload management characteristics of the LCG Grid [2]. Finally we show how these metrics can be used to spot Grid failures by identifying statistical changes over time in the monitored metrics. (An illustrative sketch of such metrics and change detection follows this entry.) [1] GridPP-UK Computing for Particle Physics: http://www.gridpp.ac.uk/ [2] Crosby P, Colling D, Waters D, "Efficiency of resource brokering in grids for high-energy physics computing", IEEE Transactions on Nuclear Science, 2004, Vol. 51, pp. 884-891, ISSN: 0018-9499
        Speaker: Mrs Mona Aggarwal (Imperial College London)
        Paper
        Slides
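        As an illustration of the kind of lifecycle metrics and change detection discussed above, the sketch below derives per-job timings and flags a shift in a monitored metric. Field names and thresholds are assumptions, not the metrics defined in the paper.

          from statistics import mean, pstdev

          def lifecycle_metrics(job):
              """Derive per-job timing metrics from submit/start/end timestamps (seconds)."""
              return {
                  "queue_time": job["start"] - job["submit"],
                  "run_time":   job["end"] - job["start"],
                  "total_time": job["end"] - job["submit"],
                  "success":    job["exit_code"] == 0,
              }

          def flag_shift(history, window=100, n_sigma=3.0):
              """Flag the latest window if its mean deviates strongly from the earlier baseline."""
              if len(history) < 2 * window:
                  return False
              baseline, recent = history[:-window], history[-window:]
              mu, sigma = mean(baseline), pstdev(baseline)
              return sigma > 0 and abs(mean(recent) - mu) > n_sigma * sigma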
      • 405
        Using Grid Technologies for Lattice QCD
        Numerical simulations of QCD formulated on the lattice (LQCD) require a huge amount of computational resources. Grid technologies can help to improve the exploitation of these precious resources, e.g. by sharing the produced data on a global level. The International Lattice DataGrid (ILDG) has been founded to define the standards required for a grid infrastructure to be used for research on lattice QCD. In this talk we will discuss the requirements, problems, solutions and open issues related to putting a grid-of-grids into operation. We will in particular report on the implementation of a standard for metadata and a metadata catalogue. Furthermore, we will consider issues related to file catalogues, data management and access control. In this contribution we will focus on the experience of operating an LCG2-based grid infrastructure used by LQCD research groups in Europe.
        Speaker: Dr Dirk Pleiter (DESY)
        Paper
        Slides
      • 406
        The Medical Physics Simulation with Grid
        A new project for advanced simulation technology in radiotherapy was launched in October 2003 with funding from JST (Japan Science and Technology Agency) in Japan. The project aim is to develop a comprehensive set of simulation packages for radiotherapy based on Geant4, in collaboration between Geant4 developers and medical users. Accurate and high-speed dose calculation requires much more computing power as well as strong security, and therefore the parallelization and gridification of Geant4 applications is an important issue for our project. Some LCG (LHC Computing Grid) middleware components were deployed, enabling computing resources to be shared and jobs to be controlled between KEK and ICEPP. A class library was developed for the parallelization of existing Geant4-based applications as a part of the Geant4 framework. One of our practical applications was accelerated by a factor of 30 using 40 CPUs, with a parallel efficiency of 67%. In this paper we describe the design and implementation of the parallelization of Geant4 medical applications and the interface between our applications and the Grid environment. (A toy event-parallel sketch follows this entry.)
        Speaker: Go Iwai (JST)
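        The toy below only mirrors the idea of event-level parallelism (the real work described above parallelizes Geant4 C++ applications): events are split across worker processes and the partial results are summed. Parallel efficiency is the speedup divided by the number of CPUs used. All names here are invented for illustration.

          from multiprocessing import Pool

          def simulate_chunk(n_events: int) -> float:
              """Stand-in for simulating n_events and returning, e.g., a summed dose."""
              dose = 0.0
              for i in range(n_events):
                  dose += (i % 97) * 1e-6     # dummy per-event work
              return dose

          def run_parallel(total_events: int, n_workers: int) -> float:
              chunk = total_events // n_workers
              chunks = [chunk] * n_workers
              chunks[-1] += total_events - chunk * n_workers   # remainder goes to the last worker
              with Pool(processes=n_workers) as pool:
                  return sum(pool.map(simulate_chunk, chunks))

          if __name__ == "__main__":
              print(run_parallel(1_000_000, 4))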
    • Online Computing: OC-7 B333

      B333

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 407
        Planning for predictable network performance in the ATLAS TDAQ
        The Trigger and Data Acquisition (TDAQ) system of the ATLAS experiment is currently being installed at CERN. A significant amount of computing resources will be deployed in the online computing system, in close proximity to the ATLAS detector. More than 3000 high-performance computers will be supported by networks composed of about 200 Ethernet switches. The architecture of the networks was optimised for the particular traffic profile generated by data transfer protocols with real-time delivery constraints. In this paper, we summarise the operational requirements imposed on the TDAQ networks. We describe the architecture of the network management solution that fulfils the complete set of requirements. We show how commercial and custom-developed applications will be integrated to provide a maximum of relevant information to the physics operator on shift and to enable the networking team to analyse trends and predict the network performance. An active, application-driven network reconfiguration service will facilitate rapid partial network topology changes, with the aim of providing guarantees on the amount of traffic to be supported for a particular data acquisition role (data taking, calibration, monitoring).
        Speaker: Dr Catalin Meirosu (CERN and "Politehnica" Bucharest)
        Paper
        Slides
      • 408
        Relational Database Implementation and usage in STAR
        The STAR experiment at Brookhaven National Laboratory's Relativistic Heavy-Ion Collider (RHIC) has accumulated hundreds of millions of events over its five-year running program. With a growing physics demand for statistics, STAR has more than doubled the number of events taken each year and plans to increase its capability by an order of magnitude, reaching the billion-event scale by 2008. Under the stress imposed by this event rate, the run-condition support and database back-end needed to mature rapidly to follow the demand, preserving user convenience and time evolution while also allowing for in-depth technology changes as required. In this talk, we will present the use of relational databases in STAR, organized as a three-tier architecture: a front-end user interface; a middle-tier homegrown C++ library (StarAPI) that handles all of the unique requirements arising from an active experiment; and, finally, the lower-level DBMS and data storage. Paramount considerations include maintaining flexibility and scalability with modular construction and a consistent namespace; ensuring long-term analysis integrity with three-dimensional time-stamping, or ranges of validity, which in turn allow for solid schema evolution; and ensuring uniqueness with expanded primary keys. We will identify and discuss the trade-offs and challenges that have occurred during the evolution of our experiment, and specifically the challenge introduced by detectors that can only be described in terms of millions of leaves within an ultra-fine granularity of calibration values. (An illustrative range-of-validity lookup follows this entry.)
        Speaker: Mr Michael DePhillips (BROOKHAVEN NATIONAL LABORATORY)
        Paper
        Slides
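        The snippet below sketches a time-stamped ("range of validity") calibration lookup with an expanded primary key, using SQLite purely for illustration; the table layout and names are assumptions, not the STAR schema.

          import sqlite3

          conn = sqlite3.connect(":memory:")
          conn.execute("""
              CREATE TABLE calibration (
                  detector    TEXT,
                  begin_time  INTEGER,   -- validity start (unix time)
                  end_time    INTEGER,   -- validity end   (unix time)
                  version     INTEGER,
                  payload     BLOB,
                  PRIMARY KEY (detector, begin_time, version)   -- expanded primary key
              )
          """)
          conn.execute("INSERT INTO calibration VALUES ('tpc', 100, 200, 1, x'00')")
          conn.execute("INSERT INTO calibration VALUES ('tpc', 100, 200, 2, x'01')")

          def lookup(conn, detector, event_time):
              """Return the newest calibration version valid at event_time."""
              return conn.execute(
                  """SELECT payload, version FROM calibration
                     WHERE detector = ? AND begin_time <= ? AND ? < end_time
                     ORDER BY version DESC LIMIT 1""",
                  (detector, event_time, event_time),
              ).fetchone()

          print(lookup(conn, "tpc", 150))   # newest version (2) valid at time 150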
      • 409
        JCOP Framework Configuration Database Tool
        The control systems of the LHC experiments are built using a common commercial product: PVSS II (from the ETM company). The JCOP Framework project delivers a set of common tools built on top of, or extending the functionality of, PVSS (such as controls for widely used hardware, a Finite State Machine (FSM) toolkit, access control management, and a cooling and ventilation application), which can be used by all LHC experiments. The Configuration Database Tool is the part of the JCOP Framework responsible for the management of configuration data. The tool manages versions of system, static and configuration data, and uses an Oracle DBMS to store them. Typically, for a single subsystem, thousands of devices and tens of thousands of properties need to be managed. The paper describes our experiences from the prototype phase and the design and implementation of the production version of the tool. Currently, the implementation is being completed and the tool is being deployed in the experiments' control systems. In this implementation, effort was put into providing the functionality requested by the developers, as well as good performance and scalability.
        Speaker: Piotr Golonka (CERN, IT/CO-BE)
    • Software Components and Libraries: SCL-7 AG 69

      AG 69

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 410
        Development, validation and maintenance of Monte Carlo generators & generator services in the LHC era
        The library of Monte Carlo generator tools maintained by LCG (GENSER) guarantees centralized software and physics support for the simulation of fundamental interactions, and is currently widely adopted by the LHC collaborations. While the activity in LCG Phase I mostly concentrated on the standardization, integration and maintenance of the existing Monte Carlo packages, more emphasis is currently being placed on contributing to the development, testing and primary validation of new Monte Carlo packages, most of them designed using object-oriented technologies. The current status and the future plans of this activity are presented, with emphasis on testing and validation. The status of MCDB, the public database for the configuration, book-keeping and storage of generator-level event files, is also reviewed.
        Speaker: Dr Mikhail Kirsanov (CERN)
        Paper
        Slides
      • 411
        HZTool and Rivet: Toolkit and Framework for the Comparison of Simulated Final States and Data at Colliders
        A common problem in particle physics is the need to reproduce comparisons between data and theory when the theory is a (general purpose) Monte Carlo simulation and the data are measurements of final-state observables in high energy collisions. The complexity of the experiments, the observables and the models all contribute to making this a highly non-trivial task. We describe an existing library of Fortran routines, HZTool, which enables, for each measurement of interest, a comparable prediction to be produced from any given Monte Carlo generator. The HZTool library is maintained by CEDAR, with subroutines for various measurements contributed by a number of authors within and outside the CEDAR collaboration. We also describe the outline design and current status of a replacement for HZTool, to be called Rivet (Robust Independent Validation of Experiment and Theory). This will use an object-oriented design, implemented in C++, together with standard interfaces (such as HepMC and AIDA) to make the new framework more flexible and extensible than the Fortran HZTool. (A toy data/theory comparison follows this entry.)
        Speaker: Dr Ben Waugh (University College London)
        Paper
        Slides
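        In the spirit of the data/theory comparisons described above, the snippet below compares a generator-level histogram with a measured one via a simple chi-square; the binning, numbers and chi-square definition are invented for illustration and are not part of HZTool or Rivet.

          import numpy as np

          # Measured cross sections per bin with uncertainties (made-up numbers).
          data     = np.array([10.2, 8.1, 5.0, 2.9])
          data_err = np.array([0.5, 0.4, 0.3, 0.2])

          # Generator prediction filled from simulated final-state particles (also made up).
          mc       = np.array([9.8, 8.6, 4.7, 3.1])
          mc_err   = np.array([0.2, 0.2, 0.1, 0.1])

          # Simple bin-by-bin chi-square with uncorrelated uncertainties.
          chi2 = np.sum((data - mc) ** 2 / (data_err ** 2 + mc_err ** 2))
          ndf = len(data)
          print(f"chi2/ndf = {chi2:.2f}/{ndf}")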
      • 412
        Component approach to HEP Monte Carlo simulations: example of PHOTOS.
        Solving the 'simulation = experiment' equation, which is the ultimate task of every HEP experiment, is impossible without computer simulation techniques. HEP Monte Carlo simulations, traditionally written as FORTRAN codes, have become complex computational projects: their rich physics content needs to be matched with the software organization of the experimental collaborations to make them part of large software chains. Experimental collaboration software intrinsically uses the component approach: the simulation of physical events and detector responses is typically performed in steps, by a set of dedicated software packages (such as PYTHIA and GEANT), which may be regarded as high-level "components" interacting with each other by means of a common data structure: the event record. The component approach to computer simulations is a widely discussed topic, yet not in the branch of HEP MC simulations. In this paper we describe the general view and approach, and the problems encountered in this area today, stressing the importance of validation and testing methodology. The PHOTOS Monte Carlo is used as an example of a compact package used worldwide; an MC-TESTER-based method was developed for its tests and validation. PHOTOS may also serve as an example of a dialogue between theoretical and experimental physicists. The involvement of software-development experts in this dialogue, in a spirit of common understanding, would certainly help to establish viable architectures for future HEP MC simulations that remain flexible for rearrangements motivated by future (at present unknown) physics requirements.
        Speaker: Mr Piotr Golonka (INP Cracow, CERN)
      • 413
        Eclipse-based Physicist Work Environment
        Eclipse is a popular, open source development platform and application framework. It provides extensible tools and frameworks that span the complete software development lifecycle. Plugins exist for all the major parts that today make up the physicist's software toolkit in ATLAS: programming environments/editors for C++ and Python, browsers for CVS and SVN, networking with ssh and sftp, etc. It is therefore a natural choice for an integrated work environment. This paper shows how the ATLAS software environment and framework can be configured, debugged, built, and run with Eclipse. It also presents plugins tailored for ATLAS, which ease the installation of the software, the development and debugging of job configurations, and (interactive) analysis algorithms in a multi-language environment. Plugins for integrated tutorials and context-sensitive help are also provided, allowing people new to the ATLAS software to quickly get started with their analyses.
        Speaker: Wim Lavrijsen (LBNL)
        Paper
        Slides
    • Software Tools and Information Systems: STIS-7 AG 77

      AG 77

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India
      • 414
        LCG RTAG 12: Collaborative Tools for the LHC
        I report on the findings and recommendations of the LCG Project's Requirements and Technical Assessment Group (RTAG 12) on Collaborative Tools for the LHC. A group comprising representatives of the LHC collaborations, CERN IT and HR, and leading experts in the field of collaborative tools evaluated the requirements of the LHC, current practices, and expected future usage, in comparison with the existing facilities and infrastructure. The final report, CERN-LCG-PEB-2005-07, was published in April, 2005 and subsequently endorsed by the four major LHC collaborations. This talk summarizes the findings and major recommendations, and puts forward some concrete proposals for follow-up and implementation.
        Speaker: Dr Steven Goldfarb (High Energy Physics)
        Paper
        Slides
      • 415
        HepForge: A lightweight open source development environment for HEP software
        Setting up the infrastructure to manage a software project can easily become more work than writing the software itself. A variety of useful open-source tools, such as Web-based viewers for version control systems, "wikis" for collaborative discussions and bug-tracking systems are available but their use in high-energy physics, outside large collaborations, is small. We introduce the CEDAR collaboration's HepForge project, which provides a lightweight, modular development environment for HEP software. Facilities available include the above-mentioned tools as well as mailing lists, shell accounts, archiving of releases and low-maintenance Web space, all centrally backed up. HepForge also exists to promote best-practice software development methods and to provide a central repository for reusable HEP software.
        Speakers: Dr Andy Buckley (Durham University), Andy Buckley (University of Cambridge)
        Paper
        Slides
      • 416
        From VRVS to EVO, the Next Generation Grid-enabled Collaborative System
        During this session we will describe and demonstrate MonALISA (MONitoring Agents using a Large Integrated Services Architecture) and the new, enhanced VRVS (Virtual Room Videoconferencing System), and their integration to provide a next-generation collaboration system called EVO. The melding of these two systems creates a distributed intelligent system that provides an efficient collaborative service to a very large, dispersed community of users. This real-time system operates over an ensemble of national and international networks (in more than 100 countries). The new features include IM, encryption and automatic troubleshooting detection, among others. VRVS is global in scope: it covers the full range of existing and emerging protocols and the full range of client devices for collaboration, from desktops to installations in large auditoria. VRVS, which has interconnected users since 1997 and hosts around 3000 hours of meetings per month, now provides mobile collaboration access (for Pocket PC) to its users. The new system, EVO, based on VRVS, will be demonstrated during the session. The specialized mobile agents in the MonALISA framework optimize data replication strategies for data processing in Grid systems and also help to improve the operation of VRVS. The agents are deployed to all the active MonALISA services and perform supervision tasks for distributed applications. Thus, the auto-adaptive system can detect and cope with the network problems encountered (congestion, line cuts, etc.) to keep an unlimited number of users interconnected.
        Speaker: Mr Philippe Galvez (California Institute of Technology (CALTECH))
        Slides
      • 417
        Avoiding the tower of Babel syndrome: An integrated issue-based quality assurance system
        Samples of data acquired by the STAR experiment at RHIC are examined at various stages of processing for quality assurance (QA) purposes. As STAR continues to mature and to utilize new hardware and software, it remains imperative for the experiment to work cohesively to ensure the quality of STAR data, so that the collaboration may continue to produce new physics results in an efficient and timely manner. Correlating such a rich set of information, from detector sub-system expert information, shift crew reports, online QA and offline reconstruction information, would pose a daunting challenge to any collaboration. Presentation of QA results in an organized and integrated fashion has proven vital to establishing robust communication of issues to both operators and users. We present in this paper the integrated QA system developed to achieve these goals within the STAR experiment, from detector operations through to data production and analysis.
        Speaker: Dr Gene VAN BUREN (BROOKHAVEN NATIONAL LABORATORY)
        Paper
        Slides
    • 15:30
      Tea Break
    • Public Lecture
      Convener: Shobo Bhattacharya (TIFR)
      • 418
        From the World Wide Web to the Grid
        Speaker: Wolfgang Von Rueden (CERN)
        Slides
    • Plenary: Plenary 9

      Plenary Session

      Convener: Matthias Kasemann (DESY)
      • 419
        Present Work and Future Directions of IHEPCCC
        Speaker: Dr Randall Sobie (University of Victoria)
        Slides
      • 420
        New Frontiers in Data Mining
        Speaker: Lalitesh Kathragadda (Google India)
        Slides
      • 421
        Managing Enterprise Applications in Grid
        Grid computing technologies are transforming scientific and enterprise computing in a big way. Especially in verticals such as life sciences, energy, and finance, there is tremendous pressure to reduce cost and enhance productivity. Grids allow the processors, storage and/or memory of many distributed computers to be linked together, making more efficient use of all available computing resources to solve large problems quickly. However, enterprise computing presents numerous important issues that need to be considered, such as system complexity, integration of different technologies, automation, and workflows. All of these present complexities in terms of application management, deployment, and development. In this talk, we will discuss these complexities and several research initiatives (including a case study) undertaken at the Software Engineering and Technology Labs (SETLabs) of Infosys Technologies.
        Speaker: Anirban Chakrabarti (Infosys)
        Slides
      • 422
        Summary of the Track on Online Computing
        Speaker: Dr Beat Jost (CERN)
        Slides
      • 423
        Summary of the track on Event Processing Applications
        Speaker: Dr Gabriele Cosmo (CERN)
        Slides
      • 424
        Summary of the Track on Software Components and Libraries
        Speaker: Dr Lorenzo Moneta (CERN)
        Slides
    • Plenary: Plenary 10

      Plenary Session

      • 425
        Welcome etc
        Welcome by the Director, TIFR; Address by the Governor of Maharashtra; National Anthem
      • 426
        Address by the President of India
        Paper
        Slides
    • 13:30
      Lunch Break
    • Plenary: Plenary 11

      Plenary Session

      Convener: Randall Sobie (University of Victoria)
      • 427
        Summary of the Track on Software Tools and Information Systems
        Speaker: Dr Andreas Pfeiffer (CERN)
        Slides
      • 428
        Summary of the Track On Computing Facilities and Networking
        Speaker: Dr Simon Lin (Academia Sinica Grid Computing Centre)
        Slides
      • 429
        Summary of the Track on Grid Middleware and e-Infrastructure Operation
        Speaker: Mr Markus Schulz (CERN)
        Slides
      • 430
        Summary of the Track on Distributed Event Production and Processing
        Speaker: Dr Gavin McCance (CERN)
        Slides
      • 431
        Summary of the Track on Distributed Data Analysis
        Speaker: Fons Rademakers (CERN)
        Slides
    • 16:15
      Tea Break
    • Panel Discussion on Digital Divide Auditorium

      Auditorium

      Tata Institute of Fundamental Research

      Homi Bhabha Road Mumbai 400005 India