ACAT 2008

Name: ACAT 2008
Start: 2008-11-03T08:00:00+01:00
End: 2008-11-07T18:00:00+01:00
Location: Ettore Majorana Foundation and Centre for Scientific Culture

Via Guarnotta, 26 - 91016 ERICE (Sicily) - Italy Tel: +39-0923-869133 Fax: +39-0923-869226 E-mail: hq@ccsem.infn.it

2008 ACAT, Federico Carminati (CERN)

Description

We are very happy to invite you to this exceptional session of the ACAT series (12th) that will mark a new turning point in the cross-fertilization of hot physics research and computing technology.

Support

acat2008@cern.ch

Participants

138

Monday 3 November
- 08:45 → 10:20
  Monday, 03 November 2008 - Morning session 1
  - 08:45
    
    Introduction to ACAT 2008 10m
    
    Speaker: Mr Federico Carminati (CERN)
    
    Slides
  - 08:55
    
    Introduction to Morning Session 5m
    
    Speaker: Denis Perret-Gallix (Laboratoire d'Annecy-le-Vieux de Physique des Particules (LAPP))
  - 09:00
    
    Aspects of Intellectual Property Law for HEP Software Developers 40m
    
    Intellectual Property, which includes the following areas of the law: Copyrights, Patents, Trademarks, Trade Secrets, and most recently Database Protection and Internet Law, might seem to be an issue for lawyers only. However, the impact of the laws governing these areas and the international reach of their implementation increasingly make it important for all software developers, and especially software development team leaders, to be aware of the implications that might exist. For example, under copyright law, if there are multiple contributors to a software development project, the contribution of even a single line of code to a million-line program will vest an equal copyright ownership share in the person making that one-line contribution. While patenting of software is being debated in Europe, it is allowed in the US, and with the Grid and the Internet, the effects of these patents can reach far beyond the US border. Likewise, the recent enactment of database protection in Europe, which grants property rights to the facts contained in databases, can still have impacts on users in the US, where such legislation has not been put in force. A brief primer on the basics of Intellectual Property Law relevant for software developers will be provided.
    
    Speaker: Lawrence Pinsky (University of Houston)
    
    Slides
  - 09:40
    
    The Blue Brain Project - Simulation-based Research in Neuroscience 40m
    
    The initial phase of the Blue Brain Project aims to reconstruct the detailed cellular structure and function of the neocortical column (NCC) of the young rat. As a collaboration between the Brain Mind Institute of the Ecole Polytechnique Federale de Lausanne (EPFL) and IBM, the project is based on many years of experimental data from an electrophysiology lab and a dedicated massively parallel computing resource (4-rack BlueGene/L). Over the last 3 years an interdisciplinary team of 35 researchers has cast the reverse-engineering of the biological pieces and the forward construction of detailed mathematical models in an iterative process that allows continuous refinement. Particular efforts go into the preparation of 10,000 unique morphologically-complex electrical models representing all morpho-electrical classes as well as establishing their structural and functional connectivity. Once a multi-compartmental description for each neuron is generated and the exact locations of the synapses (~30 million) are determined, the simulation is supposed to reproduce emergent properties found in slice experiments. The refinement is directed by a bottom-up calibration that aligns the model across all levels - from the ion channels to the emergent network phenomena - with the experimental data. In order to put the expert in the loop, extensive use of visualization and interactive analysis is made, which is powered by another dedicated supercomputer in order to realize short turn-around times.
    
    Speaker: Mr Felix Schuermann
- 10:20 → 10:40
  
  Coffee Break 20m
- 10:40 → 12:40
  Monday, 03 November 2008 - Morning session 2
  - 10:40
    
    Higgs bbbar decay at NNLO and beyond : the uncertainties of QCD predictions 40m
    
    Different methods for treating the results of higher-order perturbative QCD calculations of the decay width of the Standard Model Higgs boson into bottom quarks are discussed. Special attention is paid to the analysis of the $M_H$ dependence of the decay width $\Gamma(H\to \bar{b}b)$ in the cases when the mass of the b-quark is defined as the running parameter in the $\overline{\mathrm{MS}}$-scheme and as the quark pole mass. The relation between running and pole masses is taken into account at order $\alpha_s^4$. A way of fixing the theoretical uncertainties of the QCD predictions is proposed. The results may be of interest in the process of matching with the results of the existing NNLO Monte Carlo program to compute Higgs boson production at hadron colliders at NNLO.
    
    Speaker: Dr Andrei Kataev (Institute for Nuclear Research, Moscow, Russia)
    
    Slides
  - 11:20
    
    High-Precision Arithmetic and Mathematical Physics 40m
    
    For the vast majority of computations done both in pure and applied physics, ordinary 64-bit floating-point arithmetic (about 16 decimal digits) is sufficient. But for a growing body of applications, this level is not sufficient. For applications such as supernova simulations, climate modeling, n-body atomic structure calculations, "double-double" (approx. 32 digits) or even "quad-double" (approx. 64 digits) is required. For yet other applications, notably arising in quantum field theory and statistical mechanics, much higher precision (hundreds or even thousands of digits) is required. Armed with software for performing computation to these high levels of precision, the tools of "experimental mathematics" can be brought to bear, such as in recognizing the values of definite integrals that arise in the theory via their decimal values. Numerous recent studies, including some rather remarkable results in quantum field theory, will be mentioned.
    
    Speaker: David Bailey (Lawrence Berkeley Laboratory)
    
    Slides
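
The "double-double" format mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common representation of a value as an unevaluated sum of two 64-bit floats (hi, lo) and uses Knuth's error-free two-sum; the function names are illustrative and are not taken from the speaker's software.

```python
def two_sum(a, b):
    """Error-free addition: returns (s, e) with s = fl(a + b) and a + b == s + e exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def dd_add(x, y):
    """Add two double-double numbers, each given as a (hi, lo) pair of floats."""
    s, e = two_sum(x[0], y[0])   # exact sum of the high parts
    e += x[1] + y[1]             # fold in the low parts
    return two_sum(s, e)         # renormalize so that |lo| is small relative to hi


# In plain 64-bit arithmetic the small term is lost entirely.
plain = (1.0 + 1e-20) - 1.0

# In double-double arithmetic it survives in the low word.
one_plus_tiny = dd_add((1.0, 0.0), (1e-20, 0.0))
recovered = dd_add(one_plus_tiny, (-1.0, 0.0))
```

Here `plain` evaluates to zero, while `recovered` still carries the `1e-20` contribution. A full library would also provide multiplication, division and conversion routines.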
  - 12:00
    
    Are SE Architectures Ready For LHC 40m
    
    There are many ways to build a Storage Element. This talk surveys the common and popular architectures used to construct today's Storage Elements and presents points for consideration. The presentation then asks, "Are these architectures ready for LHC era experiments?". The answer may be surprising and certainly shows that the context in which they are used matters.
    
    Speaker: Mr Andrew Hanushevsky (Stanford Linear Accelerator Center (SLAC))
    
    Slides
- 14:00 → 18:15
  Computing Technology for Physics Research
  - 14:00
    
    g-Eclipse - user and developer friendly access to Grids and Clouds 25m
    
    g-Eclipse is both a user-friendly graphical user interface and a programming framework for accessing Grid and Cloud infrastructures. Based on the extension mechanism of the well-known Eclipse platform, it provides a middleware-independent core implementation including standardized user interface components. Based on these components, implementations for any available Grid and Cloud middleware can be added by using the Eclipse plug-in mechanism. Currently g-Eclipse provides support for the gLite and GRIA Grid middlewares, as well as for the Amazon Web Services cloud computing (EC2) and storage (S3) offerings. The data management component enables seamless, interoperable storage access across different middlewares; for example, data from Amazon S3 can be moved to gLite resources simply by drag and drop. Furthermore, the tool provides Grid job management functionality including a JSDL-conformant job description editor, automatic updates of the status of submitted jobs, parametric job creation and direct access to the job input and output files. Many other elements contribute to the g-Eclipse ecosystem to make up a fully integrated user, operator and developer environment, such as a graphical workflow editor, a batch system management component, a resource test framework, scientific visualization support and others. g-Eclipse provides a variety of integrated tools for the future e-Scientist working on the Grid and other emerging e-Infrastructures. In this presentation we will introduce the framework and demonstrate the tool online by accessing the different supported infrastructures. The presentation will focus on the user's and the developer's perspectives.
    
    Speaker: Dr Ariel Garcia (FORSCHUNGSZENTRUM KARLSRUHE, GERMANY)
    
    Slides
  - 14:25
    
    Dedicated Services to Support Data Replication over GRID using ATLAS Distributed Data Management System. 25m
    
    An ATLAS-wide policy defines how different types of data are distributed between centres of different levels (T0/T1/Tn). This is a well-defined, centrally operated activity that uses the ATLAS Central Services, which include Catalogue services, Site services, T0 services, PanDA services, etc. At the same time, the ATLAS Operations Group has designed user-oriented services that allow ATLAS physicists to place data replication requests, using Distributed Data Management (DDM) as the low-level layer that distributes data between more than 70 sites. The DDM System consists of a bookkeeping system (dataset-based) and a set of local site services to handle data transfers, building upon Grid technologies. The software stack is called DQ2 [1]. The replication methods for physicists' requests include the DQ2 service, the DQ2 End-User tools [2] and a Web-based interface for user requests (which uses the DQ2 service internally) [3]; replication is controlled using DDM/DQ2 data transfer monitoring. These methods supplement each other, because all of them are subject to some restrictions and policies, which are defined by the ATLAS Operations Group or the corresponding cloud coordinators. [1] https://twiki.cern.ch/twiki/bin/view/Atlas/DistributedDataManagement [2] https://twiki.cern.ch/twiki/bin/view/Atlas/DDMEndUserTutorial [3] http://panda.atlascomp.org?mode=reqsubs0
    
    Speaker: Mikhail Titov (Moscow Physical Engineering Inst. (MePhI))
    
    Slides
  - 14:50
    
    The PanDA System in the ATLAS Experiment 25m
    
    The PanDA system was developed by US ATLAS to meet the requirements for full-scale production and distributed analysis processing for the ATLAS Experiment at CERN. The system provides an integrated service architecture with late binding of jobs, maximal automation through layered services, tight binding with the ATLAS Distributed Data Management system, and advanced job recovery and error discovery functions, among other features. This talk will present the components and performance of the PanDA system, which has been in production in the US since early 2006 and ATLAS-wide since the end of 2007.
    
    Speaker: Paul Nilsson (University of Texas at Arlington)
    
    Slides
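
The "late binding of jobs" named in the abstract above can be pictured with a small sketch: a job is assigned to a resource only when a pilot already running there asks for work. The class, function and field names below are hypothetical and are not the PanDA API; the site names are example values.

```python
from collections import deque


class TaskQueue:
    """Hypothetical central queue; jobs wait here until a pilot asks for work."""

    def __init__(self, jobs):
        self._jobs = deque(jobs)

    def get_job(self, site):
        """Return the first waiting job that may run at `site`, or None if there is none."""
        for job in list(self._jobs):        # iterate over a copy so removal is safe
            if site in job["allowed_sites"]:
                self._jobs.remove(job)
                return job
        return None


def run_pilot(queue, site):
    """A pilot that has already started at `site` now binds to a concrete job."""
    job = queue.get_job(site)
    if job is None:
        return None                          # nothing suitable: the pilot simply exits
    return (job["name"], site)


queue = TaskQueue([
    {"name": "reco", "allowed_sites": {"BNL"}},
    {"name": "sim", "allowed_sites": {"BNL", "UTA"}},
])
first = run_pilot(queue, "UTA")              # picks "sim", the only job allowed at UTA
```

Because the job is chosen only after the pilot is running, a broken worker node costs an empty pilot rather than a failed job.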
  - 15:15
    
    ALICE Analysis Framework 25m
    
    The talk will describe the current status of the offline analysis framework used in ALICE. The software was designed and optimized to take advantage of distributed computing resources and to be compatible with the ALICE computing model. The framework's main features are the possibility to use parallelism in PROOF or GRID environments, transparency of the computing infrastructure and data model, scalability and data access performance. The framework provides a common "language" for all ALICE analysis users and is being heavily tested in view of the data to arrive soon.
    
    Speaker: Mr Andrei Gheata (ISS/CERN)
    
    Slides
  - 15:40
    
    Coffee Break 30m
  - 16:10
    
    Distributed analysis in CMS using CRAB: the client-server architecture evolution and commissioning 25m
    
    CRAB (CMS Remote Analysis Builder) is the tool used by CMS to enable running physics analysis in a transparent manner over data distributed across many sites. It abstracts out the interaction with the underlying batch farms, grid infrastructure and CMS workload management tools, such that it is easily usable by non-experts. CRAB can be used as a direct interface to the computing system or can delegate the user task to a server. Major efforts have been dedicated to the client-server system development, allowing the user to deal only with a simple and intuitive interface and to delegate all the work to a server. The server takes care of handling the user's jobs during the whole lifetime of the user's task. In particular, it takes care of data and resource discovery, process tracking and output handling. It also provides services such as automatic resubmission in case of failures, notification to the user of the task status, and automatic blacklisting of sites showing evident problems beyond what is provided by existing grid infrastructure. The CRAB server architecture and its deployment will be presented, as well as the current status and future development. In addition, the experience in using the system for initial detector commissioning activities and data analysis will be summarized.
    
    Speaker: Giuseppe Codispoti (Dipartimento di Fisica)
    
    Slides
  - 16:35
    
    Mass Storage System for Disk and Tape resources at the Tier1. 25m
    
    The storage access activities of the last 5 years at the INFN CNAF Tier1 can be grouped under two different solutions efficiently used in production: the CASTOR software, developed by CERN, as Hierarchical Storage Manager (HSM), and the General Parallel File System (GPFS), by IBM, for the disk resource management. In addition, since last year, a promising alternative solution for the HSM, using Tivoli Storage Manager (TSM) and GPFS, has been under intensive test. This paper describes the current hardware and software installation, with an outlook on the latest GPFS and TSM test results.
    
    Speaker: Pier Paolo Ricci (INFN CNAF)
    
    Slides
  - 17:00
    
    ATLAS Handling Problematic Events in Quasi Real-Time 25m
    
    The ATLAS experiment at CERN will require about 4000 CPUs for the online data acquisition system (DAQ). When the DAQ system experiences software errors, such as event selection algorithm problems, crashes or timeouts, the fault tolerance mechanism routes the corresponding event data to the so called debug stream. During first beam commissioning and early data taking, a large fraction of events is expected to end up in this stream. In order to identify problems with the DAQ as soon as possible and reduce the turn-around time for fixing these problems, it is of prime importance to treat the debug stream. We have adopted a quasi real-time approach. We have developed an automated system that analyzes the contents of the debug stream and provides fine grained error classification. A high percentage of error events is related to online transient problems. Many of those events are recovered by feeding them to an independent system that reruns the trigger software. To be flexible in terms of computing power requirements, we added a layer of abstraction over the computing backend. This gives the possibility of using the Grid as well as dedicated resources. Using cosmic ray runs, we validated the automatic error analysis and recovery procedure.
    
    Speaker: Hegoi Garitaonandia (NIKHEF)
    
    Slides
  - 17:25
    
    Large Scale Job Management and Experience in Recent Data Challenges within the LHC CMS experiment. 25m
    
    From its conception the job management system has been distributed to increase scalability and robustness. The system consists of several applications (called prodagents), each of which manages Monte Carlo, reconstruction and skimming jobs on collections of sites within different Grid environments (OSG, NorduGrid, LCG) and submission systems (GlideIn, local batch, etc.). Production of simulated data in CMS will take place mainly on the resources of the so-called Tier2s (small to medium size computing centers). Approximately 50% of the CMS Tier2 resources are allocated to running simulation jobs. The so-called Tier1s (medium to large size computing centers with high capacity tape storage systems) will be mainly used for skimming and reconstructing detector data. During the last one and a half years the system has also been adapted so that it can be configured to convert Data Acquisition (DAQ) / High Level Trigger (HLT) output from the CMS detector to the CMS data format and to manage the real-time data stream from the experiment. Simultaneously the system has been upgraded to handle the increasing scale of the CMS production and to adapt to the procedures used by its operators. In this paper we discuss the current (high level) architecture of ProdAgent, the experience in using this system in computing challenges, feedback from these challenges, and future work including migration to a set of core libraries to facilitate convergence between the different data management projects within CMS that deal with analysis, simulation, and initial reconstruction of real data. This migration is important as it will decrease the code footprint used by these projects and increase maintainability of the code base.
    
    Speaker: Dr Stuart Wakefield (Imperial College London)
    
    Slides
  - 17:50
    
    MDT data quality assessment at the Calibration centre for the ATLAS experiment at LHC 25m
    
    ATLAS is a large multipurpose detector, presently in the final phase of construction at LHC, the CERN Large Hadron Collider accelerator. In ATLAS the muon detection is performed by a huge magnetic spectrometer, built with the Monitored Drift Tube (MDT) technology. It consists of more than 1,000 chambers and 350,000 drift tubes, which have to be controlled to a spatial accuracy better than 10 micrometers and an efficiency close to 100%. Therefore, automated monitoring of the detector is an essential aspect of the operation of the spectrometer. The quality procedure collects data from online and offline sources and from the "calibration stream" at the calibration centres, situated in Ann Arbor (Michigan), MPI (Munich) and INFN Rome. The assessment at the Calibration Centres is performed using the DQHistogramAnalyzer utility of the Athena package. This application checks the histograms in an automated way and, after a further inspection through a human interface, reports results and summaries. In this study a complete description of the entire chain, from the calibration stream up to the database storage, is presented. Special algorithms have been implemented in the DQHistogramAnalyzer for the Monitored Drift Tube chambers. A detailed web display is provided for easy data quality consultation. The analysis flag is stored inside an Oracle Database using the COOL LCG library, through a C++ object-oriented interface. This quality flag is compared with the online and offline results, produced in a similar way, and the final decision is stored in a DB using a standalone C++ tool. The final DB, which uses the same COOL technology, is accessed by the reconstruction and analysis programs.
    
    Speakers: Dr Elena Solfaroli (INFN RomaI & Universita' di Roma La Sapienza), Dr Monica Verducci (INFN RomaI)
    
    Slides
- 14:00 → 18:15
  Data Analysis - Algorithms and Tools
  - 14:00
    Multi-threaded event processing with JANA 25m
    
    The C++ reconstruction framework JANA has been written to support the next generation of Nuclear Physics experiments at Jefferson Lab in anticipation of the 12 GeV upgrade. This includes the GlueX experiment in the planned 4th experimental hall "Hall-D". The JANA framework was designed to allow multi-threaded event processing with a minimal impact on developers of reconstruction software. As we enter the multi-core era, thread-enabled code will become essential to utilizing the full processor power available without invoking the logistical overhead of managing many individual processes. Event-based reconstruction lends itself naturally to multi-threaded processing. Emphasis will be placed on the multi-threading features of the framework. Test results of the scaling of event processing rates with number of threads will be shown.
    
    Speaker: Dr David Lawrence (Jefferson Lab)
    
    Paper
    
    Slides
    
    http://argus.phys.uregina.ca/gluex/DocDB/0011/001133/002/Lawrence_JANA.pdf
    
    http://argus.phys.uregina.ca/gluex/DocDB/0011/001133/002/Lawrence_JANA.tgz
    
    http://argus.phys.uregina.ca/gluex/DocDB/0011/001133/002/Lawrence_JANA_win.tgz
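
The event-level parallelism described in the JANA abstract above can be sketched as a pool of worker threads that each reconstruct whole, independent events. JANA itself is a C++ framework; the Python below only illustrates the structure, and the event layout and the `reconstruct` function are hypothetical. Note that CPython's global interpreter lock prevents a real speed-up for CPU-bound work, so this shows the pattern rather than the scaling.

```python
from concurrent.futures import ThreadPoolExecutor


def reconstruct(event):
    """Hypothetical per-event reconstruction: sum the hit energies of one event."""
    return {"id": event["id"], "energy": sum(event["hits"])}


def process_events(events, n_threads=4):
    """Reconstruct independent events on a pool of worker threads.

    Each event is handled entirely by one thread, so the reconstruction
    code itself needs no locking. Results come back in input order.
    """
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(reconstruct, events))


events = [{"id": i, "hits": [0.5 * i, 1.0, 2.0]} for i in range(8)]
results = process_events(events)
```

Keeping the unit of work at the level of a full event is what lets reconstruction authors write ordinary sequential code.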
  - 14:25
    
    TMVA - the toolkit for multivariate data analysis 25m
    
    Multivariate data analysis techniques are becoming increasingly important for high energy physics experiments. TMVA is a tool, integrated in the ROOT environment, which allows easy access to sophisticated multivariate classifiers allowing for a widespread use of these very effective data selection techniques. It furthermore provides a number of pre-processing capabilities and numerous additional evaluation features that help in getting the most out of the data. The talk gives an overview of the TMVA package, and presents new features and developments.
    
    Speaker: Joerg Stelzer (DESY)
    
    Slides
  - 14:50
    
    PDE-FOAM - a probability-density estimation method based on self-adapting phase-space binning 25m
    
    Probability-Density Estimation (PDE) is a multivariate discrimination technique based on sampling signal and background densities in a multi-dimensional phase space. The signal and background densities are defined by event samples (from data or Monte Carlo) and are evaluated using a binary search tree (range searching). This method is a powerful classification tool for problems with highly non-linearly correlated observables. In this paper, we present an innovative improvement of the PDE method that uses a self-adapting binning method to divide the multi-dimensional phase space into a finite number of hyper-rectangles (boxes). For a given number of boxes, the binning algorithm adjusts the size and position of the boxes inside the multi-dimensional phase space, minimizing the variance of the signal and background densities inside the boxes. The binned density information is stored in binary trees, allowing for a very fast and memory-efficient classification of events. The implementation of the binning algorithm (PDE-FOAM) is based on the Monte Carlo integration package TFOAM included in the analysis package ROOT and has been developed within the framework of the Toolkit for Multivariate Data Analysis with ROOT (TMVA). We present performance results for representative examples (toy models) and discuss the dependence of the obtained results on the choice of parameters. The new PDE-FOAM is compared to the original PDE method based on range searching.
    
    Speaker: Dr Dominik Dannheim (CERN)
    
    Slides
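
The original range-searching PDE method that the abstract above starts from can be sketched as follows: the local signal and background densities around a test point are estimated by counting sample events inside a box centred on that point. This toy version scans the samples linearly instead of using a binary search tree, and it is not the PDE-FOAM algorithm or the TMVA API; all names and the sample values are illustrative.

```python
def count_in_box(sample, point, half_width):
    """Count events of `sample` inside the axis-aligned box centred on `point`."""
    return sum(
        all(abs(xi - pi) <= half_width for xi, pi in zip(event, point))
        for event in sample
    )


def pde_discriminant(signal, background, point, half_width):
    """Return D = s / (s + b), with s and b the local event fractions of each sample."""
    s = count_in_box(signal, point, half_width) / len(signal)
    b = count_in_box(background, point, half_width) / len(background)
    if s + b == 0.0:
        return 0.5          # no events nearby: no preference either way
    return s / (s + b)


# Toy two-dimensional samples: signal clusters near (0, 0), background near (2, 2).
signal = [(0.0, 0.0), (0.1, 0.1), (0.2, -0.1), (2.0, 2.0)]
background = [(2.0, 2.0), (2.1, 1.9), (1.9, 2.2), (0.0, 0.1)]
d_near_origin = pde_discriminant(signal, background, (0.0, 0.0), 0.5)
```

A value of D close to 1 marks a signal-like region and a value close to 0 a background-like one. The self-adapting binning replaces the per-query counting with precomputed boxes.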
  - 15:15
    
    The Role of Interpreters in High Performance Computing 25m
    
    Compiled code is fast, interpreted code is slow. There is not much we can do about it, and it is the reason why the use of interpreters in high performance computing is usually restricted to job submission. I will show where interpreters make sense even in the context of analysis code, and what aspects have to be taken into account to make this combination a success.
    
    Speaker: Axel Naumann (CERN)
    
    Slides
  - 15:40
    
    Coffee Break 30m
  - 16:10
    
    The ATLAS Conditions Database Model for the Muon Spectrometer 25m
    
    The ATLAS Muon System has started to make extensive use of the LCG conditions database project 'COOL' as the basis for all its conditions data storage, both at CERN and throughout the worldwide collaboration, as decided by the ATLAS Collaboration. The management of the Muon COOL conditions database will be one of the most challenging applications for the Muon System, in terms of data volumes and rates as well as the variety of data stored. The Muon Conditions database is responsible for the storage of almost all of the 'non-event' data and detector quality flags needed for debugging the detector operations and for performing reconstruction and analysis. The COOL database allows database applications to be written independently of the underlying database technology and ensures long-term compatibility with the entire ATLAS software. COOL implements an interval-of-validity database, i.e. objects stored or referenced in COOL have an associated start and end time between which they are valid. The data is stored in folders, which are themselves arranged in a hierarchical structure of foldersets. The structure is simple and mainly optimised to store and retrieve object(s) associated with a particular time. In this work, an overview of the entire Muon Conditions Database architecture is given, including the different sources of the data and the storage model used. In addition, the software interfaces used to access the Conditions Data are described, with more emphasis given to the offline reconstruction framework ATHENA and the services developed to provide the conditions data to the reconstruction.
    
    Speaker: Dr Monica Verducci (INFN RomaI)
    
    Slides
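
The interval-of-validity idea described in the abstract above (each stored object is valid between a start and an end time, and is looked up by time) can be sketched with a small in-memory store. This is an illustration only and not the COOL API; the class name, the half-open interval convention and the payload strings are assumptions.

```python
import bisect


class IovFolder:
    """Toy interval-of-validity folder: each payload is valid for since <= t < until."""

    def __init__(self):
        self._since = []      # sorted start times, used for binary search
        self._entries = []    # (since, until, payload), in the same order

    def store(self, since, until, payload):
        """Insert a payload, rejecting empty or overlapping validity intervals."""
        if since >= until:
            raise ValueError("empty validity interval")
        i = bisect.bisect_left(self._since, since)
        if i > 0 and self._entries[i - 1][1] > since:
            raise ValueError("overlaps the previous interval")
        if i < len(self._entries) and self._entries[i][0] < until:
            raise ValueError("overlaps the next interval")
        self._since.insert(i, since)
        self._entries.insert(i, (since, until, payload))

    def retrieve(self, time):
        """Return the payload valid at `time`, or None if no interval covers it."""
        i = bisect.bisect_right(self._since, time) - 1
        if i >= 0:
            since, until, payload = self._entries[i]
            if since <= time < until:
                return payload
        return None


folder = IovFolder()
folder.store(0, 100, "alignment v1")
folder.store(100, 200, "alignment v2")
at_150 = folder.retrieve(150)
```

Keeping the start times sorted makes retrieval by time a binary search, which matches the access pattern the abstract says the structure is optimised for.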
  - 16:35
    
    PARADIGM, a Decision Making Framework for Variable Selection and Reduction in High Energy Physics 25m
    
    In high energy physics, variable selection and reduction are key to a high quality multivariate analysis. Initial variable selection often leads to a variable set cardinality greater than the underlying degrees of freedom of the model, which motivates the need for variable reduction and, more fundamentally, a consistent decision making framework. Such a framework called PARADIGM, based on a global reduction criterion called the gloss function and relevant to searches for new phenomena in physics, is described. We illustrate the common pitfalls of variable selection and reduction associated with variable interactions and show that PARADIGM gives consistent results in their presence. We discuss PARADIGM's application to several HEP searches for new phenomena and compare the performance of different measures of relative variable importance, in particular with those based on binary regression. Finally we describe a technique called variable amplification that shows how PARADIGM's results lead to improved classification performance.
    
    Speaker: Sergei V. Gleyzer (Florida State University)
    
    Slides
  - 17:00
    
    Track Reconstruction and Muon Identification in the Muon Detector of the CBM Experiment at FAIR 25m
    
    The Compressed Baryonic Matter (CBM) experiment at the future FAIR accelerator at Darmstadt is being designed for a comprehensive measurement of hadron and lepton production in heavy-ion collisions from 8-45 AGeV beam energy, producing events with large track multiplicity and high hit density. The setup consists of several detectors, including the silicon tracking system (STS) placed in a dipole magnet close to the target region, and the MUCH (MUon CHamber) detector. The MUCH detector is aimed at muon identification down to momenta of 1.5 GeV/c. It consists of a sequence of absorber layers and detector stations. The concept for the MUCH detector and the status of the track reconstruction software are presented in this contribution. The reconstruction software is organized to be flexible with respect to feasibility studies of different physics channels and to optimization of the detector itself. The main blocks of the reconstruction package include track finding, fitting and propagation in the material of the detector. The track finding algorithm is based on the track following method with branches, using tracks reconstructed in the STS as initial seeds. The magnetic field is taken into account during extrapolation through the detector. Track propagation in the material includes accurate calculation of multiple scattering and energy losses. The performance of the track propagation is shown to be similar to the results of GEANE. Track parameters and covariance matrices are estimated using the Kalman filter method. In the final selection among competing candidates, tracks with a larger number of hits and a better chi-square value are preferred. The track reconstruction efficiency for muons embedded in central Au+Au collisions at 25 AGeV beam energy from the UrQMD model is at the level of 95%.
    In these collisions, feasibility studies of low mass vector meson measurements in the dimuon channel result in 1.7% total reconstruction efficiency of the omega meson and a signal to background ratio of 0.15. Currently, ongoing work focuses on detector layout studies in order to optimize the detector setup while keeping a high performance.
    
    Speaker: Mr Andrey Lebedev (GSI, Darmstadt / JINR, Dubna)
    
    Slides
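
The Kalman filter estimation mentioned in the abstract above alternates a prediction step, where the uncertainty grows (for instance from multiple scattering in an absorber), with an update step that folds in the next measurement. The sketch below is a deliberately simplified scalar version with identity transport between stations; it is not the CBM reconstruction code, and all names and numbers are illustrative.

```python
def predict(x, P, q):
    """Propagate to the next station: state unchanged, variance grows by process noise q."""
    return x, P + q


def update(x, P, z, R):
    """Combine the predicted state (x, P) with a measurement z of variance R."""
    K = P / (P + R)                 # Kalman gain: weight given to the measurement
    x_new = x + K * (z - x)         # move the estimate towards the measurement
    P_new = (1.0 - K) * P           # the variance always shrinks after an update
    return x_new, P_new


def filter_track(measurements, x0, P0, q, R):
    """Run predict/update over a sequence of hits, starting from the seed (x0, P0)."""
    x, P = x0, P0
    for z in measurements:
        x, P = predict(x, P, q)
        x, P = update(x, P, z, R)
    return x, P


# One hit at 2.0 with unit variance, starting from a seed at 0.0.
x_fit, P_fit = filter_track([2.0], x0=0.0, P0=0.5, q=0.5, R=1.0)
```

In a real tracker the state is a vector of track parameters, `P` is a covariance matrix, and the prediction step extrapolates through the magnetic field and material.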
  - 17:25
    
    Fireworks: A Physics Event Display for CMS 25m
    
    Event displays in HEP are used for many different purposes, e.g. algorithm debugging, commissioning, geometry checking and physics studies. The physics studies case is unique for three reasons: few users are likely to become experts on the event display; the breadth of information all such users will want to see is quite large, although any one user may only want a small subset of it; and the best way to display physics information sometimes requires a stylized rather than a 3D-accurate representation. Fireworks is a CMS event display which is specialized for the physics studies case. Fireworks provides an easy-to-use interface which allows physicists to concentrate only on the data in which they are interested. Data is presented via graphical and textual views. Cross-view data interpretation is easy since the same object is shown using the same color in all views, and if the object is selected it is highlighted in all views. Objects which have been selected can be further studied by displaying a detailed view of just that object. Physicists can select which events (e.g. require a high energy muon), what data (e.g. which track list) and which items in a collection (e.g. only high-pt tracks) to show. Once physicists have configured Fireworks to their liking, they can save the configuration. Fireworks is built using the Eve subsystem of the CERN ROOT project and CMS's FWLite project. The FWLite project was part of CMS's new Event Data Model and recent code redesign, which separates data classes into libraries separate from the algorithms producing the data and uses ROOT directly for C++ object storage, thereby allowing the data classes to be used directly in ROOT. The Fireworks project released its first Linux and Mac versions this summer and has received much positive feedback.
    
    Speaker: Dr Christopher Jones (CORNELL UNIVERSITY)
    
    Slides
  - 17:50
    
    An overview of the b-Tagging algorithms in the CMS Offline software 25m
    
    The CMS Offline software contains a wide set of algorithms to identify jets originating from the weak decay of b-quarks. Different physical properties of b-hadron decays, such as lifetime information, secondary vertices and soft leptons, are exploited. The variety of selection algorithms ranges from simple and robust ones, suitable for early data-taking and online environments such as the trigger system, to highly discriminating ones, exploiting all the information available. For the latter, a generic discriminator computing framework has been developed that allows the full power of multivariate analysis techniques to be exploited in a flexible way.
    
    Speaker: Mr Christophe Saout (CMS, CERN & IEKP, University of Karlsruhe)
    
    Slides
- 14:00 → 15:40
  Methodology of Computations in Theoretical Physics - Session 1
  - 14:00
    
    Unitarity Methods For 1-Loop Amplitudes 25m
    
    Unitarity methods provide an efficient way of calculating 1-loop amplitudes for which Feynman diagram techniques are impracticable. Recently several approaches have been developed that apply these techniques to systematically generate amplitudes. The 'canonical basis' implementation of the unitarity method will be discussed in detail and illustrated using seven point QCD processes.
    
    Speaker: Warren Perkins (Swansea University UK)
    
    Slides
  - 14:25
    
    From moments to functions in higher order QCD 25m
    
    We present a method to unfold the complete functional dependence of single-scale quantities, such as QCD splitting functions and Wilson coefficients, from a finite number of moments. These quantities obey recursion relations which can be found in an automated way. The exact functional form is obtained by solving the corresponding difference equations. We apply the algorithm to the QCD Wilson coefficients for deep-inelastic scattering and to the splitting functions to 3-loop order, which are associated with difference equations of rather high order and degree.
    
    Speaker: Johannes Bluemlein (DESY)
    
    Slides
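
As a toy illustration of the first step described in the abstract above (finding a recursion relation from a finite number of moments), the sketch below fits a linear recurrence with polynomial coefficients to the first few values of a sequence using exact rational arithmetic. The input sequence (the harmonic sum S1(N)), the function names and the ansatz (order 2, coefficients of degree 1 in N) are assumptions chosen here for illustration. This is not the speaker's algorithm, and the second step, solving the resulting difference equation, is not attempted.

```python
from fractions import Fraction


def nullspace_vector(rows):
    """Return one exact non-zero solution v of rows . v = 0, or None."""
    matrix = [list(row) for row in rows]
    n_cols = len(matrix[0])
    pivot_cols = []
    r = 0
    # Gauss-Jordan elimination to reduced row echelon form.
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(matrix)) if matrix[i][c] != 0), None)
        if pivot is None:
            continue
        matrix[r], matrix[pivot] = matrix[pivot], matrix[r]
        scale = matrix[r][c]
        matrix[r] = [x / scale for x in matrix[r]]
        for i in range(len(matrix)):
            if i != r and matrix[i][c] != 0:
                factor = matrix[i][c]
                matrix[i] = [a - factor * b for a, b in zip(matrix[i], matrix[r])]
        pivot_cols.append(c)
        r += 1
        if r == len(matrix):
            break
    free_cols = [c for c in range(n_cols) if c not in pivot_cols]
    if not free_cols:
        return None
    # Set the last free unknown to 1 and read off the pivot unknowns.
    free = free_cols[-1]
    solution = [Fraction(0)] * n_cols
    solution[free] = Fraction(1)
    for row_index, c in enumerate(pivot_cols):
        solution[c] = -matrix[row_index][free]
    return solution


def find_recurrence(values, order, degree, start=1):
    """Fit sum_k sum_d c[k][d] * N**d * a(N + k) = 0 to values[i] = a(start + i).

    The result is flattened as c[k * (degree + 1) + d].
    """
    rows = []
    for offset in range(len(values) - order):
        n = start + offset
        rows.append([Fraction(n ** d) * values[offset + k]
                     for k in range(order + 1)
                     for d in range(degree + 1)])
    return nullspace_vector(rows)


# Illustrative input: the first ten harmonic sums S1(N) = 1 + 1/2 + ... + 1/N.
harmonic = []
running = Fraction(0)
for n in range(1, 11):
    running += Fraction(1, n)
    harmonic.append(running)

coefficients = find_recurrence(harmonic, order=2, degree=1)
```

Read in pairs (constant term, coefficient of N) for a(N), a(N+1) and a(N+2), the result corresponds to (N+1) a(N) - (2N+3) a(N+1) + (N+2) a(N+2) = 0, which the harmonic sums satisfy.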
  - 14:50
    
    Computational aspects for three-loop DIS calculations 25m
    
    will be sent later
    
    Speaker: Dr Mikhail Rogal (DESY)
    
    Slides
  - 15:15
    
    Recent developments of GRACE 25m
    
    The automatic Feynman-amplitude calculation system GRACE has been extended to treat next-to-leading order (NLO) QCD calculations. Matrix elements of loop diagrams as well as tree-level ones can be generated using the GRACE system. Soft/collinear singularities are treated using a leading-log subtraction method. In this method, higher-order resummation of the soft/collinear corrections by the parton shower method is combined with the NLO matrix element without any double counting. Event generators created using the GRACE system are implemented in GR@PPA, an event-generator framework for high-energy hadron-collision interactions.
    
    Speaker: Dr Yoshimasa KURIHARA (KEK)
    
    Slides
- 15:40 → 16:10
  
  Coffee Break 30m
- 16:10 → 18:15
  Methodology of Computations in Theoretical Physics - Session 2
  - 16:10
    
    Recent Progress of Geant4 Electromagnetic Physics and Readiness for the LHC Start 25m
    
    The status of Geant4 electromagnetic (EM) physics models is presented, focusing on the models most relevant for collider HEP experiments, at the LHC in particular. Recent improvements were made to the models for the transport of electrons and positrons, and of hadrons. The revised models include those for single and multiple scattering, ionization at low and high energies, bremsstrahlung, annihilation, scintillation and Cerenkov radiation. Validation has been performed against experimental data. Typical results of comparisons are shown. There was a significant update of the bremsstrahlung models. This introduced a new description of the relativistic regime for electrons and positrons, which describes precisely the recent LPM experiment at CERN. New models for bremsstrahlung and electron-positron pair production by hadrons were introduced. A significant effect due to the bremsstrahlung of pions is observed, affecting the signal in the EM calorimeters of LHC detectors. With a focus on the LHC start-up, we discuss performance versus precision of different configurations of EM physics.
    
    Speaker: Prof. Vladimir Ivantchenko (CERN, ESA)
    
    Slides
  - 16:35
    
    Standard SANC Modules 25m
    
    Two types of SANC system output are presented. First, the status of the stand-alone packages for calculations of the EW and QCD NLO RC at the parton level (Standard SANC FORM and/or FORTRAN Modules) is reported. A short overview of these packages is given for the Neutral Current sector, (uu, dd) -> (mu,mu, ee) and ee(uu, dd) -> HZ, and for the Charged Current sector, ee(uu, dd) -> (mu nu_mu, e nu_e). In addition, the second type of SANC output, an MC event generator for the production of event distributions at the hadronic level based on the FOAM algorithm, is demonstrated.
    
    Speaker: Vladimir Kolesnikov (Joint Institute for Nuclear Research (JINR))
    
    Slides
  - 17:00
    
    Mathematical model of magnetically interacting rigid bodies 25m
    
    The dynamics of two bodies that interact through magnetic forces is considered. The interaction model is built on the quasi-stationary approach to the electromagnetic field, and the bodies are treated as symmetric rotors with different moments of inertia. The general form of the interaction energy is found for the case where the mass and magnetic symmetries coincide. Since the interaction energy depends only on the relative position of the bodies, the treatment is greatly simplified in the c.m. system, even though the force is non-central. The task requires developing the classical Hamiltonian formalism for systems of magnetically interacting bodies, including systems of magnets and/or superconducting magnets (mixed systems). The Hamiltonian equations of motion are obtained on the basis of the Poisson structure in the space of dynamical variables. This approach allows the equations to be written in Galilean-invariant vector form, in contrast to the usual formulation in Euler angles. The invariance laws that follow from the symmetry of the system are considered. This variant of the Hamiltonian formalism extends easily to the case of an arbitrary number of magnetically interacting symmetric rotors. All equations with Poisson brackets are tested with the symbolic features of the Maple system. For the numerical modelling of the dynamics of magnetic rigid bodies, the Maple and MATLAB packages are used. The resulting mathematical model makes it possible to investigate the possibility of orbital motion in a system of magnetically interacting bodies.
    
    Speaker: Dr Stanislav Zub (National Science Center, Kharkov Institute of Physics and Techn)
    
    Slides
  - 17:25
    
    50m
Tuesday 4 November
- 09:00 → 10:20
  Tuesday, 04 November 2008 - Morning session 1
  - 09:00
    
    Java based software for High-Energy and Astro-physics 40m
    
    This talk will give a brief overview of the features of Java which make it well suited for use in High-Energy and Astro-physics, including recent enhancements such as the addition of parameterized types and advanced concurrency utilities, and its release as an open-source (GPL) product. I will discuss the current status of a number of Java based tools for High-Energy and Astro-physics including JAS (GUI based analysis tool), WIRED (event display), AIDA (analysis toolkit). I will give examples of their use for building web-based and GUI based applications citing examples from GLAST (recently renamed the Fermi Gamma-Ray Space Telescope) and linear-collider detector R&D. I will also discuss the methodologies employed in developing such toolkits, challenges involved in supporting them, and lessons that can be learned for the future.
    
    Speaker: Tony Johnson (SLAC)
    
    Slides
  - 09:40
    
    Introduction to The LLVM Compiler System 40m
    
    This talk gives a high level introduction to the LLVM Compiler System (http://llvm.org/), which supports high performance compilation of C and C++ code, as well as adaptive runtime optimization and code generation. Using LLVM as a drop-in replacement for GCC offers several advantages, such as being able to optimize across files in your application, producing better generated code performance, and doing so with reduced compile times.
    
    Speaker: Mr Chris Lattner
    
    Slides
- 10:20 → 10:40
  
  Coffee Break 20m
- 10:40 → 12:00
  Tuesday, 04 November 2008 - Morning session 2
  - 10:40
    
    Data Analysis with PROOF 40m
    
    In this talk we describe the latest developments in the PROOF system. PROOF is the parallel extension of ROOT and allows large datasets to be processed in parallel on large clusters and/or multi-core machines. The recent developments have focused on readying PROOF for the imminent data analysis tasks of the LHC experiments. The main improvements have been made in the areas of overall robustness and fault tolerance, multi-user scheduling, generic processing (e.g. Monte Carlo simulations), and optimizations for many-core architectures. ALICE is deploying PROOF for prompt reconstruction and analysis on the CERN Analysis Facility (CAF) and ATLAS is focusing on Tier-3 deployment.
    
    Speaker: Dr Gerardo Ganis (CERN)
    
    Slides
  - 11:20
    
    MonALISA : A Distributed Service System for Monitoring, Control and Global Optimization 40m
    
    The MonALISA (Monitoring Agents in A Large Integrated Services Architecture) framework provides a set of distributed services for monitoring, control, management and global optimization for large scale distributed systems. It is based on an ensemble of autonomous, multi-threaded, agent-based subsystems which are registered as dynamic services. They can be automatically discovered and used by other services or clients. The distributed agents can collaborate and cooperate in performing a wide range of management, control and global optimization tasks using real time monitoring information. An essential part of managing global-scale systems is a monitoring system that is able to monitor and track in real time many site facilities, networks, and tasks in progress. The monitoring information gathered is essential for developing the required higher level services, the components that provide decision support and some degree of automated decisions and for maintaining and optimizing workflow in large scale distributed systems. These management and global optimization functions are performed by higher level agent-based services. Current applications of MonALISA’s higher level services include optimized dynamic routing, control and optimization for large scale data transfers on dedicated circuits, data transfers scheduling, distributed job scheduling and automated management of remote services among a large set of grid facilities. MonALISA is currently used around the clock in several major projects and has proven to be both highly scalable and reliable. More than 320 services are running at sites around the world, collecting information about computing facilities, local and wide area network traffic, and the state and progress of the many thousands of concurrently running jobs.
    
    Speaker: Iosif Legrand (CALTECH)
    
    Slides
- 14:00 → 15:40
  Computing Technology for Physics Research - Session 1
  - 14:00
    
    Profiling Post-GRID analysis 25m
    
    An impressive amount of effort has been put into realizing a set of frameworks to support analysis in this new paradigm of GRID computing. However, much more than half of a physicist's time is typically spent after the GRID processing of the data. Due to the private nature of this level of analysis, there has been little common framework or methodology. While most physicists agree to use ROOT as the basis of their analysis, a number of approaches are possible for the implementation of the analysis using ROOT: conventional methods using CINT/ACLiC, development using g++, an alternative interface through Python, and parallel processing methods such as PROOF are some of the choices currently available on the market. Furthermore, in the ATLAS collaboration an additional layer of technology adds to the complexity because the data format is based on the POOL technology, which tends to be less portable. In this study, various modes of ROOT analysis are profiled for comparison, with the main focus on processing speed. The input data is derived from the ATLAS Full-Dress-Rehearsal, which was meant to stress test the whole computing system of ATLAS.
    
    Speaker: Dr Akira Shibata (New York University)
  - 14:25
    
    Distributed Computing in ATLAS 25m
    
    The LHC machine has just started operations. Very soon, petabytes of data from the ATLAS detector will need to be processed, distributed worldwide, re-processed and finally analyzed. This data-intensive physics analysis chain relies on a fabric of computer centers on three different sub-grids: the Open Science Grid, the LHC Computing Grid and the Nordugrid Data Facility, all part of the Worldwide LHC Computing Grid (wLCG). This fabric is arranged in a hierarchy of computing centers from Tier0 to Tier3. The role of the Tier0 center is to perform prompt reconstruction of the raw data coming from the on-line data acquisition system, and to distribute raw and reconstructed data to the associated Tier1 centers. The Tier1 centers mainly do raw data reprocessing after updated software releases and calibration constants are ready. The Tier2 centers have two major roles: simulation and physics analysis. This talk will describe the software components of the ATLAS data chain and the flow of data from the Tier0 center at CERN to the distributed Tier1, Tier2 and Tier3 centers. Five major components will be discussed: the ATLAS Distributed Data Management system, which is responsible for all data movement and registration in ATLAS; the Storage Resource Management system for dealing with heterogeneous local storage systems; the PanDA pilot-based system used to run managed production for both simulated data and real data re-processing; the detailed monitoring system (ARDA dashboard monitoring system), which allows us to debug problems; and finally the systems which allow distributed physics analysis, called GANGA and pAthena.
    
    Speaker: Guido Negri (Unknown)
    
    Slides
  - 14:50
    
    The CMS Tier 0 25m
    
    The CMS Tier 0 is responsible for handling the data in the first period of its life, from the moment it is written to a disk buffer at the CMS experiment site in Cessy by the DAQ system to the time its transfer from CERN to one of the Tier 1 computing centres completes. It contains all automatic data movement, archival and processing tasks run at CERN. This includes the bulk transfers of data from Cessy to a Castor disk pool at CERN, repacking the data into Primary Datasets, storage to tape and export to the Tier 1 centres. It also includes a first reconstruction pass over all data and the tape archival and export to the Tier 1 centres of the reconstructed data. While performing these tasks, the Tier 0 has to maintain redundant copies of the data and flush it through the system within a narrow time window to avoid data loss. With data taking being imminent, this aspect of the CMS computing effort becomes of the utmost importance. We discuss and explain here the work on developing and commissioning the CMS Tier 0 undertaken over the last year.
    
    Speaker: Ian Fisk (Fermi National Accelerator Laboratory (FNAL))
  - 15:15
    
    Data transfer infrastructure for CMS data taking 25m
    
    The CMS PhEDEx (Physics Experiment Data Export) project is responsible for facilitating large-scale data transfers across the grid, ensuring transfer reliability, enforcing data placement policy, and accurately reporting results and performance statistics. The system has evolved considerably since its creation in 2004, and has been used daily by CMS since then. Currently CMS tracks over 2 PB of data in PhEDEx, and it has been tested well beyond the requirements of CMS. Over the past year PhEDEx has evolved considerably, making use of new technologies (chiefly POE, an asynchronous, event-driven, cooperative-multitasking framework) and consolidating the various components so that it is easy to reuse existing techniques and components in new features. This has resulted in changes to nearly every piece of the PhEDEx code base, creating a flexible modular framework. We are able to evolve the implementation to match changes in the requirements of the experiment, without changing the fundamental design. Two major new features have recently been added to the PhEDEx system: an extensible data service and an improved transfer back-end module. The extensible data service provides machine-readable data over HTTP as the primary means of integration with other CMS services. An authenticated command line interface is also provided, making it possible to provide new utilities quickly with minimal development effort. The new transfer back-end module now integrates closely with FTS, the gLite-provided transfer tool, to provide accurate status information while keeping as much data in flight as possible. The new transfer back-end is independent of the transfer technology, and we expect to be able to support new transfer tools as they become available. We describe the CMS PhEDEx system that is in place for CMS "first data taking" in 2008, provide details on the benefits and implementations of the new features, and describe other new tools that are now available.
    
    Speaker: Mr Ricky Egeland (University of Minnesota – Twin Cities, Minneapolis, MN, USA)
    
    Poster
- 14:00 → 18:15
  Data Analysis - Algorithms and Tools
  - 14:00
    
    VISPA: a Novel Concept for Visual Physics Analysis 25m
    
    VISPA is a novel graphical development environment for physics analysis, following an experiment-independent approach. It introduces a new way of steering a physics data analysis, combining graphical and textual programming. The purpose is to speed up the design of an analysis and to facilitate its control. The software basis for VISPA is the C++ toolkit Physics eXtension Library (PXL), a successor project of the Physics Analysis eXpert (PAX) package. The most prominent features of this toolkit are the management of relations, a copyable container holding different aspects of physics events, the ability to store arbitrary user data, and fast I/O. In order to support modular physics analysis, VISPA provides a module handling system using the above-mentioned event container as the interface. Several analysis modules are provided, e.g. a module for automated reconstruction of particle cascades. All modules can be steered through Python scripts. Physicists can easily add their own modules to the module handling system or extend the existing ones. The concept of VISPA will be presented. Some application examples for different physics analyses will be shown.
    
    Speaker: Tatsiana Klimkovich (RWTH-Aachen)
    
    Slides
  - 14:25
    
    Petaminer: Efficient Navigation to Petascale Data Using Event-Level Metadata 25m
    
    HEP experiments at the LHC store petabytes of data in ROOT files described with TAG metadata. The LHC experiments have challenging goals for efficient access to this data. Physicists need to be able to compose a metadata query and rapidly retrieve the set of matching events. Such skimming operations will be the first step in the analysis of LHC data, and improved efficiency will facilitate the discovery process by permitting rapid iterations of data evaluation and retrieval. Furthermore, efficient selection of LHC data helps enable the tiered data distribution system adopted by LHC experiments, in which massive raw data resides at a few central sites, while higher quality, smaller scale skimmed data is replicated at many lower tier sites with more modest computational resources. To address this problem, we are developing a custom MySQL storage engine to enable the MySQL query processor to directly access TAG data stored in ROOT TTrees. As ROOT TTrees are column-oriented, reading them directly will provide improved performance over traditional row-oriented TAG databases. In addition to the efficient SQL query interface to the data stored in ROOT TTrees, the Petaminer technology will enable rich MySQL index-building capabilities to add indices to the data in ROOT TTrees, providing further optimization of TAG query performance. Column-oriented databases are an emerging technique for achieving higher performance than traditional row-oriented databases, especially in large scale data-mining scenarios. We will present first results of our feasibility studies of creating a column-oriented MySQL storage engine that enables MySQL to access TAG metadata directly from ROOT files.
    
    Speaker: Alexandre Vaniachine (Argonne National Laboratory)
    
    Slides
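
The row-oriented versus column-oriented distinction drawn in the abstract above can be sketched in a few lines. This is only an illustration of the storage layouts, not of the MySQL storage engine or of ROOT TTrees; the attribute names (`n_muons`, `missing_et`, `n_jets`), the values and the selection cuts are hypothetical.

```python
# Hypothetical event-level metadata in row-oriented form: one record per event.
row_store = [
    {"event": 1, "n_muons": 0, "missing_et": 12.5, "n_jets": 3},
    {"event": 2, "n_muons": 2, "missing_et": 60.0, "n_jets": 1},
    {"event": 3, "n_muons": 1, "missing_et": 75.0, "n_jets": 4},
    {"event": 4, "n_muons": 2, "missing_et": 30.0, "n_jets": 2},
    {"event": 5, "n_muons": 3, "missing_et": 55.5, "n_jets": 0},
]


def to_columns(rows):
    """Convert the records to column-oriented form: one list per attribute."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def select_rows(rows, min_muons, min_met):
    """Row-oriented scan: every full record is visited, unused fields included."""
    return [row["event"] for row in rows
            if row["n_muons"] >= min_muons and row["missing_et"] > min_met]


def select_columns(columns, min_muons, min_met):
    """Column-oriented scan: only the two columns used in the cut are read."""
    matches = [i for i, (mu, met)
               in enumerate(zip(columns["n_muons"], columns["missing_et"]))
               if mu >= min_muons and met > min_met]
    events = columns["event"]
    return [events[i] for i in matches]


column_store = to_columns(row_store)
selected = select_columns(column_store, min_muons=2, min_met=50.0)
```

Both scans return the same events; the column-oriented one never touches `n_jets`, which is the kind of saving that grows with the number of attributes per event.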
  - 14:50
    
    Interactive Data Analysis with PROOF, Experience at GSI 25m
    
    This presentation discusses activities at GSI to support interactive data analysis for the LHC experiment ALICE. GSI is a tier-2 centre for ALICE. One focus is a setup where it is possible to dynamically switch the resources between jobs from the Grid, jobs from the local batch system and the GSI Analysis Facility (GSIAF), a PROOF farm for fast interactive analysis. The second emphasis is on creating PROOF clusters on demand, either on a batch farm or on the Grid. Our experience with PROOF has also allowed us to develop some additional features that simplify PROOF analysis for users and make it even more interactive.
    
    Speaker: Anna Kreshuk (GSI)
    
    Slides
  - 15:15
    
    C++ and Data 25m
    
    High performance computing with a large code base and C++ has proved to be a good combination. But when it comes to storing data, C++ is a really bad choice: it offers no support for serialization, type definitions are amazingly complex to parse, and the dependency analysis (what does object A need to be stored?) is incredibly difficult. Nevertheless, the LHC data consists of C++ objects that are serialized with help from ROOT's interpreter CINT. The fact that we can do it on that scale, and the performance with which we do it makes this approach unique and stirs interest even outside HEP. I will show how CINT collects and stores information about C++ types, what the current major challenges are (dictionary size!), and what CINT and ROOT have done and plan to do about it.
    
    Speaker: Axel Naumann (CERN)
    
    Slides
  - 15:40
    
    Coffee Break 30m
  - 16:10
    
    Alignment of the ATLAS Inner Detector tracking system 25m
    
    CERN's Large Hadron Collider (LHC) is the world's largest particle accelerator. It will collide two proton beams at an unprecedented center-of-mass energy of 14 TeV, and first colliding beams are expected during summer 2008. ATLAS is one of the two general purpose experiments that will record the decay products of the proton-proton collisions. ATLAS is equipped with a charged-particle tracking system built on two technologies, silicon and drift tube based detectors, which together compose the ATLAS Inner Detector (ID). The performance of the Inner Detector has to be optimized in order to achieve the ATLAS physics goals. The alignment of the tracking system poses a challenge, as one has to solve a linear system with almost 36000 degrees of freedom. The required precision for the alignment of the most sensitive coordinates of the silicon sensors is just a few microns. This limit comes from the requirement that the misalignment should not worsen the resolution of the track parameter measurements by more than 10%. Therefore the alignment of the ATLAS ID requires complex algorithms with extensive CPU and memory usage. So far the proposed alignment algorithms have been exercised on several applications. We will present the outline of the alignment approach and results from a Combined Test Beam, Cosmic Ray runs and a large scale computing simulation of physics samples mimicking the ATLAS operation during real data taking. For the latter application the trigger of the experiment is simulated and the event filter is applied in order to produce an alignment input data stream. The full alignment chain is tested using that stream, and alignment constants are produced and validated within 24 hours. Cosmic ray data serves to produce an early alignment of the real ATLAS Inner Detector even before the LHC start-up. Beyond the tracking information, the assembly survey database contains essential information for determining the relative position of one module with respect to its neighbors. Finally, a hardware system that measures an array of grid lines in the module support structure with a Frequency Scan Interferometer monitors short-timescale deformations of the system.
    
    Speaker: Mr John Alison (Department of Physics and Astronomy, University of Pennsylvania)
    
    Slides
  - 16:35
    
    Recent Improvements of the ROOT Fitting and Minimization Classes 25m
    
    Advanced mathematical and statistical computational methods are required by the LHC experiments for analyzing their data. Some of these methods are provided by the Math work package of the ROOT project, a C++ object-oriented framework for large scale data handling applications. We present in detail the recent developments of this work package, in particular the recent improvements in the fitting and minimization classes, which have been re-designed and re-implemented with an object-oriented approach. New minimization algorithms have been added recently to ROOT and they can be used consistently for fitting via a common interface. These algorithms include Minuit2 (the new OO version of Minuit), various minimization methods from the GNU Scientific Library, and stochastic and genetic algorithms. Furthermore, a new graphical user interface has also been developed for performing and monitoring fits on ROOT data objects such as histograms, graphs and trees in one or multiple dimensions. We will describe in detail the new capabilities provided by the new fitting and minimization classes and the functionality of the new user interface.
    
    Speaker: Dr Lorenzo Moneta (CERN)
    
    Slides
  - 17:00
    
    Efficient Level 2 Trigger System Based on Artificial Neural Networks 25m
    
    The HESS project is a major international experiment currently performed in gamma astronomy. This project relies on a system of four Cherenkov telescopes enabling the observation of cosmic gamma rays. The outstanding performance obtained so far in the HESS experiment has led the research labs involved in this project to improve the existing system: an additional telescope is currently being built and will soon be added to the existing telescope system. This telescope is designed to be more sensitive to low energy particles than the others, leading to an increase in the number of collected particle images. In this context, which is tightly constrained in terms of latency, physicists have been compelled to design an additional L2 Trigger in order to deal with a huge amount of data. This trigger aims at selecting images of interest (i.e. gamma particles) and rejecting all other events, which are associated with noise. In contrast to classical methods that consist of strong cuts based on Hillas parameters, we propose an original approach based on artificial neural networks. In this approach, collected events are first handled by a pre-processing level whose purpose is to apply transformations to incoming images, thus reducing the dimensionality of the problem. It is based on the computation of Zernike moments, which aims to extract the main features of the images and guarantee image invariance under translation and rotation. Zernike moments have also proved to be reliable in terms of their feature representation capability and low noise sensitivity. In a second step, an artificial neural network classifies the events into two classes (gammas and hadrons), indicating whether to keep the image for future processing or to reject it. In this presentation, we will describe the entire L2 Trigger system and provide some results in terms of classification performance. We will discuss the contribution of neural networks in this type of experiment compared to classical solutions.
    
    Speaker: Ms Sonia Khatchadourian (ETIS - UMR CNRS 8051)
    
    Slides
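
The rotation invariance of Zernike moment magnitudes mentioned in the abstract above can be checked on a toy image. The sketch below uses the textbook definition of the Zernike radial polynomials on the unit disk with a plain pixel sum; the 8x8 test image, the function names and the chosen (n, m) orders are assumptions made here, and the translation step (centring the image on its centroid) and the neural network are not shown. It is not the HESS trigger implementation.

```python
import cmath
import math


def radial_polynomial(n, m, rho):
    """Zernike radial polynomial R_nm(rho); requires n - |m| even and |m| <= n."""
    m = abs(m)
    if m > n or (n - m) % 2:
        raise ValueError("need |m| <= n and n - |m| even")
    total = 0.0
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * math.factorial(n - s)
                 / (math.factorial(s)
                    * math.factorial((n + m) // 2 - s)
                    * math.factorial((n - m) // 2 - s)))
        total += coeff * rho ** (n - 2 * s)
    return total


def zernike_moment(image, n, m):
    """Discrete Zernike moment of a square image mapped onto the unit disk."""
    size = len(image)
    moment = 0j
    for i, row in enumerate(image):
        y = (2 * i + 1 - size) / size          # pixel-centre coordinates in [-1, 1]
        for j, value in enumerate(row):
            x = (2 * j + 1 - size) / size
            rho = math.hypot(x, y)
            if rho > 1.0 or value == 0:
                continue                       # outside the disk or empty pixel
            theta = math.atan2(y, x)
            moment += value * radial_polynomial(n, m, rho) * cmath.exp(-1j * m * theta)
    pixel_area = (2.0 / size) ** 2
    return (n + 1) / math.pi * moment * pixel_area


def rotate_90(image):
    """Rotate a square image by 90 degrees (exact on the pixel grid)."""
    return [list(row) for row in zip(*image[::-1])]


def features(image, orders=((0, 0), (1, 1), (2, 0), (2, 2), (3, 1), (3, 3))):
    """Rotation-invariant feature vector: magnitudes of selected moments."""
    return [abs(zernike_moment(image, n, m)) for n, m in orders]


# Toy 8x8 "camera image": a short horizontal streak plus one brighter pixel.
image = [[0] * 8 for _ in range(8)]
for col in range(2, 6):
    image[3][col] = 1
image[4][4] = 2

original_features = features(image)
rotated_features = features(rotate_90(image))
```

The two feature vectors agree to floating-point precision, while the complex moments themselves differ by a phase; it is this reduction of an image to a handful of invariant numbers that makes the moments a compact classifier input.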
  - 17:25
    
    Sophisticated algorithms of analysis of spectroscopic data 25m
    
    The accuracy and reliability of the analysis of spectroscopic data depend critically on the treatment applied in order to resolve strong peak overlaps, to account for continuum background contributions, and to distinguish artifacts in the responses of some detector types. Analysis of spectroscopic data can be divided into (1) estimation of peak positions (peak searching) and (2) fitting of peak regions. One of the most delicate problems of any spectrometric method is the extraction of the correct information from those sections of the spectra where, due to the limited resolution of the equipment, the peaks, as the main carriers of spectrometric information, overlap. Conventional methods of peak searching, usually based on spectrum convolution, are inefficient and fail to separate overlapping peaks. Deconvolution methods can be successfully applied to the determination of positions and intensities of peaks and to the decomposition of multiplets. Several deconvolution algorithms are studied and their efficiencies compared in the contribution. However, before applying the deconvolution operation we need to remove the background from the spectroscopic data. One of the basic problems in the analysis of spectra is the separation of the useful information contained in peaks from the useless information (background, noise). In order to process data from numerous analyses efficiently and reproducibly, the background approximation must be, as far as possible, free of user-adjustable parameters. Baseline removal, as the first preprocessing step for spectrometric data, critically influences subsequent analysis steps. The more accurately the background is estimated, the more precisely we can establish the existence of peaks. In the contribution we present a new algorithm to determine peak regions and separate them from peak-free regions. This in turn allows us to propose a new baseline estimation method based on sensitive non-linear iterative peak clipping with automatic local adjustment of the width of the clipping window. Moreover, the automatic setting of peak regions can be used to confine the fitting intervals and to fit each region separately.
    
    Speaker: Miroslav Morhac (Institute of Physics, Slovak Academy of Sciences)
    
    Slides
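
The iterative peak clipping idea named in the abstract above can be sketched as follows. This is the basic fixed-window variant only: the clipping half-width grows from 1 up to a single user-chosen maximum, whereas the contribution describes an automatic local adjustment of the window width, which is not reproduced here. The function name, the synthetic spectrum and the parameter values are assumptions made for illustration.

```python
import math


def snip_baseline(spectrum, max_half_width):
    """Estimate a smooth background by iterative peak clipping.

    For each half-width p = 1 .. max_half_width, every channel is replaced by
    the mean of its neighbours at distance p whenever that mean is lower.
    """
    baseline = list(spectrum)
    n = len(baseline)
    for p in range(1, max_half_width + 1):
        clipped = baseline[:]
        for i in range(p, n - p):
            neighbour_mean = 0.5 * (baseline[i - p] + baseline[i + p])
            if neighbour_mean < baseline[i]:
                clipped[i] = neighbour_mean
        baseline = clipped
    return baseline


# Synthetic spectrum: a linear background plus one Gaussian peak (sigma = 3 channels).
background = [10.0 + 0.1 * i for i in range(100)]
spectrum = [b + 100.0 * math.exp(-0.5 * ((i - 50) / 3.0) ** 2)
            for i, b in enumerate(background)]

# The maximum half-width should exceed the half-width of the widest peak.
baseline = snip_baseline(spectrum, max_half_width=12)
net_peak = [s - b for s, b in zip(spectrum, baseline)]
```

In this fixed-window form, the maximum half-width is the one user-adjustable parameter: too small and peak tails leak into the baseline, too large and genuine background structure is clipped away.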
  - 17:50
    
    25m
- 14:00 → 15:40
  Methodology of Computations in Theoretical Physics - Session 1
  - 14:00
    
    Tools for systematic event generator tuning and validation 25m
    
    Event generator programs are a ubiquitous feature of modern particle physics, since the ability to produce exclusive, unweighted simulations of high-energy events is necessary for design of detectors, analysis methods and understanding of SM backgrounds. However --- particularly in the non-perturbative areas of physics simulated by shower+hadronisation event generators --- there are many parameters which must be tuned to experimental data for useful predictions to be obtained. Attempting to globally tune these parameters to a wide range of experimental results is a task much better suited to systematic, computer-based optimisation than the traditional "tweaking by eye" approach. I will present the current status of the Rivet+Professor tuning/validation system, with emphasis on recent tunes of Pythia 6 to event shape, hadron multiplicity and underlying event data from LEP to the Tevatron.
    
    Speaker: Dr Andy Buckley (Durham University)
    
    Slides
  - 14:25
    
    The Monte Carlo generators in CMS 25m
    
    The CMS collaboration supports a wide spectrum of Monte Carlo generator packages in its official production, each of them requiring a dedicated software integration and physics validation effort. We report on the progress of the usage of these external programs with particular emphasis on the handling and tuning of the Matrix Element tools. The first integration tests in a large scale production for a new family of Object Oriented Monte Carlo generators are also reported.
    
    Speaker: Dr Paolo Bartalini (CERN)
    
    Slides
  - 14:50
    
    Development, validation and maintenance of Monte Carlo event generators and generator services in the LHC era 25m
    
    The Generator Services project collaborates with the Monte Carlo generators authors and with the LHC experiments in order to prepare validated LCG compliant code for both the theoretical and the experimental communities at the LHC. On the one side it provides the technical support as far as the installation and the maintenance of the generators packages on the supported platforms is concerned and on the other side it participates in the physics validation of the generators. The libraries of the Monte Carlo generators maintained within this project are currently widely adopted by the LHC collaborations and are used in large scale productions. The existing testing and validation tools are regularly used and the additional ones are being developed, in particular for the new object-oriented generators. The aim of the validation activity is also to participate in the tuning of the generators in order to provide appropriate settings for the proton-proton collisions at the LHC energy level. This talk presents the current status and the future plans of the Generator Services project. The approach used in order to provide tested Monte Carlo generators for the LHC experiments is discussed and some of the testing and validation tools are presented.
    
    Speaker: Dr Mikhail Kirsanov (Institute for Nuclear Research (INR), Moscow)
    
    Slides
  - 15:15
    
    LCG MCDB and HepML, next step to unified interfaces of Monte-Carlo Simulation 25m
    
    In this talk we present a way of making the Monte Carlo simulation chain fully automated. In recent years there has been a need for a common place to store sophisticated MC event samples prepared by experienced theorists. Such samples should also be accessible in a standard manner so that they can be easily imported and used in the experiments' software. The main motivation behind the LCG MCDB project is to make sophisticated MC event samples and their structured descriptions available to the various groups of physicists working on the LHC. All the data in MCDB is accessible to end users in several convenient ways: from the Grid, on the Web and via an application programming interface. Developed in collaboration between the LCG MCDB and CEDAR teams and several MC generator authors, HepML (High Energy Markup Language) aims to be a unified XML description of event samples simulated by Matrix Element (ME) generators. The other main purpose of HepML is to keep MC generation parameters for further tuning of MC generators. HepML can be extended as an XML standard to hold the information needed at the different levels of simulation in HEP, from the theoretical model to the simulation of the detector response. HepML makes it possible to use and develop many standard tools for the comparison, validation and graphical representation of results, and to create transparent unified interfaces between different HEP software packages at the level of modern computer science. Using MCDB and HepML together makes it possible to automate a significant part of the MC simulation chain: the correct transfer of physics events from Matrix Element generators to shower generators and then to detector simulation. This machine-driven approach avoids errors due to the human factor (physics data are stored with a complete unified description directly from the MC generator) and saves a lot of time and effort for end users of trusted and verified shared MC samples. LCG MCDB is being developed within the CERN LCG Application Area Simulation Project. This talk is given on behalf of the Generator Services subproject.
    
    Speaker: Mr Sergey Belov (JINR, Dubna)
    
    Slides
- 15:40 → 16:10
  
  Coffee Break 30m
- 16:10 → 18:15
  Computing Technology for Physics Research - Session 2
  - 16:10
    
    Early Experience with the CMS Computing Model 25m
    
    In this presentation we will discuss the early experience with the CMS computing model from the last large scale challenge activities to the first days of data taking. The current version of the CMS computing model was developed in 2004 with a focus on steady state running. In 2008 a revision of the model was made to concentrate on the unique challenges associated with the commissioning period. The types of changes needed for commissioning will be presented. In addition we will discuss the challenges in commissioning the major processing workflows for analysis and production and the workflows for data management and data consistency across the distributed computing infrastructure. We will present experiences and results from the final scalability tests of services and techniques. We will also discuss the initial experience with active users and real data from cosmic and collision running. We will address the issues that worked well in addition to identifying areas where future development and refinement are needed.
    
    Speaker: Dr Ian Fisk (Fermi National Accelerator Laboratory (FNAL))
    
    Slides
  - 16:35
    
    Using constraint programming to resolve the multi-source / multi-site data movement paradigm on the Grid 25m
    
    In order to achieve both fast and coordinated data transfer to collaborative sites and to create a distribution of data over multiple sites, efficient data movement is one of the most essential aspects of a distributed environment. With such capabilities at hand, truly distributed task scheduling with minimal latencies would be within reach of internationally distributed collaborations (such as those in HENP) seeking to scavenge or maximize the use of geographically spread computational resources. But it is often not at all clear (a) how to move data when it is available from multiple sources or (b) how to move data to multiple compute resources to achieve an optimal usage of available resources. Constraint programming (CP) is a technique from artificial intelligence and operations research that makes it possible to find solutions in a multi-dimensional space of variables. We present a method of creating a CP model consisting of sites, links and their attributes, such as bandwidth, for grid network data transfer, also considering user tasks as part of the objective function for an optimal solution. We will explore and explain the trade-off between schedule generation time and divergence from the optimal solution, and show how to make the solution-finding time viable by using a search tree time limit, approximations, restrictions such as symmetry breaking or grouping similar tasks together, or by generating a sequence of optimal schedules by splitting the input problem. Results of the data transfer simulation for each case will also include a well-known peer-to-peer model, and the time taken to generate a schedule as well as the time needed to execute it will be compared to the CP optimal solution. We will additionally present a possible implementation aimed at bringing a distributed dataset (multiple sources) to a given site in minimal time.
    
    Speaker: Mr Michal ZEROLA (Nuclear Physics Inst., Academy of Sciences, Praha)
    
    Slides
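To make the multi-source question concrete, here is a toy sketch of the kind of search a CP solver automates: each file is assigned to one of the sources that hold it, transfers on the same link run one after another, and the goal is the smallest overall completion time. It uses a plain depth-first search with bound pruning rather than a real CP library, and the file names, sizes and bandwidths are invented for illustration; the model in the talk (sites, links, user tasks) is richer than this.

```python
def best_transfer_plan(files, links):
    """Pick a source for every file so that the slowest link finishes earliest.

    files: {name: (size_gb, [sources holding the file])}
    links: {source: bandwidth in GB/s towards the destination site}
    Returns (makespan_seconds, {name: chosen_source}).
    """
    # Place large files first: bad branches are detected (and pruned) sooner.
    names = sorted(files, key=lambda n: -files[n][0])
    load = {source: 0.0 for source in links}     # busy time per link
    plan = {}
    best = {"makespan": float("inf"), "plan": None}

    def search(i):
        makespan = max(load.values())
        if makespan >= best["makespan"]:
            return                               # cannot beat the incumbent
        if i == len(names):
            best["makespan"], best["plan"] = makespan, dict(plan)
            return
        name = names[i]
        size, sources = files[name]
        for source in sources:
            cost = size / links[source]
            load[source] += cost
            plan[name] = source
            search(i + 1)
            load[source] -= cost                 # undo and try the next source
            del plan[name]

    search(0)
    return best["makespan"], best["plan"]


# Hypothetical instance: two sources, three files, one of them single-sourced.
links = {"A": 1.0, "B": 0.5}
files = {"f1": (4.0, ["A", "B"]), "f2": (2.0, ["A", "B"]), "f3": (2.0, ["B"])}
makespan, plan = best_transfer_plan(files, links)
```

The pruning test is the simplest possible bound. Techniques mentioned in the abstract, such as a time limit on the search tree or symmetry breaking, would be added at the same place in the search.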
  - 17:00
    
    Evolution of the STAR Framework OO model for the Multi-Core era 25m
    
    With the era of multi-core CPUs, software parallelism is becoming both affordable and a practical need. It is especially interesting to re-evaluate how well the sophisticated but time-consuming event reconstruction frameworks of high energy and nuclear physics adapt to the reality of a multi-threaded environment. The STAR offline OO ROOT-based framework implements a well-known "standard model" composed of chained modules, where the input for each module is the output of other modules. In principle, modules do not communicate with each other directly and act as consumers and providers of data structures. They use the framework via a special "query" / "publish" API to query the presence of the input data and to publish the output results they produce. We will show that by complementing the base framework with the ability to start several modules in parallel and to synchronize global data access between the "consumer" and "producer" modules, one can transparently enhance the existing packages to leverage multi-core hardware. Such an approach allows the existing offline software, designed for single-threaded batch applications, to be reused in a multi-threaded environment if needed. However, we realize that the "query" / "publish" paradigm is not sufficient to run a multi-threaded application effectively. It should be complemented with an API to "register" the module output, notifying the framework members about an output dataset "to be produced soon". With this addition, the receiving module thread can be automatically suspended if the data it has requested is not ready yet. We will explain how the STAR Offline Framework was modified to test this approach by building sophisticated interactive real-time applications.
    
    Speaker: Dr Valeri FINE (BROOKHAVEN NATIONAL LABORATORY)
    
    Slides
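The "query" / "publish" / "register" pattern described in the abstract can be sketched with a condition variable. This is an illustration of the idea only, not the STAR framework API: the class `DataBoard`, its method signatures and the dataset name `"hits"` are hypothetical.

```python
import threading


class DataBoard:
    """Toy shared store with a query / publish / register interface."""

    def __init__(self):
        self._cond = threading.Condition()
        self._data = {}
        self._registered = set()

    def register(self, name):
        """Announce that a dataset will be produced soon."""
        with self._cond:
            self._registered.add(name)

    def publish(self, name, value):
        """Store a dataset and wake up every thread waiting for data."""
        with self._cond:
            self._data[name] = value
            self._cond.notify_all()

    def query(self, name, timeout=5.0):
        """Return a dataset, suspending the caller if it is registered but not ready."""
        with self._cond:
            if name not in self._data and name not in self._registered:
                return None                      # nobody promised this dataset
            ready = self._cond.wait_for(lambda: name in self._data, timeout)
            return self._data[name] if ready else None


board = DataBoard()
board.register("hits")                           # producer announces its output
results = []


def tracker():
    hits = board.query("hits")                   # waits until "hits" is published
    results.append(sum(hits))


def hit_finder():
    board.publish("hits", [1, 2, 3])


consumer = threading.Thread(target=tracker)
producer = threading.Thread(target=hit_finder)
consumer.start()
producer.start()
consumer.join()
producer.join()
```

Without `register`, a consumer that runs before its producer could not tell "not produced yet" from "never going to be produced". The registration is what lets `query` decide to wait instead of returning empty-handed.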
  - 17:25
    
    XCFS - an analysis disk pool & filesystem based on FUSE and xroot protocol 25m
    
    One of the biggest challenges in LHC experiments at CERN is data management for data analysis. Event tags and iterative looping over datasets for physics analysis require many file opens per second and (mainly forward) seeking access. Analyses will typically access large datasets, reading terabytes in a single iteration. A large user community requires policies for space management and a highly performant, scalable, fault-tolerant and highly available system to store user data. While batch job access for analysis can be done using remote protocols, experiment users expressed a need for direct filesystem integration of their analysis (output) data to support file handling via standard Unix tools, browsers, scripts etc. XCFS, the xroot-based catalog file system, is an attempt to implement the above ideas based on the xroot protocol. The implementation is done via a filesystem plugin for FUSE using the xroot POSIX client library, which has been tested on the Linux and Mac OS X platforms. Filesystem metadata is stored on the head node in an XFS filesystem using sparse files and extended attributes. XCFS provides synchronous replica creation during write operations, a distributed Unix quota system, and krb5/gsi and voms authentication with support for secondary groups (via the xroot remote protocol and through the mounted filesystem). High availability of the head node is achieved using a heartbeat setup and filesystem mirroring using DRBD. The first 80 TB test setup, which can store up to 800 million files, has shown promising results, with thousands of file open and metadata operations per second and saturation of gigabit ethernet when executing single 'cp' commands on the mounted file system. The average latency for metadata commands is of the order of 1 ms; for file open operations it is below 4 ms. The talk will discuss results of typical LHC analysis applications using remote or mounted filesystem access. A comparison will be made between XCFS and other filesystem implementations like AFS or Lustre. Strengths and weaknesses of the approach and its possible usage in CASTOR, the CERN mass storage system, will be discussed.
    
    Speaker: Mr Andreas Joachim Peters (CERN)
    
    Slides
  - 17:50
    
    Job Centric Monitoring for ATLAS jobs in the LHC Computing Grid 25m
    
    As the Large Hadron Collider (LHC) at CERN, Geneva, began operation in September, the large-scale computing grid LCG (LHC Computing Grid) is meant to process and store the large amount of data created in simulating, measuring and analyzing particle physics experimental data. Data acquired by ATLAS, one of the four big experiments at the LHC, are analyzed using compute jobs running on the grid and utilizing the ATLAS software framework 'Athena'. The analysis algorithms themselves are written in C++ by the physicists using Athena and the ROOT toolkit. Identifying the reason for a job failure (or even the occurrence of the failure itself) in this context is a tedious, repetitive and, more often than not, unsuccessful task. Often, to deal with failures in the RUNNING stage (as opposed to job submission failures or compilation errors in the user algorithms), the job is simply resubmitted. The debugging of such problems is made even more difficult by the fact that the output sandbox, which contains the job's output and error logs, is discarded by the grid middleware if the job failed. Valuable information that could aid in finding the failure reason is therefore lost. These issues result in high job failure rates and less than optimal resource usage. As part of the High Energy Particle Physics Community Grid project (HEPCG) of the German D-Grid Initiative, the University of Wuppertal has developed the Job Execution Monitor (JEM). JEM helps to find job failure reasons by two means: it periodically provides vital worker node system data, and it collects job run-time monitoring data. To gather this data, a supervised line-by-line execution of the user job is performed. JEM provides new possibilities to find problems in largely distributed computing grids and to analyze these problems in nearly real time. All monitored information is presented to the user almost instantaneously and additionally stored in the job's output sandbox for further analysis. As a first step, JEM has been seamlessly integrated into 'ganga', the grid user interface of ATLAS and LHCb. In this way, submitted jobs are monitored transparently, requiring no additional effort by the user. In this work, the functionality of and the concepts behind JEM are presented together with examples of typical problems that are easily discovered. Furthermore, we present ongoing work on classifying problems automatically using expert systems.
    
    Speaker: Mr Tim Muenchen (Bergische Universitaet Wuppertal)
    
    Slides
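A rough feel for what "supervised line-by-line execution" can provide is given by the sketch below. It assumes the user job is a Python function and uses the interpreter's standard `sys.settrace` hook; the abstract does not say how JEM itself supervises a job, so `run_supervised` and `user_job` are hypothetical names for illustration only.

```python
import sys


def run_supervised(func, *args):
    """Run func under a trace hook and record every line it executes.

    Returns (result, error, executed), where executed holds line offsets
    relative to the 'def' line of func.
    """
    executed = []
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None                          # do not trace other functions
        if event == "line":
            executed.append(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result, error = func(*args), None
    except Exception as exc:                     # keep the failure for the report
        result, error = None, exc
    finally:
        sys.settrace(previous)
    return result, error, executed


def user_job(n):
    total = 0
    for i in range(n):
        total += 10 // (2 - i)                   # fails when i == 2
    return total


result, error, executed = run_supervised(user_job, 3)
```

After the failing run, `error` holds the exception and the last entry of `executed` points at the line that was running when the job died, which is the kind of information that is lost when only the exit status of a grid job survives.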
- 16:10 → 18:15
  Methodology of Computations in Theoretical Physics - Session 2
  - 16:10
    
    Hadronic Physics in Geant4: Improvements and Status for LHC Start 25m
    
    An overview of recent developments for the Geant4 hadronic modeling is provided with a focus on the start of the LHC experiments. Improvements in Pre-Compound model, Binary and Bertini cascades, models of elastic scattering, quark-gluon string and Fritiof high energy models, and low-energy neutron transport were introduced using validation versus data from thin target experiments. Many of these developments were directed to improve simulation of hadronic showers for LHC. As a result, starting from Geant4 8.3, the Physics List QGSP_BERT describes reasonably well all the main observables that have been measured in different test-beam setups for ATLAS and CMS experiments.
    
    Speaker: Prof. Vladimir Ivantchenko (CERN, ESA)
    
    Slides
  - 16:35
    
    Two approaches to Combining Significances 25m
    
    We compare two approaches to combining signal significances: one in which the signal significances are treated as the corresponding random variables, and one that uses confidence distributions. We consider several signal significances that are often used in the analysis of data in experimental physics as a measure of the excess of the observed or expected number of signal events above the predicted number of background events.
    
    Speaker: Dr Sergey Bityukov (INSTITUTE FOR HIGH ENERGY PHYSICS, PROTVINO)
    
    Slides
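As a point of reference for the first approach, the sketch below shows one standard way to combine significances when each is treated as a realisation of an independent normal random variable with unit variance: the sum divided by the square root of the number of inputs (often called Stouffer's method). Whether this is the exact combination rule used in the talk is an assumption; the abstract does not give formulas.

```python
import math


def combine_significances(zs):
    """Combine independent significances: sum of Z values over sqrt(count)."""
    if not zs:
        raise ValueError("need at least one significance")
    return sum(zs) / math.sqrt(len(zs))


def one_sided_p_value(z):
    """Tail probability of a standard normal variable beyond z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


combined = combine_significances([2.0, 2.0, 2.0, 2.0])
p_combined = one_sided_p_value(combined)
```

Four independent 2-sigma excesses combine to 4 sigma under these assumptions, which shows both the appeal of the method and why the independence assumption needs care.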
  - 17:00
    
    Radiative corrections to Drell-Yan like processes in SANC 25m
    
    Radiative corrections to processes of single Z and W boson production are obtained within the SANC computer system. Interplay of one-loop QCD and electroweak corrections is studied. Higher order QED final state radiation is taken into account. Monte Carlo event generators at the hadronic level are constructed. Matching with general purpose programs like HERWIG and PYTHIA is performed to include the effect of partonic showers. Numerical results for LHC conditions are demonstrated. The resulting theoretical uncertainty in the description of these processes is discussed.
    
    Speaker: Dr Andrej Arbuzov (Joint Institute for Nuclear Research (JINR))
    
    Slides
  - 17:25
    
    50m
Wednesday 5 November
- 09:00 → 12:00
  Wednesday, 05 November 2008
  - 09:00
    
    CompHEP status report (version 4.5) 40m
    
    We present a new version of the CompHEP program package, version 4.5. We briefly describe the new techniques and options implemented: interfaces to ROOT and HERWIG, generation of the XML-based header in event files (HepML), full implementation of the Les Houches agreements (LHA I, SUSY LHA, LHA PDF, Les Houches events), realisation of the improved von Neumann procedure for event generation, etc. We also mention a few concrete and physically motivated examples of CompHEP-based event generators, which are important for the LHC experiments.
    
    Speaker: Dr Alexander Sherstnev (University of Oxford)
    
    Slides
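For readers unfamiliar with the von Neumann procedure mentioned in the abstract, the sketch below shows the basic acceptance-rejection step that turns weighted events into unweighted ones. It is the plain textbook version, not the improved variant implemented in CompHEP, and the toy weight (equal to the event variable itself) is invented for the example.

```python
import random


def unweight(weighted_events, w_max, rng):
    """Keep each event with probability weight / w_max (von Neumann rejection)."""
    kept = []
    for event, weight in weighted_events:
        if weight > w_max:
            raise ValueError("weight exceeds the assumed maximum")
        if rng.random() * w_max < weight:
            kept.append(event)
    return kept


rng = random.Random(12345)                       # fixed seed for reproducibility

# Toy sample: x is uniform in [0, 1) and carries weight x, so the
# unweighted events should follow a density proportional to x.
weighted = []
for _ in range(20000):
    x = rng.random()
    weighted.append((x, x))

unweighted = unweight(weighted, w_max=1.0, rng=rng)
mean_x = sum(unweighted) / len(unweighted)       # expected to be close to 2/3
```

The efficiency of the plain method is the average weight divided by `w_max`, which is why a few events with very large weights make it expensive and why improved variants exist.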
  - 09:40
    Multivariate Methods in Particle Physics: Today and Tomorrow 40m
    
    Multivariate methods are used routinely in particle physics research to classify objects or to discriminate signal from background. They have also been used successfully to approximate multivariate functions. Moreover, as is evident from this conference, excellent easy-to-use implementations of these methods exist, making it possible for everyone to deploy these sophisticated methods. From time to time, however, it is helpful to step back and reflect a little on what is being done. That is the aim of this talk. I begin with a brief introduction to the kind of problems such methods address and follow with a survey of a few of the most promising recent developments. The talk ends with a discussion of what I consider to be the outstanding issues and the prospects for future developments.
    
    Speaker: Dr Harrison Prosper (Department of Physics, Florida State University)
    
    Slides
    
    Video
    
    fig_BDT_test.avi
    
    fig_BDT_train.avi
    
    fig_ptmet_DT.avi
  - 10:20
    
    Coffee Break 20m
  - 10:40
    
    LHC phenomenology at next-to-leading order QCD: theoretical progress and new results 40m
    
    In this talk I will argue that a successful description of LHC physics needs the inclusion of higher order corrections for all kinds of signal and background processes. In the case of multi-particle production the combinatorial complexity of standard approaches triggered many new developments which allow for the efficient evaluation of one-loop amplitudes for LHC phenomenology. I will discuss the basic new ideas for one-loop multi-leg computations including comments on computational issues and will review recent results relevant for LHC phenomenology.
    
    Speaker: Dr Thomas Binoth (University of Edinburgh)
    
    Slides
  - 11:20
    
    Code quality from the programmer's perspective 40m
    
    Code quality has traditionally been decomposed into internal and external quality. In this talk, I will discuss the differences between these two views and I will consider the contexts in which either of the two becomes the main quality goal. I will argue that for physics software the programmer's perspective, focused on the internal quality, is the most important one. Then, I will review the available tools and techniques for the verification and improvement of the internal code quality, having in mind the programmer's perspective. I will conclude with a list of challenges for research in software engineering about aspects of the internal code quality that are largely neglected, but deeply affect the programmer's ability to carry out code modification and bug fixing tasks. Such aspects revolve around the way in which natural language is embedded into the code as a form of domain modeling.
    
    Speaker: Paolo Tonella (FBK-IRST)
    
    Slides
- 14:00 → 18:15
  Computing Technology for Physics Research
  - 14:00
    
    The commissioning of CMS computing centres in the WLCG Grid 25m
    
    The computing system of the CMS experiment works using distributed resources from more than 80 computing centres worldwide. These centres, located in Europe, America and Asia are interconnected by the Worldwide LHC Computing Grid. The operation of the system requires a stable and reliable behaviour of the underlying infrastructure. CMS has established a procedure to extensively test all relevant aspects of a Grid site, such as the ability to efficiently use their network to transfer data, the functionality of all the site services relevant for CMS and the capability to sustain the various CMS computing workflows (Monte Carlo simulation, event reprocessing and skimming, data analysis) at the required scale. This contribution describes in detail the procedure to rate CMS sites depending on their performance, including the complete automation of the program, the description of monitoring tools, and its impact in improving the overall reliability of the Grid from the point of view of the CMS computing system.
    
    Speaker: Dr Andrea Sciaba' (CERN, Geneva, Switzerland)
    
    Slides
  - 14:25
    
    The DAQ/HLT system of the ATLAS experiment 25m
    
    The DAQ/HLT system of the ATLAS experiment at CERN, Switzerland, is being commissioned for first collisions in 2009. Presently, the system is composed of an already very large farm of computers that accounts for about one-third of its event processing capacity. Event selection is conducted in two steps after the hardware-based Level-1 Trigger: a Level-2 Trigger processes detector data based on regions of interest (RoI) and an Event Filter operates on the full event data assembled by the Event Building system. The detector readout is fully commissioned and can be operated at its full design capacity. This places on the High-Level Triggers system the responsibility to maximize the quality of data that will finally reach the offline reconstruction farms. This paper brings an overview of the current ATLAS DAQ/HLT implementation and performance based on studies originated from its operation with simulated, cosmic particles and first-beam data. Its built-in event processing parallelism is discussed for both HLT levels as well as an outlook of options to improve it.
    
    Speaker: Dr André dos Anjos (University of Wisconsin, Madison, USA)
    
    Slides
  - 14:50
    
    VARIOUS RUNTIME ENVIRONMENTS IN GRID BY MEANS OF VIRTUALIZATION OF WORKING NODES 25m
    
    Grid systems are used for calculations and data processing in various applied areas such as biomedicine, nanotechnology and materials science, cosmophysics and high energy physics, as well as in a number of industrial and commercial areas. However, one of the basic problems standing in the way of wide use of grid systems is that applied jobs, as a rule, are developed for execution in a definite runtime environment specified by the type and version of the operating system, auxiliary software (libraries), the type of file system, the presence or absence of facilities for parallel computing, etc. On the other hand, working nodes (WNs) in the grid resource centers (where the jobs are executed) operate under the control of a certain OS and offer a fixed runtime environment. Therefore, if applied jobs were not developed initially for the particular runtime environment of the WNs, they cannot be directly processed in the grid. In this work, an approach [1] to batch processing of computer jobs prepared for various runtime environments in the grid is proposed. The method is based on the virtualization of the working nodes of grid resource centers and enables applied jobs to be executed irrespective of the runtime environment they were initially developed for. In particular, jobs developed for execution under the widespread Windows OS can be processed in resource centers of a grid system whose working nodes run Linux. The proposed approach was implemented under the gLite middleware in the EGEE/WLCG project and was successfully tested in the SINP MSU resource center. [1] V.A.Ilyin, A.P.Kryukov, L.V.Shamardin, A.P.Demichev, I.N.Gorbunov, A method for submitting and processing jobs prepared for various runtime environments in grid, Numerical Methods and Programming, v. 9, pp. 41-47, 2008 (in Russian)
    
    Speaker: Alexander Kryukov (Skobeltsyn Institute for Nuclear Physics Moscow State University)
    
    Slides
  - 15:15
    
    The Advanced Resource Connector for Distributed LHC Computing 25m
    
    The NorduGrid collaboration and its middleware product, ARC (the Advanced Resource Connector), span institutions in Scandinavia and several other countries in Europe and the rest of the world. The innovative nature of the ARC design and flexible, lightweight distribution make it an ideal choice to connect heterogeneous distributed resources for use by HEP and non-HEP applications alike. ARC has been used by scientific projects for many years and through experience it has been hardened and refined to a reliable, efficient software product. In this paper we present the architecture and design of ARC and show how ARC's simplicity eases application integration and facilitates taking advantage of distributed resources. Example applications are shown along with some results from one particular application, simulation production and analysis for the ATLAS experiment, as an illustration of ARC's common usage today. These results demonstrate ARC's ability to handle significant fractions of the computing needs of the LHC experiments today and well into the future.
    
    Speaker: David Cameron (University of Oslo)
    
    Slides
  - 15:40
    
    Coffee Break 30m
  - 16:35
    
    FairRoot Framework 25m
    
    The new developments in the FairRoot framework will be presented. FairRoot is the simulation and analysis framework used by the CBM and PANDA experiments at FAIR/GSI. The CMake-based building and testing system will be described. A new event display based on the EVE package from ROOT and on Geane will be shown, and the new developments for using GPUs and multi-core systems will be discussed.
    
    Speaker: Dr Mohammad Al-Turany (GSI DARMSTADT)
    
    Slides
  - 17:00
    
    The CMS Framework for Alignment and Calibration 25m
    
    The ultimate performance of the CMS detector relies crucially on precise and prompt alignment and calibration of its components. A sizable number of workflows need to be coordinated and performed with minimal delay through the use of a computing infrastructure which is able to provide the constants for a timely reconstruction of the data for subsequent physics analysis. The framework supporting these processes and results from testing it in recent commissioning campaigns are presented.
    
    Speaker: Gero Flucke (Universität Hamburg)
    
    Slides
  - 17:25
    
    A prototype of a dynamically expandable Virtual Analysis Facility 25m
    
    Current Grid deployments for LHC computing (namely the WLCG infrastructure) do not allow efficient parallel interactive processing of data. In order to allow physicists to interactively access subsets of data (e.g. for algorithm tuning and debugging before running over a full dataset), parallel Analysis Facilities based on PROOF have been deployed by the ALICE experiment at CERN and elsewhere. Whereas large Tier-1 centres can afford to build such facilities at the expense of their Grid farms, this is unlikely to be true for smaller Tier-2 centres. By leveraging the virtualisation of highly performant multi-core machines, it is possible to build a fully virtual Analysis Facility on the same Worker Nodes that compose an existing LCG Grid Farm. Using the Xen paravirtualisation hypervisor, it is then possible to dynamically move resources from the batch instance to the interactive one when needed. We present the status of the prototype being developed.
    
    Speaker: Dario Berzano (Istituto Nazionale di Fisica Nucleare (INFN) and University of Torino)
    
    Slides
  - 17:50
    
    Software development, release integration and distribution tools for the CMS experiment 25m
    
    The offline software suite of the CMS experiment must support the production and analysis activities across a distributed computing environment. This system relies on over 100 external software packages and includes the developments of more than 250 active developers. This system requires consistent and rapid deployment of code releases, a stable code development platform, and efficient tools to enable code development and production work across the facilities utilized by the experiment. Recent developments have resulted in significant improvements in these areas. We report the concept, status, recent improvements and future plans for these aspects of the CMS offline software environment.
    
    Speaker: David Lange (LLNL)
- 14:00 → 18:15
  Data Analysis - Algorithms and Tools
  - 14:00
    
    A Numeric Comparison of Feature Selection Algorithms for Supervised Learning 25m
    
    Datasets in modern High Energy Physics (HEP) experiments are often described by dozens or even hundreds of input variables (features). Reducing a full feature set to a subset that most completely represents information about data is therefore an important task in analysis of HEP data. We compare various feature selection algorithms for supervised learning using several datasets such as, for instance, imaging gamma-ray Cherenkov telescope (MAGIC) data found at the UCI repository. We use classifiers and feature selection methods implemented in the statistical package StatPatternRecognition (SPR), a free open-source C++ package developed in the HEP community (http://sourceforge.net/projects/statpatrec/). For each dataset, we select a powerful classifier and estimate its learning accuracy on feature subsets obtained by various feature selection algorithms. When possible, we also estimate the CPU time needed for the feature subset selection. The results of this analysis are compared with those published previously for these datasets using other statistical packages such as R and Weka. We show that the most accurate, yet slowest, method is a wrapper algorithm known as generalized sequential forward selection ("Add N Remove R") implemented in SPR.
    
    Speaker: Dr Giulio Palombo (University of Milan - Bicocca)
    
    Slides
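The "Add N Remove R" wrapper idea named in the abstract can be sketched as a greedy loop that repeatedly adds the N most helpful features and then drops the R least useful ones, scoring each candidate subset with the classifier of choice. The code below is a generic illustration, not the SPR implementation; the feature names and the toy `score` function (which penalises a redundant pair) are invented.

```python
def add_n_remove_r(features, score, n_add, n_remove, target_size):
    """Greedy wrapper selection: add n_add features, then remove n_remove."""
    if n_add <= n_remove:
        raise ValueError("n_add must exceed n_remove to make progress")
    if target_size > len(features):
        raise ValueError("target_size exceeds the number of features")
    selected = []
    while len(selected) < target_size:
        # Forward steps: add the feature that improves the score the most.
        for _ in range(n_add):
            if len(selected) == target_size:
                break
            candidates = [f for f in features if f not in selected]
            best = max(candidates, key=lambda f: score(selected + [f]))
            selected.append(best)
        if len(selected) == target_size:
            break
        # Backward steps: drop the feature whose removal hurts the least.
        for _ in range(n_remove):
            worst = max(selected,
                        key=lambda f: score([g for g in selected if g != f]))
            selected.remove(worst)
    return selected


# Toy score: individual usefulness minus a penalty for a redundant pair.
usefulness = {"a": 3.0, "b": 2.9, "c": 1.0, "d": 0.5}


def score(subset):
    total = sum(usefulness[f] for f in subset)
    if "a" in subset and "b" in subset:
        total -= 2.5                             # "a" and "b" carry the same information
    return total


chosen = add_n_remove_r(list(usefulness), score, n_add=2, n_remove=1, target_size=3)
```

Every call to `score` stands for training and evaluating a classifier on one subset, which is why a wrapper method of this kind tends to be accurate but slow, as the abstract reports.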
  - 14:25
    
    Tau identification using multivariate techniques in ATLAS 25m
    
    Tau leptons will play an important role in the physics program at the LHC. They will not only be used in electroweak measurements and in detector related studies like the determination of the E_T^miss scale, but also in searches for new phenomena like the Higgs boson or Supersymmetry. Due to the overwhelming background from QCD processes, highly efficient algorithms are essential to identify hadronically decaying tau leptons. This can be achieved using modern multivariate techniques which make optimal use of all the information available. They are particularly useful in case the discriminating variables are not independent and no single variable provides good signal and background separation. In ATLAS four algorithms based on multivariate techniques have been applied to identify hadronically decaying tau leptons: projective likelihood estimator (LL), Probability Density Estimator with Range Searches (PDE-RS), Neural Network (NN) and Boosted Decision Trees (BDT). All four multivariate methods applied to the ATLAS simulated data have similar performance, which is significantly better than the baseline cut analysis.
    
    Speaker: Dr Marcin Wolter (Henryk Niewodniczanski Institute of Nuclear Physics PAN)
    
    Slides
  - 14:50
    
    The ALICE Global Redirector. A step towards real storage robustness. 25m
    
    In this talk we address the way the ALICE Offline Computing is starting to exploit the possibilities given by the Scalla/Xrootd repository globalization tools. These tools are quite general and can be adapted to many situations, without disrupting existing designs, but adding a level of coordination among xrootd-based storage clusters, and the ability to interact between them.
    
    Speaker: Dr Fabrizio Furano (Conseil Europeen Recherche Nucl. (CERN))
    
    Slides
  - 15:15
    
    WebDat: Bridging the Gap Between Unstructured and Structured Data 25m
    
    Accelerator R&D environments produce data characterized by different levels of organization. Whereas some systems produce repetitively predictable and standardized structured data, others may produce data of unknown or changing structure. In addition, structured data, typically sets of numeric values, are frequently logically connected with unstructured content (e.g., images, graphs, comments). Despite these different characteristics, a coherent, organized and integrated view of all information is sought. WebDat is a system conceived as a result of efforts in this direction. It provides a uniform and searchable view of structured and unstructured data via common metadata, regardless of the repository used (DBMS or file system). It also allows for processing data and creating interactive reports. WebDat supports metadata management, administration, data and content access, application integration via Web services, and Web-based collaborative analysis.
    
    Speaker: Dr Jerzy Nogiec (FERMI NATIONAL ACCELERATOR LABORATORY)
    
    Slides
  - 15:40
    
    Coffee Break 30m
  - 16:10
    
    MINUIT Package Parallelization and applications using the RooFit Package 25m
    
    MINUIT is the most common package used in high energy physics for numerical minimization of multi-dimensional functions. The major algorithm of this package, MIGRAD, searches for the minimum by using the function gradient. For each minimization iteration, MIGRAD requires the calculation of the first derivatives for each parameter of the function to be minimized. In this presentation we will show how the algorithm can be easily parallelized using MPI techniques to scale over multiple nodes or multi-threads for multi-cores in a single node. We will present the speed-up improvements obtained in typical physics applications such as complex maximum likelihood fits using the RooFit package. Furthermore, we will also show results of hybrid parallelization between MPI and multi-threads, to take full advantage of multi-core architectures.
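
The parallelization idea in the abstract above rests on the fact that the first derivative with respect to each parameter can be evaluated independently of the others. A minimal sketch of that decomposition follows. It is an illustration only, not MINUIT or RooFit code: it uses Python's standard `ThreadPoolExecutor` in place of MPI, central finite differences as an assumed derivative estimate, and hypothetical names (`partial_derivative`, `parallel_gradient`, `toy_objective`).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def partial_derivative(f: Callable[[Sequence[float]], float],
                       x: Sequence[float], i: int,
                       step: float = 1e-6) -> float:
    """Central finite-difference estimate of df/dx_i at the point x."""
    up, down = list(x), list(x)
    up[i] += step
    down[i] -= step
    return (f(up) - f(down)) / (2.0 * step)


def parallel_gradient(f: Callable[[Sequence[float]], float],
                      x: Sequence[float],
                      workers: int = 4) -> List[float]:
    """Evaluate one partial derivative per task; the tasks share no state."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in parameter order.
        return list(pool.map(lambda i: partial_derivative(f, x, i),
                             range(len(x))))


def toy_objective(p: Sequence[float]) -> float:
    """Hypothetical stand-in for a negative log-likelihood."""
    targets = [1.0, 2.0, 3.0]
    return sum((pi - ti) ** 2 for pi, ti in zip(p, targets))


gradient = parallel_gradient(toy_objective, [0.0, 0.0, 0.0])
# gradient is approximately [-2.0, -4.0, -6.0]
```

In CPython, threads do not speed up pure-Python arithmetic, so the sketch shows only how the work splits by parameter. A real speed-up needs separate processes or nodes, as in the MPI approach the talk describes.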
    
    Speaker: Dr Alfio Lazzaro (Universita' degli Studi and INFN, Milano)
    
    Slides
  - 16:35
    
    ATLAS trigger status and results from commissioning operations 25m
    
    The ATLAS trigger system is designed to select rare physics processes of interest from an extremely high rate of proton-proton collisions, reducing the incoming LHC rate by a factor of about 10^7. The short LHC bunch crossing period of 25 ns and the large background of soft-scattering events overlapped in each bunch crossing pose serious challenges, both on hardware and software, that the ATLAS trigger must overcome in order to efficiently select interesting events. The ATLAS trigger consists of a hardware-based Level-1 and a two-level software-based High-Level Trigger (HLT). Data bandwidth and processing times in the higher level triggers are reduced by region of interest guidance in the HLT reconstruction steps. High flexibility is critical in order to adapt to the changing luminosity, backgrounds and physics goals. This is achieved by inclusive trigger menus and modular software design. Selection algorithms have been developed which provide the required elasticity to detect different physics signatures and to control the trigger rates. In this talk an overview of the ATLAS trigger design, status and expected performance, as well as the results from the on-going commissioning with cosmic rays and first LHC beams, is presented.
    
    Speaker: Dr Biglietti Michela (UNIVERSITY OF NAPOLI and INFN)
    
    Slides
  - 17:00
    
    Software Validation Infrastructure for the ATLAS High-Level Trigger 25m
    
    The ATLAS trigger system is responsible for selecting the interesting collision events delivered by the Large Hadron Collider (LHC). The ATLAS trigger will need to achieve a ~10^-7 rejection factor against random proton-proton collisions, and still be able to efficiently select interesting events. After a first processing level based on FPGAs and ASICs, the final event selection is based on custom software running on two CPU farms, containing around two thousand multi-core machines. This is known as the high-level trigger (HLT).
    
    With more than 100 contributors and around 250 different packages, a thorough validation of the HLT software is essential. This paper describes the existing infrastructure used for validating the HLT software, as well as future plans.
    
    Speaker: Mr Danilo Enoque Ferreira De Lima (Federal University of Rio de Janeiro (UFRJ) - COPPE/Poli)
    
    Slides
  - 17:25
    
    Application of the rule-growing algorithm RIPPER to particle physics analysis 25m
    
    A large hadron machine like the LHC, with its high track multiplicities, always calls for powerful tools that drastically reduce the large background while selecting signal events efficiently. Such tools are widely needed and used in all parts of particle physics. Given the huge amount of data that will be produced at the LHC, the process of training as well as the process of applying these tools to data must be time efficient. Such tools can be multi-variate analysis -- also called data mining -- tools. In this talk we present the results of applying the multi-variate analysis rule-growing algorithm RIPPER to a problem of particle selection. Minimum-bias Monte-Carlo data for the LHCb experiment are used. It turns out that the meta-methods bagging and cost-sensitivity are essential for the quality of the outcome. The results are compared to other multi-variate analysis techniques as well as to the traditional cut-based method.
    
    Speaker: Dr Markward Britsch (Max-Planck-Institut fuer Kernphysik (MPI))
    
    Slides
  - 17:50
    
    Enhanced Gene Expression Programming for signal-background discrimination in particle physics 25m
    
    In order to address the data analysis challenges imposed by the complexity of the data generated by the current and future particle physics experiments, new techniques for performing various analysis tasks need to be investigated. In 2006 we introduced to the particle physics field one such new technique, based on Gene Expression Programming (GEP), and successfully applied it to an event selection problem. While GEP, as initially proposed, was proven to be more flexible and more efficient than other evolutionary algorithms, it does not incorporate many of the advanced developments in the field of evolutionary computation. This paper will present our developments of the algorithm and will discuss results obtained with alternative mapping mechanisms between the solution space and the representation space, the effect of a more controlled selection process of the candidate solutions, and of adaptable discrimination thresholds for supervised classification problems. The enhanced version of the algorithm was applied to a signal-background discrimination problem in a particle physics data analysis. Comparative studies of the initial and the enhanced version of GEP were performed and the results will be presented and discussed.
    
    Speaker: Liliana Teodorescu (Brunel University)
    
    Slides
- 14:00 → 15:40
  Methodology of Computations in Theoretical Physics - Session 1
  - 14:00
    
    Current status of FORM parallelization 25m
    
    We report on the status of the current development in parallelization of the symbolic manipulation system FORM. Most existing FORM programs will be able to take advantage of the parallel execution, without the need for modifications.
    
    Speaker: Mikhail Tentyukov (Karlsruhe University)
    
    Slides
  - 14:25
    
    New implementation of the sector decomposition on FORM 25m
    
    The sector decomposition technique, which can isolate divergences from parametric representations of integrals, has become quite a useful tool for the numerical evaluation of Feynman loop integrals. It is used to verify the analytical results of multi-loop integrals in the Euclidean region, and in some cases it is used in practice in the physical region, in combination with other methods that handle the threshold. In an intermediate stage of the sector decomposition for multi-loop integrals, one often has to handle enormously large expressions containing a very large number of terms. The symbolic manipulation system FORM was originally designed to treat such huge expressions and has a strong advantage here. In this talk, the implementation of the sector decomposition algorithm in FORM is discussed. A number of concrete examples, including cases of multi-loop diagrams, are also shown.
    
    Speaker: Dr Takahiro Ueda (KEK)
    
    Slides
  - 14:50
    
    FormCalc 6 25m
    
    The talk will cover the latest version of the Feynman-diagram calculator FormCalc. The most significant improvement is the communication of intermediate expressions from FORM to Mathematica and back, for the primary purpose of introducing abbreviations at an early stage. Thus, longer expressions can be treated and a severe bottleneck in particular for processes with high multiplicities removed.
    
    Speaker: Thomas Hahn (MPI Munich)
    
    Slides
  - 15:15
    
    Numerical Evaluation of Feynman Integrals by a Direct Computation Method 25m
    
    We apply a 'Direct Computation Method', which is purely numerical, to evaluate Feynman integrals. This method is based on the combination of an efficient numerical integration and an efficient extrapolation strategy. In addition, high-precision arithmetic and parallelization techniques can be used if required. We present our recent progress in the development of this method and show test results such as for one-loop 5-point and two-loop 3-point integrals.
    
    Speaker: Dr Fukuko YUASA (KEK)
    
    Slides
- 15:40 → 16:10
  
  Coffee Break 30m
- 16:10 → 18:15
  Methodology of Computations in Theoretical Physics - Session 2
  - 16:10
    
    Numerical calculations of Multiple Polylog functions 25m
    
    Multiple Polylog functions (MPL) often appear as a result of Feynman parameter integrals in higher-order corrections in quantum field theory. Numerical evaluation of MPL with higher depth and weight is necessary for multi-loop calculations. We propose a purely numerical method to evaluate MPL using numerical contour integration in the multi-parameter complex plane. We can obtain values of MPL for any complex variables.
    
    Speaker: Dr Yoshimasa Kurihara (KEK)
    
    Slides
  - 16:35
    
    New results for loop integrals 25m
    
    We present some recent results on the evaluation of massive one-loop multileg Feynman integrals, which are of relevance for LHC processes. An efficient complete analytical tensor reduction was derived and implemented in a Mathematica package hexagon.m. Alternatively, one may use Mellin-Barnes techniques in order to avoid the tensor reduction. We briefly report on a new version of the Mathematica package AMBRE.m.
    
    Speaker: Tord Riemann (DESY)
    
    Slides
  - 17:00
    
    Feynman Diagrams, Differential Reduction and Hypergeometric Functions 25m
    
    Recent results related to the manipulation of hypergeometric functions, namely reduction and the construction of higher-order terms in the epsilon-expansion, are reviewed. The application of this technique to the analytical evaluation of Feynman diagrams is considered.
    
    Speaker: Dr Mikhail Kalmykov (Hamburg U./JINR)
    
    Slides
  - 17:25
    
    Round Table - Event generation: are we ready for LHC? 50m
Thursday 6 November
- 09:00 → 12:40
  Thursday, 06 November 2008
  - 09:00
    
    Getting ready for next generation computing 40m
    
    The ALICE High Level Trigger is a high performance computer, set up to process the ALICE on-line data, exceeding 25 GB/s, in real time. The most demanding detector for the event reconstruction is the ALICE TPC. The HLT implements different kinds of processing elements, including AMD and Intel processors, FPGAs and GPUs. The FPGAs perform an on-the-fly cluster reconstruction and the tracks are planned to be computed on GPUs for speed. The ALICE event reconstruction software is designed from scratch to support multi-core architectures. The status of the system and analysis code architectures, which are optimised for speed, is presented. Several design and programming features of the ALICE HLT have been integrated into the plan to build a T2 “Landesrechner” in Frankfurt with up to 3000 processing nodes. The architecture and status of the project will be outlined. The planned system is a founding member of the German Gauss Alliance.
    
    Speaker: Prof. Volker Lindenstruth (Kirchhoff Institute for Physics)
  - 09:40
    
    CernVM - a virtual appliance for LHC applications 40m
    
    CernVM is a Virtual Software Appliance to run physics applications from the LHC experiments at CERN. The virtual appliance provides a complete, portable and easy to install and configure user environment for developing and running LHC data analysis on any end-user computer (laptop, desktop) and on the Grid independently of operating system software and hardware platform (Linux, Windows, MacOS). The aim is to facilitate the installation of the experiment software on a user's computer and to minimize the number of platforms (compiler-OS combinations) on which experiment software needs to be supported and tested, thus reducing the overall cost of LHC software maintenance. Two ingredients are necessary for CernVM. The first one is a thin virtual machine that contains 'just enough Operating System' to run any application framework of the four LHC experiments. The second is a file system (cvmfs) specifically designed for an efficient ‘just in time’ software distribution and installation. The CernVM project, which started at the beginning of this year, is funded for a period of four years under the recently approved R&D program at CERN.
    
    Speaker: Predrag Buncic (CERN)
    
    Slides
  - 10:20
    
    Coffee Break 20m
  - 10:40
    
    Forget multicore! The future is many-core: An outlook to the explosion of parallelism likely to occur in the LHC era 40m
    
    This talk will start by reminding the audience that Moore's law is very much alive (even after 40+ years of existence). Transistor counts will continue to double with every new silicon generation, every other year. Chip designers are therefore trying every possible "trick" for putting the transistors to good use. The most notable one is to push more parallelism into each CPU: more and longer vectors, more parallel execution units, more cores and more hyperthreading inside each core. In addition, highly parallel graphics processing units (GPUs) are also entering the game and compete efficiently with CPUs in several computing fields. The speaker will try to predict the CPU dimensions we will reach during the LHC era, based on what we have seen in the recent past and the projected roadmap for silicon. He will also discuss the impact on HEP software. Can we continue to rely on event-level parallelism at the process level or do we need to move to a new software paradigm?
    
    Speaker: Mr Sverre Jarp (CERN)
    
    Slides
  - 11:20
    
    Porting Reconstruction Algorithms to the Cell Broadband Engine 40m
    
    On-line processing of large data volumes produced in modern HEP experiments requires using the maximum capabilities of the computer architecture. One such powerful feature is a SIMD instruction set, which allows packing several data items into one register and operating on all of them at once, thus achieving more operations per clock cycle. The novel Cell processor extends the parallelization further by combining a general-purpose PowerPC processor core with eight streamlined coprocessing elements which greatly accelerate vector processing applications. In order to investigate a possible speed-up of the reconstruction stage of data processing, we have ported a track fitting package based on the Kalman filter to the Cell processor. An overall speed-up of 120000 times has been obtained on a Cell Blade computer compared to the initial scalar implementation on a Pentium 4 machine. Major steps of the porting procedure (memory optimization, numerical analysis, vectorization with inline operator overloading, and optimization using the Cell simulator) are presented and discussed.
    
    Speaker: Dr Ivan Kisel (Gesellschaft fuer Schwerionenforschung mbH (GSI), Darmstadt, Germany)
    
    Slides
  - 12:00
    
    Throughput Computing in C++ 40m
    
    Power consumption is the ultimate limiter to current and future processor design, leading us to focus on more power efficient architectural features such as multiple cores, more powerful vector units, and use of hardware multi-threading (in place of relatively expensive out-of-order techniques). It is (increasingly) well understood that developers face new challenges with multi-core software development. The first of these challenges is a significant productivity burden particular to parallel programming. A big contributor to this burden is the relative difficulty of tracking down data races, which manifest non-deterministically. The second challenge is parallelizing applications so that they effectively scale with new core counts and the inevitable enhancement and evolution of the instruction set. This is a new and subtle qualifier to the benefits of backwards compatibility inherent in Intel® Architecture (IA): performance may not scale forward with new micro-architectures and, in some cases, may actually regress. I assert that forward-scaling is an essential requirement for new programming models, tools, and methodologies intended for multi-core software development. We are implementing a programming model called Ct (C for Throughput Computing) that leverages the strengths of data parallel programming to help address these challenges. Ct is a C++-hosted deterministic parallel programming model integrating the nested data parallelism of Blelloch and bulk synchronous processing of Valiant (with a dash of SISAL for good measure). Ct uses meta-programming and dynamic compilation to essentially embed a pure functional programming language in impure and unsafe C++. A key objective of the Ct project is to create both high-level and low-level abstractions that forward-scale across IA. I will describe the surface API and runtime architecture that we’ve built to achieve this, as well as some performance results.
    
    Speaker: Dr Anwar Ghuloum (Intel Corporation)
- 14:00 → 18:00
  Thursday, 06 November 2008
  - 14:00
    
    Many-core Round Table - How to prepare for the future 2h
Friday 7 November
- 09:00 → 11:30
  Friday, 07 November 2008
  - 09:00
    
    Computing Technology for Physics Research - Summary 30m
    
    Speaker: Dr Ian Fisk (Fermi National Accelerator Laboratory, Batavia, United States)
  - 09:30
    
    Data Analysis - Algorithms and Tools - Summary 30m
    
    Speaker: Dr Thomas Speer (Brown University)
    
    Slides
  - 10:00
    
    Coffee Break 30m
  - 10:30
    
    Methodology of Computations in Theoretical Physics - Summary 30m
    
    Speaker: Prof. Kiyoshi Kato (Kogakuin University)
    
    Slides
  - 11:00
    
    ACAT2008 Summary 30m
    
    Speaker: Mr Federico Carminati (CERN)
    
    Slides
