Alessandro Di Girolamo
(CERN)
14/10/2013, 13:30
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The WLCG information system is just one of the many information sources required to populate a VO configuration database. Other sources include central portals such as the GOCDB and the OIM, from EGI and OSG respectively. Providing a coherent view of all this information, synchronized from many different sources, is a challenging activity and has been duplicated to various...
Dmytro Karpenko
(University of Oslo (NO))
14/10/2013, 13:52
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
During three years of LHC data taking, the ATLAS collaboration completed three petascale data reprocessing campaigns on the Grid, with up to 2 PB of data being reprocessed every year. In reprocessing on the Grid, failures can occur for a variety of reasons, while Grid heterogeneity makes failures hard to diagnose and repair quickly. As a result, Big Data processing on the Grid must tolerate a...
Gerardo Ganis
(CERN)
14/10/2013, 14:15
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The advent of private and commercial cloud platforms has opened the question of evaluating the cost-effectiveness of such solutions for computing in High Energy Physics.
Google Compute Engine (GCE) is an IaaS product launched by Google as an experimental platform during 2012 and now open to the public market.
In this contribution we present the results of a set of CPU-intensive and...
Dr
Jerome LAURET
(BROOKHAVEN NATIONAL LABORATORY)
14/10/2013, 14:36
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
User Centric Monitoring (or UCM) has been a long-awaited feature in STAR, whereby programs, workflows and system "events" can be logged, broadcast and later analyzed. UCM makes it possible to collect and filter available job monitoring information from various resources and present it to users from a user-centric rather than an administrator-centric point of view. The first attempt and...
Ramon Medrano Llamas
(CERN)
14/10/2013, 15:45
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
In order to ease the management of their infrastructure, most of the WLCG sites are adopting cloud-based strategies. CERN, the Tier 0 of the WLCG, is completely restructuring the resource and configuration management of its computing centre under the codename Agile Infrastructure. Its goal is to manage 15,000 Virtual Machines by means of an OpenStack middleware in order to...
Ian Fisk
(Fermi National Accelerator Lab. (US)),
Jacob Thomas Linacre
(Fermi National Accelerator Lab. (US))
14/10/2013, 16:07
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
During Spring 2013, CMS processed 1 billion RAW data events at the San Diego Supercomputer Center (SDSC), a facility nearly half the size of the dedicated CMS Tier-1 processing resources. This facility has none of the permanent CMS services, service level agreements, or support normally associated with a Tier-1, and was assembled with a few weeks' notice to process only a few workflows. The size...
Dr
Friederike Nowak
(DESY)
14/10/2013, 16:29
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
In 2007, the National Analysis Facility (NAF) was set up within the framework of the Helmholtz Alliance "Physics at the Terascale", and is located at DESY. Its purpose was to provide an analysis infrastructure for up-to-date research in Germany, complementing the Grid by offering interactive access to the data. It has been well received within the physics community, and has proven to...
Dr
Antonio Limosani
(University of Melbourne (AU))
14/10/2013, 16:51
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The Australian Government is making a $AUD 100 million investment in Compute and Storage for the academic community. The Compute facilities are provided in the form of 24,000 CPU cores located at 8 nodes around Australia in a distributed virtualized Infrastructure as a Service facility based on OpenStack. The storage will eventually consist of over 100 petabytes located at 6 nodes. All...
Dr
Salvatore Tupputi
(Universita e INFN (IT))
14/10/2013, 17:25
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The automation of ATLAS Distributed Computing (ADC) operations is essential to reduce manpower costs and allow performance-enhancing actions that improve the reliability of the system. In this perspective, a crucial case is the automatic exclusion/recovery of ATLAS computing sites' storage resources, which are continuously exploited at the edge of their capabilities.
It is challenging to...
Iban Jose Cabrillo Bartolome
(Universidad de Cantabria (ES))
14/10/2013, 17:47
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The Altamira supercomputer at the Institute of Physics of Cantabria (IFCA) entered into operation in the summer of 2012.
Its last-generation FDR InfiniBand network, used for message passing in parallel jobs, also supports the connection to General Parallel File System (GPFS) servers, enabling efficient processing of multiple data-demanding jobs at the same time.
Sharing a common GPFS system with...
Mario Ubeda Garcia
(CERN),
Victor Mendez Munoz
(PIC)
15/10/2013, 13:30
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
This contribution describes how Cloud resources have been integrated in the LHCb Distributed Computing. LHCb is using Dirac and its LHCb-specific extension LHCbDirac as an interware for its Distributed Computing. So far it has seamlessly integrated Grid resources and computer clusters. The cloud extension of Dirac (VMDIRAC) extends it to the integration of Cloud computing...
Dag Larsen
(University of Silesia (PL))
15/10/2013, 13:52
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
Currently, the NA61/SHINE data production is performed on the CERN shared batch system, an approach inherited from its predecessor NA49. New data productions are initiated by manually submitting jobs to the batch system. An effort is now under way to migrate the data production to an automatic system, on top of a fully virtualised platform based on CernVM. There are several motivations for...
Dr
David Colling
(Imperial College Sci., Tech. & Med. (GB))
15/10/2013, 14:14
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The High Level Trigger (HLT) farm in CMS is a processor farm of more than ten thousand cores that is heavily used during data acquisition and largely unused when the detector is off. In this presentation we will cover the work done in CMS to utilize this large processing resource with cloud resource provisioning techniques. This resource, when configured with OpenStack and Agile Infrastructure...
Dr
Dario Menasce
(INFN Milano-Bicocca)
15/10/2013, 14:36
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
Radiation detectors usually require complex calibration procedures in order to provide reliable activity measurements. The Milano-Bicocca group has developed, over the years, a complex simulation tool, based on GEANT4, that provides the functionality required to compute the correction factors necessary for such calibrations in a broad range of use cases, considering various radioactive source...
Lucien Boland
(University of Melbourne)
15/10/2013, 15:45
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The Nectar national research cloud provides compute resources to Australian researchers using OpenStack. CoEPP, a WLCG Tier2 member, wants to use Nectar's cloud resources for Tier 2 and Tier 3 processing for ATLAS and other experiments including Belle, as well as theoretical computation. CoEPP would prefer to use the Torque job management system in the cloud because they have extensive...
Georgios Lestaris
(CERN)
15/10/2013, 16:07
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
In a virtualized environment, contextualization is the process of configuring a VM instance for the needs of various deployment use cases. Contextualization in CernVM can be done by passing a handwritten context to the "user data" field of cloud APIs, when running CernVM on the cloud, or by using the CernVM web interface when running the VM locally. CernVM Online is a publicly accessible web...
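As a rough illustration of the contextualization mechanism described above, the sketch below passes a handwritten context through the EC2-style "user data" field with boto3; the image ID, instance type and the context content are placeholders, not values taken from the abstract, and the amiconfig-style section shown is only one common convention.

```python
# Minimal sketch: contextualising a cloud VM by passing a handwritten context
# through the EC2-compatible "user data" field with boto3.
import boto3

# Hand-written contextualisation payload; the exact format expected by a given
# CernVM version may differ (amiconfig-style sections are shown as an example).
user_data = """\
[cernvm]
organisations=myexperiment
repositories=myexperiment
"""

ec2 = boto3.client("ec2", region_name="eu-west-1")
ec2.run_instances(
    ImageId="ami-00000000",     # hypothetical CernVM image ID
    InstanceType="m1.small",    # hypothetical instance type
    MinCount=1,
    MaxCount=1,
    UserData=user_data,         # the contextualisation payload
)
```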
Jakob Blomer
(CERN)
15/10/2013, 16:29
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The traditional virtual machine building and deployment process is centered around the virtual machine hard disk image. The packages comprising the VM operating system are carefully selected, hard disk images are built for a variety of different hypervisors, and images have to be distributed and decompressed in order to instantiate a virtual machine. Within the HEP community, the CernVM...
Dario Berzano
(CERN)
15/10/2013, 16:51
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
PROOF, the Parallel ROOT Facility, is a ROOT-based framework which enables interactive parallelism for event-based tasks on a cluster of computing nodes.
Although PROOF can be used simply from within a ROOT session with no additional requirements, deploying and configuring a PROOF cluster used to be far less straightforward. Recently, great effort has been spent on making the provisioning of...
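For readers unfamiliar with PROOF, the sketch below shows the kind of minimal usage from within a (Py)ROOT session that the abstract refers to; the tree name, input file and selector are hypothetical placeholders.

```python
# Minimal sketch of using PROOF from a PyROOT session.
import ROOT

proof = ROOT.TProof.Open("")          # "" opens a local PROOF-Lite session

chain = ROOT.TChain("events")         # hypothetical tree name
chain.Add("data/run1.root")           # hypothetical input file
chain.SetProof()                      # route Process() calls through PROOF
chain.Process("MySelector.C+")        # hypothetical TSelector, compiled with ACLiC
```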
Dr
Jose Caballero Bejar
(Brookhaven National Laboratory (US))
15/10/2013, 17:25
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The Open Science Grid (OSG) encourages the concept of software portability: a user's scientific application should be able to run in as many operating system environments as possible. This is typically accomplished by compiling the software into a single static binary, or distributing any dependencies in an archive downloaded by each job. However, the concept of portability runs against the...
Using the CVMFS for Distributing Data Analysis Applications for the Fermilab Intensity Frontier
Andrew Norman
(Fermilab)
15/10/2013, 17:47
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The CernVM File System (CVMFS) provides a technology for efficiently distributing code and application files to large and varied collections of computing resources. The CVMFS model and infrastructure have been used to provide a new, scalable solution to the previously difficult task of application and code distribution for grid computing.
At Fermilab, a new CVMFS-based application...
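As a hedged sketch of how a job might consume software published on CVMFS, the snippet below checks a hypothetical repository mount point and prepends a release directory to the PATH; the repository name and directory layout are assumptions for illustration, not Fermilab's actual setup.

```python
# Minimal sketch: picking up experiment software from a CVMFS mount point.
import os

CVMFS_REPO = "/cvmfs/myexperiment.example.org"   # hypothetical repository

def latest_release(repo=CVMFS_REPO):
    """Return the newest release directory published in the repository."""
    releases = sorted(os.listdir(os.path.join(repo, "releases")))
    return os.path.join(repo, "releases", releases[-1]) if releases else None

if os.path.isdir(CVMFS_REPO):
    release = latest_release()
    if release:
        # Prepend the release's bin directory so the job uses the CVMFS-served tools.
        os.environ["PATH"] = os.path.join(release, "bin") + os.pathsep + os.environ["PATH"]
else:
    raise RuntimeError("CVMFS repository not mounted on this worker node")
```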
Mr
Igor Sfiligoi
(University of California San Diego)
17/10/2013, 11:00
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
Scientific communities have been at the forefront of adopting new technologies and methodologies in computing. Scientific computing has influenced how science is done today, achieving breakthroughs that were impossible several decades ago. For the past decade, several such communities in the Open Science Grid (OSG) and the European Grid Infrastructure (EGI) have been using the...
Andrew McNab
(University of Manchester (GB))
17/10/2013, 11:22
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
We present a model for the operation of computing nodes at a site using virtual machines, in which the virtual machines (VMs) are created and contextualised for virtual organisations (VOs) by the site itself. For the VO, these virtual machines appear to be produced spontaneously "in the vacuum" rather than in response to requests by the VO. This model takes advantage of the pilot...
Stefano Bagnasco
(Universita e INFN (IT))
17/10/2013, 11:45
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
In a typical scientific computing centre, diverse applications coexist and share a single physical infrastructure. An underlying Private Cloud infrastructure eases the management and maintenance of such heterogeneous applications (such as multipurpose or application-specific batch farms, Grid sites catering to different communities, parallel interactive data analysis facilities and...
Dr
Jose Antonio Coarasa Perez
(CERN)
17/10/2013, 12:06
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The CMS online cluster consists of more than 3000 computers. It has been exclusively used for the Data Acquisition of the CMS experiment at CERN, archiving around 20 TB of data per day.
An OpenStack cloud layer has been deployed on part of the cluster (totalling more than 13,000 cores) as a minimal overlay so as to leave the primary role of the computers untouched while allowing an...
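A minimal sketch of how worker VMs can be started on such an OpenStack overlay with the openstacksdk client is shown below; the cloud name, image, flavor and network are placeholders and do not describe the actual CMS deployment.

```python
# Minimal sketch: launching a worker VM on an OpenStack cloud with openstacksdk.
import openstack

conn = openstack.connect(cloud="cms-online")       # hypothetical clouds.yaml entry

image = conn.compute.find_image("worker-node")     # hypothetical image name
flavor = conn.compute.find_flavor("m1.large")      # hypothetical flavor
network = conn.network.find_network("private")     # hypothetical network

server = conn.compute.create_server(
    name="wn-0001",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)      # block until the VM is ACTIVE
```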
Dr
Gongxing Sun
(INSTITUE OF HIGH ENERGY PHYSICS)
17/10/2013, 13:30
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
This paper brings the idea of MapReduce parallel processing to BESIII physics analysis and presents a new data analysis system structure based on the Hadoop framework. It optimizes the data processing workflow by establishing an event-level metadata (TAG) database and performing event pre-selection based on TAGs, significantly reducing the number of events that need further analysis by 2-3 classes, which...
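The sketch below illustrates, in plain Python, the TAG-based pre-selection idea expressed as map and reduce steps; the TAG fields and the selection cut are hypothetical, and in the system described above this logic would run inside the Hadoop framework rather than in-process.

```python
# Minimal sketch: TAG-based event pre-selection written as map/reduce steps.
from functools import reduce

# Each TAG record is lightweight event metadata keyed by (run, event).
tags = [
    {"run": 1, "event": 10, "ntracks": 42, "etot": 3.1},
    {"run": 1, "event": 11, "ntracks": 3,  "etot": 0.2},
    {"run": 2, "event": 57, "ntracks": 35, "etot": 2.8},
]

def map_select(tag):
    """Map step: emit (run, [event]) only for events passing the TAG cut."""
    if tag["ntracks"] > 10 and tag["etot"] > 1.0:     # hypothetical pre-selection cut
        return [(tag["run"], [tag["event"]])]
    return []

def reduce_merge(acc, pair):
    """Reduce step: collect selected event numbers per run."""
    run, events = pair
    acc.setdefault(run, []).extend(events)
    return acc

mapped = [pair for tag in tags for pair in map_select(tag)]
selected = reduce(reduce_merge, mapped, {})
print(selected)   # {1: [10], 2: [57]} -> only these events go on to full analysis
```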
Dr
Edward Karavakis
(CERN)
17/10/2013, 13:52
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The Worldwide LHC Computing Grid (WLCG) today includes more than 170 computing centres where more than 2 million jobs are executed daily and petabytes of data are transferred between sites. Monitoring the computing activities of the LHC experiments over such a huge heterogeneous infrastructure is extremely demanding in terms of computation, performance and reliability. Furthermore,...
Mr
Stefano Alberto Russo
(Universita degli Studi di Udine (IT))
17/10/2013, 14:14
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
Hadoop/MapReduce is a very common and widely supported distributed computing framework. It consists of a scalable programming model named MapReduce and a locality-aware distributed file system (HDFS). Its main feature is the implementation of data locality: through the fusion of computing and storage resources, and thanks to the locality awareness of HDFS, the computation can be scheduled on the nodes...
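A minimal sketch of a Hadoop Streaming job in Python, one common way to run analysis code under the MapReduce model described above, follows; the input format (one event record per line with a leading channel field) and the invocation shown in the comment are assumptions for illustration.

```python
# Minimal sketch: mapper and reducer for a Hadoop Streaming job counting events per channel.
import sys

def mapper():
    """Map step: emit 'channel<TAB>1' for every event record read from stdin."""
    for line in sys.stdin:
        fields = line.split()
        if fields:
            print(f"{fields[0]}\t1")     # fields[0] is the hypothetical channel tag

def reducer():
    """Reduce step: sum counts per channel; Streaming delivers keys already sorted."""
    current, total = None, 0
    for line in sys.stdin:
        key, value = line.rstrip("\n").split("\t")
        if key != current and current is not None:
            print(f"{current}\t{total}")
            total = 0
        current = key
        total += int(value)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    # Run e.g. as "-mapper 'python job.py map' -reducer 'python job.py reduce'" in Streaming.
    mapper() if len(sys.argv) > 1 and sys.argv[1] == "map" else reducer()
```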
Giacinto Donvito
(Universita e INFN (IT))
17/10/2013, 14:38
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
This work presents the testing activities that were carried out to verify whether the SLURM batch system could be used as the production batch system of a typical Tier-1/Tier-2 HEP computing center. SLURM (Simple Linux Utility for Resource Management) is an open-source batch system developed mainly by the Lawrence Livermore National Laboratory, SchedMD, Linux NetworX, Hewlett-Packard, and Groupe...
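For illustration only, the sketch below submits a trivial test job to SLURM from Python via sbatch, the kind of exercise such a verification might start from; the partition name and resource requests are placeholders.

```python
# Minimal sketch: composing and submitting a test job to SLURM via sbatch.
import subprocess
import tempfile

job_script = """\
#!/bin/bash
#SBATCH --job-name=hep-test
#SBATCH --partition=grid          # hypothetical partition name
#SBATCH --ntasks=1
#SBATCH --mem=2000
#SBATCH --time=01:00:00
echo "running on $(hostname)"
"""

with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as f:
    f.write(job_script)
    script_path = f.name

# sbatch prints "Submitted batch job <id>" on success.
result = subprocess.run(["sbatch", script_path], capture_output=True, text=True, check=True)
print(result.stdout.strip())
```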