Alessandro Di Girolamo
(CERN)
14/10/2013, 13:30
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The WLCG information system is just one of the many information sources required to populate a VO configuration database. Other sources include central portals such as the GOCDB and the OIM, from EGI and OSG respectively. Providing a coherent view of all this information, synchronized from many different sources, is a challenging activity and has been duplicated to various...
Dmytro Karpenko
(University of Oslo (NO))
14/10/2013, 13:52
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
During three years of LHC data taking, the ATLAS collaboration completed three petascale data reprocessing campaigns on the Grid, with up to 2 PB of data being reprocessed every year. In reprocessing on the Grid, failures can occur for a variety of reasons, while Grid heterogeneity makes failures hard to diagnose and repair quickly. As a result, Big Data processing on the Grid must tolerate a...
Gerardo Ganis
(CERN)
14/10/2013, 14:15
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The advent of private and commercial cloud platforms has opened the question of evaluating the cost-effectiveness of such solutions for computing in High Energy Physics.
Google Compute Engine (GCE) is an IaaS product launched by Google as an experimental platform during 2012 and now open to the public market.
In this contribution we present the results of a set of CPU-intensive and...
Dr
Jerome LAURET
(BROOKHAVEN NATIONAL LABORATORY)
14/10/2013, 14:36
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
User Centric Monitoring (or UCM) has been a long-awaited feature in STAR, whereby programs, workflows and system "events" can be logged, broadcast and later analyzed. UCM makes it possible to collect and filter available job monitoring information from various resources and present it to users from a user-centric rather than an administrator-centric point of view. The first attempt and...
Ramon Medrano Llamas
(CERN)
14/10/2013, 15:45
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
In order to ease the management of their infrastructure, most of the WLCG sites are adopting cloud-based strategies. CERN, the Tier 0 of the WLCG, is completely restructuring the resource and configuration management of its computing centre under the codename Agile Infrastructure. Its goal is to manage 15,000 Virtual Machines by means of an OpenStack middleware in order to...
Ian Fisk
(Fermi National Accelerator Lab. (US)),
Jacob Thomas Linacre
(Fermi National Accelerator Lab. (US))
14/10/2013, 16:07
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
During Spring 2013, CMS processed 1 billion RAW data events at the San Diego Supercomputer Center (SDSC), a facility nearly half the size of the dedicated CMS Tier-1 processing resources. This facility has none of the permanent CMS services, service level agreements, or support normally associated with a Tier-1, and was assembled with a few weeks' notice to process only a few workflows. The size...
Dr
Friederike Nowak
(DESY)
14/10/2013, 16:29
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
In 2007, the National Analysis Facility (NAF) was set up within the framework of the Helmholtz Alliance "Physics at the Terascale", and is located at DESY. Its purpose was to provide an analysis infrastructure for up-to-date research in Germany, complementing the Grid by offering interactive access to the data. It has been well received within the physics community, and has proven to...
Dr
Antonio Limosani
(University of Melbourne (AU))
14/10/2013, 16:51
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The Australian Government is making a $AUD 100 million investment in Compute and Storage for the academic community. The Compute facilities are provided in the form of 24,000 CPU cores located at 8 nodes around Australia in a distributed virtualized Infrastructure as a Service facility based on OpenStack. The storage will eventually consist of over 100 petabytes located at 6 nodes. All...
Dr
Salvatore Tupputi
(Universita e INFN (IT))
14/10/2013, 17:25
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The automation of ATLAS Distributed Computing (ADC) operations is essential to reduce manpower costs and allow performance-enhancing actions that improve the reliability of the system. In this perspective, a crucial case is the automatic exclusion/recovery of ATLAS computing sites' storage resources, which are continuously exploited at the edge of their capabilities.
It is challenging to...
Iban Jose Cabrillo Bartolome
(Universidad de Cantabria (ES))
14/10/2013, 17:47
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The Altamira supercomputer at the Institute of Physics of Cantabria (IFCA) entered into operation in the summer of 2012.
Its last-generation FDR InfiniBand network, used for message passing in parallel jobs, also supports the connection to General Parallel File System (GPFS) servers, enabling efficient processing of multiple data-demanding jobs at the same time.
Sharing a common GPFS system with...
Mario Ubeda Garcia
(CERN),
Victor Mendez Munoz
(PIC)
15/10/2013, 13:30
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
This contribution describes how Cloud resources have been integrated in the LHCb Distributed Computing. LHCb is using Dirac and its LHCb-specific extension LHCbDirac as an interware for its Distributed Computing. So far it has seamlessly integrated Grid resources and computer clusters. The cloud extension of Dirac (VMDIRAC) extends it to the integration of Cloud computing...
Dag Larsen
(University of Silesia (PL))
15/10/2013, 13:52
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
Currently, the NA61/SHINE data production is performed on the CERN shared batch system, an approach inherited from its predecessor NA49. New data productions are initiated by manually submitting jobs to the batch system. An effort is now under way to migrate the data production to an automatic system, on top of a fully virtualised platform based on CernVM. There are several motivations for...
Dr
David Colling
(Imperial College Sci., Tech. & Med. (GB))
15/10/2013, 14:14
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The High Level Trigger (HLT) farm in CMS is a processor farm of more than ten thousand cores that is heavily used during data acquisition and largely unused when the detector is off. In this presentation we will cover the work done in CMS to utilize this large processing resource with cloud resource provisioning techniques. This resource, when configured with OpenStack and Agile Infrastructure...
Dr
Dario Menasce
(INFN Milano-Bicocca)
15/10/2013, 14:36
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
Radiation detectors usually require complex calibration procedures in order to provide reliable activity measurements. The Milano-Bicocca group has developed, over the years, a complex simulation tool, based on GEANT4, that provides the functionality required to compute the correction factors necessary for such calibrations in a broad range of use cases, considering various radioactive source...
Lucien Boland
(University of Melbourne)
15/10/2013, 15:45
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The Nectar national research cloud provides compute resources to Australian researchers using OpenStack. CoEPP, a WLCG Tier2 member, wants to use Nectar's cloud resources for Tier 2 and Tier 3 processing for ATLAS and other experiments including Belle, as well as theoretical computation. CoEPP would prefer to use the Torque job management system in the cloud because they have extensive...
Georgios Lestaris
(CERN)
15/10/2013, 16:07
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
In a virtualized environment, contextualization is the process of configuring a VM instance for the needs of various deployment use cases. Contextualization in CernVM can be done by passing a handwritten context to the "user data" field of cloud APIs, when running CernVM on the cloud, or by using the CernVM web interface when running the VM locally. CernVM Online is a publicly accessible web...
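As a rough illustration of the contextualization mechanism described above, the sketch below passes a handwritten context through the EC2-style "user data" field with boto3; the image ID, instance type and the context content are placeholders, not values taken from the abstract, and the amiconfig-style section shown is only one common convention.

```python
# Minimal sketch: contextualising a cloud VM by passing a handwritten context
# through the EC2-compatible "user data" field with boto3.
import boto3

# Hand-written contextualisation payload; the exact format expected by a given
# CernVM version may differ (amiconfig-style sections are shown as an example).
user_data = """\
[cernvm]
organisations=myexperiment
repositories=myexperiment
"""

ec2 = boto3.client("ec2", region_name="eu-west-1")
ec2.run_instances(
    ImageId="ami-00000000",     # hypothetical CernVM image ID
    InstanceType="m1.small",    # hypothetical instance type
    MinCount=1,
    MaxCount=1,
    UserData=user_data,         # the contextualisation payload
)
```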
Jakob Blomer
(CERN)
15/10/2013, 16:29
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The traditional virtual machine building and deployment process is centered around the virtual machine hard disk image. The packages comprising the VM operating system are carefully selected, hard disk images are built for a variety of different hypervisors, and images have to be distributed and decompressed in order to instantiate a virtual machine. Within the HEP community, the CernVM...
Dario Berzano
(CERN)
15/10/2013, 16:51
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
PROOF, the Parallel ROOT Facility, is a ROOT-based framework which enables interactive parallelism for event-based tasks on a cluster of computing nodes.
Although PROOF can be used simply from within a ROOT session with no additional requirements, deploying and configuring a PROOF cluster used to be far less straightforward. Recently, great effort has been spent on making the provisioning of...
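For readers unfamiliar with PROOF, the sketch below shows the kind of minimal usage from within a (Py)ROOT session that the abstract refers to; the tree name, input file and selector are hypothetical placeholders.

```python
# Minimal sketch of using PROOF from a PyROOT session.
import ROOT

proof = ROOT.TProof.Open("")          # "" opens a local PROOF-Lite session

chain = ROOT.TChain("events")         # hypothetical tree name
chain.Add("data/run1.root")           # hypothetical input file
chain.SetProof()                      # route Process() calls through PROOF
chain.Process("MySelector.C+")        # hypothetical TSelector, compiled with ACLiC
```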
Dr
Jose Caballero Bejar
(Brookhaven National Laboratory (US))
15/10/2013, 17:25
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The Open Science Grid (OSG) encourages the concept of software portability: a user's scientific application should be able to run in as many operating system environments as possible. This is typically accomplished by compiling the software into a single static binary, or distributing any dependencies in an archive downloaded by each job. However, the concept of portability runs against the...
Using the CVMFS for Distributing Data Analysis Applications for the Fermilab Intensity Frontier
Andrew Norman
(Fermilab)
15/10/2013, 17:47
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The CernVM File System (CVMFS) provides a technology for efficiently distributing code and application files to large and varied collections of computing resources. The CVMFS model and infrastructure have been used to provide a new, scalable solution to the previously difficult task of application and code distribution for grid computing.
At Fermilab, a new CVMFS-based application...
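As a hedged sketch of how a job might consume software published on CVMFS, the snippet below checks a hypothetical repository mount point and prepends a release directory to the PATH; the repository name and directory layout are assumptions for illustration, not Fermilab's actual setup.

```python
# Minimal sketch: picking up experiment software from a CVMFS mount point.
import os

CVMFS_REPO = "/cvmfs/myexperiment.example.org"   # hypothetical repository

def latest_release(repo=CVMFS_REPO):
    """Return the newest release directory published in the repository."""
    releases = sorted(os.listdir(os.path.join(repo, "releases")))
    return os.path.join(repo, "releases", releases[-1]) if releases else None

if os.path.isdir(CVMFS_REPO):
    release = latest_release()
    if release:
        # Prepend the release's bin directory so the job uses the CVMFS-served tools.
        os.environ["PATH"] = os.path.join(release, "bin") + os.pathsep + os.environ["PATH"]
else:
    raise RuntimeError("CVMFS repository not mounted on this worker node")
```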
Mr
Igor Sfiligoi
(University of California San Diego)
17/10/2013, 11:00
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
Scientific communities have been at the forefront of adopting new technologies and methodologies in computing. Scientific computing has influenced how science is done today, achieving breakthroughs that were impossible several decades ago. For the past decade, several such communities in the Open Science Grid (OSG) and the European Grid Infrastructure (EGI) have been using the...
Andrew McNab
(University of Manchester (GB))
17/10/2013, 11:22
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
We present a model for the operation of computing nodes at a site using virtual machines, in which the virtual machines (VMs) are created and contextualised for virtual organisations (VOs) by the site itself. For the VO, these virtual machines appear to be produced spontaneously "in the vacuum" rather than in response to requests by the VO. This model takes advantage of the pilot...
Stefano Bagnasco
(Universita e INFN (IT))
17/10/2013, 11:45
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
In a typical scientific computing centre, diverse applications coexist and share a single physical infrastructure. An underlying Private Cloud infrastructure eases the management and maintenance of such heterogeneous applications (such as multipurpose or application-specific batch farms, Grid sites catering to different communities, parallel interactive data analysis facilities and...
Dr
Jose Antonio Coarasa Perez
(CERN)
17/10/2013, 12:06
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The CMS online cluster consists of more than 3000 computers. It has been exclusively used for the Data Acquisition of the CMS experiment at CERN, archiving around 20 TB of data per day.
An OpenStack cloud layer has been deployed on part of the cluster (totalling more than 13,000 cores) as a minimal overlay so as to leave the primary role of the computers untouched while allowing an...
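A minimal sketch of how worker VMs can be started on such an OpenStack overlay with the openstacksdk client is shown below; the cloud name, image, flavor and network are placeholders and do not describe the actual CMS deployment.

```python
# Minimal sketch: launching a worker VM on an OpenStack cloud with openstacksdk.
import openstack

conn = openstack.connect(cloud="cms-online")       # hypothetical clouds.yaml entry

image = conn.compute.find_image("worker-node")     # hypothetical image name
flavor = conn.compute.find_flavor("m1.large")      # hypothetical flavor
network = conn.network.find_network("private")     # hypothetical network

server = conn.compute.create_server(
    name="wn-0001",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)      # block until the VM is ACTIVE
```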
Dr
Gongxing Sun
(INSTITUE OF HIGH ENERGY PHYSICS)
17/10/2013, 13:30
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
This paper brings the idea of MapReduce parallel processing to BESIII physics analysis and presents a new data analysis system structure based on the Hadoop framework. It optimizes the data processing workflow by establishing an event-level metadata (TAG) database and performing event pre-selection based on TAGs, significantly reducing the number of events that need further analysis by 2-3 classes, which...
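The sketch below illustrates, in plain Python, the TAG-based pre-selection idea expressed as map and reduce steps; the TAG fields and the selection cut are hypothetical, and in the system described above this logic would run inside the Hadoop framework rather than in-process.

```python
# Minimal sketch: TAG-based event pre-selection written as map/reduce steps.
from functools import reduce

# Each TAG record is lightweight event metadata keyed by (run, event).
tags = [
    {"run": 1, "event": 10, "ntracks": 42, "etot": 3.1},
    {"run": 1, "event": 11, "ntracks": 3,  "etot": 0.2},
    {"run": 2, "event": 57, "ntracks": 35, "etot": 2.8},
]

def map_select(tag):
    """Map step: emit (run, [event]) only for events passing the TAG cut."""
    if tag["ntracks"] > 10 and tag["etot"] > 1.0:     # hypothetical pre-selection cut
        return [(tag["run"], [tag["event"]])]
    return []

def reduce_merge(acc, pair):
    """Reduce step: collect selected event numbers per run."""
    run, events = pair
    acc.setdefault(run, []).extend(events)
    return acc

mapped = [pair for tag in tags for pair in map_select(tag)]
selected = reduce(reduce_merge, mapped, {})
print(selected)   # {1: [10], 2: [57]} -> only these events go on to full analysis
```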
Dr
Edward Karavakis
(CERN)
17/10/2013, 13:52
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
The Worldwide LHC Computing Grid (WLCG) today includes more than 170 computing centres where more than 2 million jobs are executed daily and petabytes of data are transferred between sites. Monitoring the computing activities of the LHC experiments over such a huge heterogeneous infrastructure is extremely demanding in terms of computation, performance and reliability. Furthermore,...
Mr
Stefano Alberto Russo
(Universita degli Studi di Udine (IT))
17/10/2013, 14:14
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
Hadoop/MapReduce is a very common and widely supported distributed computing framework. It consists of a scalable programming model named MapReduce and a locality-aware distributed file system (HDFS). Its main feature is the implementation of data locality: through the fusion of computing and storage resources, and thanks to the locality awareness of HDFS, the computation can be scheduled on the nodes...
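A minimal sketch of a Hadoop Streaming job in Python, one common way to run analysis code under the MapReduce model described above, follows; the input format (one event record per line with a leading channel field) and the invocation shown in the comment are assumptions for illustration.

```python
# Minimal sketch: mapper and reducer for a Hadoop Streaming job counting events per channel.
import sys

def mapper():
    """Map step: emit 'channel<TAB>1' for every event record read from stdin."""
    for line in sys.stdin:
        fields = line.split()
        if fields:
            print(f"{fields[0]}\t1")     # fields[0] is the hypothetical channel tag

def reducer():
    """Reduce step: sum counts per channel; Streaming delivers keys already sorted."""
    current, total = None, 0
    for line in sys.stdin:
        key, value = line.rstrip("\n").split("\t")
        if key != current and current is not None:
            print(f"{current}\t{total}")
            total = 0
        current = key
        total += int(value)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    # Run e.g. as "-mapper 'python job.py map' -reducer 'python job.py reduce'" in Streaming.
    mapper() if len(sys.argv) > 1 and sys.argv[1] == "map" else reducer()
```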
Giacinto Donvito
(Universita e INFN (IT))
17/10/2013, 14:38
Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization
Oral presentation to parallel session
This work presents the testing activities that were carried out to verify whether the SLURM batch system could be used as the production batch system of a typical Tier-1/Tier-2 HEP computing center. SLURM (Simple Linux Utility for Resource Management) is an open-source batch system developed mainly by the Lawrence Livermore National Laboratory, SchedMD, Linux NetworX, Hewlett-Packard, and Groupe...
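For illustration only, the sketch below submits a trivial test job to SLURM from Python via sbatch, the kind of exercise such a verification might start from; the partition name and resource requests are placeholders.

```python
# Minimal sketch: composing and submitting a test job to SLURM via sbatch.
import subprocess
import tempfile

job_script = """\
#!/bin/bash
#SBATCH --job-name=hep-test
#SBATCH --partition=grid          # hypothetical partition name
#SBATCH --ntasks=1
#SBATCH --mem=2000
#SBATCH --time=01:00:00
echo "running on $(hostname)"
"""

with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as f:
    f.write(job_script)
    script_path = f.name

# sbatch prints "Submitted batch job <id>" on success.
result = subprocess.run(["sbatch", script_path], capture_output=True, text=True, check=True)
print(result.stdout.strip())
```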