- Alessandro Di Girolamo (CERN), 14/10/2013, 13:30. Distributed Processing and Data Handling A: Infrastructure, Sites, and Virtualization; oral presentation to parallel session. The WLCG information system is just one of the many information sources that are required to populate a VO configuration database. Other sources include central portals such as the GOCDB and the OIM, from EGI and OSG respectively. Providing a coherent view of all this information, synchronized from many different sources, is a challenging activity and has been duplicated to various...
- Dmytro Karpenko (University of Oslo (NO)), 14/10/2013, 13:52. During three years of LHC data taking, the ATLAS collaboration completed three petascale data reprocessing campaigns on the Grid, with up to 2 PB of data being reprocessed every year. In reprocessing on the Grid, failures can occur for a variety of reasons, while Grid heterogeneity makes failures hard to diagnose and repair quickly. As a result, Big Data processing on the Grid must tolerate a...
- Gerardo Ganis (CERN), 14/10/2013, 14:15. The advent of private and commercial cloud platforms has opened the question of evaluating the cost-effectiveness of such solutions for computing in High Energy Physics. Google Compute Engine (GCE) is an IaaS product launched by Google as an experimental platform during 2012 and now open to the public market. In this contribution we present the results of a set of CPU-intensive and...
- Dr Jerome Lauret (Brookhaven National Laboratory), 14/10/2013, 14:36. User Centric Monitoring (UCM) has been a long-awaited feature in STAR, whereby programs, workflows and system "events" can be logged, broadcast and later analyzed. UCM makes it possible to collect and filter available job monitoring information from various resources and present it to users in a user-centric rather than an administrative-centric view. The first attempt and...
- Ramon Medrano Llamas (CERN), 14/10/2013, 15:45. In order to ease the management of their infrastructure, most WLCG sites are adopting cloud-based strategies. CERN, the Tier 0 of the WLCG, is completely restructuring the resource and configuration management of its computing center under the codename Agile Infrastructure. Its goal is to manage 15,000 virtual machines by means of an OpenStack middleware in order to...
- Ian Fisk (Fermi National Accelerator Lab. (US)), Jacob Thomas Linacre (Fermi National Accelerator Lab. (US)), 14/10/2013, 16:07. During Spring 2013, CMS processed 1 billion RAW data events at the San Diego Supercomputer Center (SDSC), a facility nearly half the size of the dedicated CMS Tier-1 processing resources. This facility has none of the permanent CMS services, service level agreements, or support normally associated with a Tier-1, and was assembled with a few weeks' notice to process only a few workflows. The size...
- Dr Friederike Nowak (DESY), 14/10/2013, 16:29. In 2007, the National Analysis Facility (NAF) was set up within the framework of the Helmholtz Alliance "Physics at the Terascale", and is located at DESY. Its purpose was the provision of an analysis infrastructure for up-to-date research in Germany, complementing the Grid by offering interactive access to the data. It has been well received within the physics community, and has proven to...
- Dr Antonio Limosani (University of Melbourne (AU)), 14/10/2013, 16:51. The Australian Government is making an AUD 100 million investment in compute and storage for the academic community. The compute facilities are provided in the form of 24,000 CPU cores located at 8 nodes around Australia in a distributed, virtualized Infrastructure-as-a-Service facility based on OpenStack. The storage will eventually consist of over 100 petabytes located at 6 nodes. All...
- Dr Salvatore Tupputi (Universita e INFN (IT)), 14/10/2013, 17:25. The automation of ATLAS Distributed Computing (ADC) operations is essential to reduce manpower costs and allow performance-enhancing actions which improve the reliability of the system. A crucial case in this perspective is the automatic exclusion/recovery of ATLAS computing sites' storage resources, which are continuously exploited at the edge of their capabilities. It is challenging to...
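The exclusion/recovery automation described above can be illustrated as a simple hysteresis on recent functional-test results. The sketch below is not ADC code; the thresholds, window size, and names are all invented for illustration:

```python
# Hedged sketch of automatic exclusion/recovery of a storage resource:
# exclude the endpoint when its recent test-failure rate climbs above one
# threshold, and recover it once the rate drops below a lower one.
# Two different thresholds give hysteresis, avoiding rapid flapping.
def decide(history, state, exclude_above=0.5, recover_below=0.2, window=10):
    """history: list of booleans (True = functional test failed), newest last.
    state: 'active' or 'excluded'. Returns the new state."""
    recent = history[-window:]
    if not recent:
        return state  # no data: keep the current state
    failure_rate = sum(recent) / len(recent)
    if state == "active" and failure_rate > exclude_above:
        return "excluded"
    if state == "excluded" and failure_rate < recover_below:
        return "active"
    return state
```

The asymmetric thresholds are the point of the design: a site is only recovered once it is clearly healthy again, not the moment it dips below the exclusion level.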
- Iban Jose Cabrillo Bartolome (Universidad de Cantabria (ES)), 14/10/2013, 17:47. The Altamira supercomputer at the Institute of Physics of Cantabria (IFCA) entered operation in summer 2012. Its last-generation FDR InfiniBand network, used for message passing in parallel jobs, also supports the connection to General Parallel File System (GPFS) servers, enabling efficient simultaneous processing of multiple data-demanding jobs. Sharing a common GPFS system with...
- Mario Ubeda Garcia (CERN), Victor Mendez Munoz (PIC), 15/10/2013, 13:30. This contribution describes how Cloud resources have been integrated in LHCb Distributed Computing. LHCb uses Dirac and its LHCb-specific extension LHCbDirac as an interware for its Distributed Computing, which so far has seamlessly integrated Grid resources and computer clusters. The cloud extension of Dirac (VMDIRAC) extends it to the integration of Cloud computing...
- Dag Larsen (University of Silesia (PL)), 15/10/2013, 13:52. Currently, the NA61/SHINE data production is performed on the CERN shared batch system, an approach inherited from its predecessor NA49. New data productions are initiated by manually submitting jobs to the batch system. An effort is now under way to migrate the data production to an automatic system, on top of a fully virtualised platform based on CernVM. There are several motivations for...
- Dr David Colling (Imperial College Sci., Tech. & Med. (GB)), 15/10/2013, 14:14. The Higher Level Trigger (HLT) farm in CMS is a farm of more than ten thousand processor cores that is heavily used during data acquisition and largely unused when the detector is off. In this presentation we will cover the work done in CMS to utilize this large processing resource with cloud resource provisioning techniques. This resource, when configured with OpenStack and Agile Infrastructure...
- Dr Dario Menasce (INFN Milano-Bicocca), 15/10/2013, 14:36. Radiation detectors usually require complex calibration procedures in order to provide reliable activity measurements. The Milano-Bicocca group has developed, over the years, a complex simulation tool, based on GEANT4, that provides the functionality required to compute the correction factors necessary for such calibrations in a broad range of use cases, considering various radioactive source...
- Lucien Boland (University of Melbourne), 15/10/2013, 15:45. The Nectar national research cloud provides compute resources to Australian researchers using OpenStack. CoEPP, a WLCG Tier-2 member, wants to use Nectar's cloud resources for Tier-2 and Tier-3 processing for ATLAS and other experiments including Belle, as well as for theoretical computation. CoEPP would prefer to use the Torque job management system in the cloud because they have extensive...
- Georgios Lestaris (CERN), 15/10/2013, 16:07. In a virtualized environment, contextualization is the process of configuring a VM instance for the needs of various deployment use cases. Contextualization in CernVM can be done by passing a handwritten context to the "user data" field of cloud APIs when running CernVM on the cloud, or by using the CernVM web interface when running the VM locally. CernVM Online is a publicly accessible web...
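A hand-written context of the kind passed through the cloud API's "user data" field might look like the fragment below. This is only a sketch: the section and key names follow the amiconfig convention used by CernVM contexts as we understand it, and every value is illustrative:

```ini
[cernvm]
# amiconfig-style CernVM context sketch -- all values are placeholders
users = alice:alice:secret          # user:group:password to create
shell = /bin/bash                   # login shell for the created user
repositories = myexperiment         # CVMFS repositories to enable
```

At boot, the contextualization agent inside the VM reads this user data and applies each section, so the same base image can serve many deployment use cases.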
- Jakob Blomer (CERN), 15/10/2013, 16:29. The traditional virtual machine building and deployment process is centered around the virtual machine hard disk image. The packages comprising the VM operating system are carefully selected, hard disk images are built for a variety of different hypervisors, and images have to be distributed and decompressed in order to instantiate a virtual machine. Within the HEP community, the CernVM...
- Dario Berzano (CERN), 15/10/2013, 16:51. PROOF, the Parallel ROOT Facility, is a ROOT-based framework which enables interactive parallelism for event-based tasks on a cluster of computing nodes. Although PROOF can be used simply from within a ROOT session with no additional requirements, deploying and configuring a PROOF cluster used to be far from straightforward. Recently, great efforts have been spent to make the provisioning of...
- Dr Jose Caballero Bejar (Brookhaven National Laboratory (US)), 15/10/2013, 17:25. The Open Science Grid (OSG) encourages the concept of software portability: a user's scientific application should be able to run in as many operating system environments as possible. This is typically accomplished by compiling the software into a single static binary, or distributing any dependencies in an archive downloaded by each job. However, the concept of portability runs against the...
- Using the CVMFS for Distributing Data Analysis Applications for the Fermilab Intensity Frontier. Andrew Norman (Fermilab), 15/10/2013, 17:47. The CernVM File System (CVMFS) provides a technology for efficiently distributing code and application files to large and varied collections of computing resources. The CVMFS model and infrastructure have been used to provide a new, scalable solution to the previously difficult task of application and code distribution for grid computing. At Fermilab, a new CVMFS-based application...
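On the client side, enabling a CVMFS repository typically amounts to a short configuration fragment like the sketch below; the repository name and proxy URL are placeholders, not Fermilab's actual settings:

```ini
# /etc/cvmfs/default.local -- minimal CVMFS client configuration sketch
CVMFS_REPOSITORIES=myexperiment.example.org
CVMFS_HTTP_PROXY="http://squid.example.org:3128"
CVMFS_QUOTA_LIMIT=20000   # local cache size limit, in MB
```

Once mounted, the repository appears as a read-only tree under /cvmfs/, and files are fetched over HTTP and cached locally on first access, which is what makes the model scale to large, varied collections of worker nodes.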
- Mr Igor Sfiligoi (University of California San Diego), 17/10/2013, 11:00. Scientific communities have been at the forefront of adopting new computing technologies and methodologies. Scientific computing has influenced how science is done today, achieving breakthroughs that were impossible several decades ago. For the past decade several such communities in the Open Science Grid (OSG) and the European Grid Infrastructure (EGI) have been using the...
- Andrew McNab (University of Manchester (GB)), 17/10/2013, 11:22. We present a model for the operation of computing nodes at a site using virtual machines, in which the virtual machines (VMs) are created and contextualised for virtual organisations (VOs) by the site itself. For the VO, these virtual machines appear to be produced spontaneously "in the vacuum" rather than in response to requests by the VO. This model takes advantage of the pilot...
- Stefano Bagnasco (Universita e INFN (IT)), 17/10/2013, 11:45. In a typical scientific computing centre, diverse applications coexist and share a single physical infrastructure. An underlying private Cloud infrastructure eases the management and maintenance of such heterogeneous applications (such as multipurpose or application-specific batch farms, Grid sites catering to different communities, parallel interactive data analysis facilities and...
- Dr Jose Antonio Coarasa Perez (CERN), 17/10/2013, 12:06. The CMS online cluster consists of more than 3000 computers. It has been used exclusively for the data acquisition of the CMS experiment at CERN, archiving around 20 TB of data per day. An OpenStack cloud layer has been deployed on part of the cluster (totalling more than 13000 cores) as a minimal overlay, so as to leave the primary role of the computers untouched while allowing an...
- Dr Gongxing Sun (Institute of High Energy Physics), 17/10/2013, 13:30. This paper brings the idea of MapReduce parallel processing to BESIII physics analysis, presenting a new data analysis system structure based on the Hadoop framework. It optimizes the data processing workflow by establishing an event-level metadata (TAG) database and performing event pre-selection based on TAGs, significantly reducing the number of events that need further analysis by 2-3 orders of magnitude, which...
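The TAG-based pre-selection idea can be illustrated with a small, self-contained MapReduce-style sketch in Python. The event structure, the TAG definition (here, a charged-track count), and the cut are all invented for illustration and bear no relation to the actual BESIII code:

```python
# MapReduce-style event pre-selection on event-level metadata (TAGs):
# map emits (tag, event_id) pairs, a filter keeps only events whose TAG
# passes the selection, and reduce counts the survivors per tag value.
from collections import defaultdict

def map_phase(events):
    """Emit (tag, event_id) for every event; the TAG is a tiny
    event-level summary (here, the number of charged tracks)."""
    for ev in events:
        yield ev["ntracks"], ev["id"]

def tag_filter(pairs, min_tracks=2):
    """Pre-selection on TAGs alone: events failing the cut are dropped
    before any expensive full-event analysis is ever scheduled."""
    return ((tag, eid) for tag, eid in pairs if tag >= min_tracks)

def reduce_phase(pairs):
    """Count selected events per TAG value."""
    counts = defaultdict(int)
    for tag, _ in pairs:
        counts[tag] += 1
    return dict(counts)

# Toy dataset: 8 events with 0..3 charged tracks, cycling.
events = [{"id": i, "ntracks": i % 4} for i in range(8)]
selected = reduce_phase(tag_filter(map_phase(events)))
```

Because the filter consults only the compact TAG database, the full event records for rejected events never need to be read, which is where the large reduction in downstream work comes from.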
- Dr Edward Karavakis (CERN), 17/10/2013, 13:52. The Worldwide LHC Computing Grid (WLCG) today includes more than 170 computing centres, where more than 2 million jobs are executed daily and petabytes of data are transferred between sites. Monitoring the computing activities of the LHC experiments over such a huge heterogeneous infrastructure is extremely demanding in terms of computation, performance and reliability. Furthermore,...
- Mr Stefano Alberto Russo (Universita degli Studi di Udine (IT)), 17/10/2013, 14:14. Hadoop/MapReduce is a very common and widely supported distributed computing framework. It consists of a scalable programming model named MapReduce and a locality-aware distributed file system (HDFS). Its main feature is data locality: through the fusion of computing and storage resources, and thanks to the locality-awareness of HDFS, computation can be scheduled on the nodes...
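The locality-aware scheduling described above can be sketched in a few lines of Python. This is a toy model, not Hadoop's scheduler: each data block is replicated on a few nodes, and the scheduler prefers to place a block's task on a free node that already holds the block, falling back to any free node (a remote read) otherwise. All node and block names are illustrative:

```python
# Toy locality-aware task scheduler in the spirit of Hadoop/HDFS.
def schedule(block_locations, free_nodes):
    """block_locations: {block: set of nodes holding a replica}.
    free_nodes: nodes with a free task slot.
    Returns {block: node}, preferring data-local placement."""
    free = set(free_nodes)
    assignment = {}
    # First pass: data-local assignments (task runs where its data lives).
    for block, holders in block_locations.items():
        local = holders & free
        if local:
            node = sorted(local)[0]  # deterministic choice for the sketch
            assignment[block] = node
            free.remove(node)
    # Second pass: leftover blocks go to any free node (remote read).
    for block in block_locations:
        if block not in assignment and free:
            assignment[block] = sorted(free)[0]
            free.remove(assignment[block])
    return assignment
```

The two-pass structure captures the essential trade-off: local placement is free bandwidth-wise, while the fallback pass trades network traffic for keeping all slots busy.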
- Giacinto Donvito (Universita e INFN (IT)), 17/10/2013, 14:38. In this work we present the testing activities carried out to verify whether the SLURM batch system could be used as the production batch system of a typical Tier-1/Tier-2 HEP computing center. SLURM (Simple Linux Utility for Resource Management) is an open-source batch system developed mainly by Lawrence Livermore National Laboratory, SchedMD, Linux NetworX, Hewlett-Packard, and Groupe...
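In a SLURM-based production setup of this kind, users submit work as conventional batch scripts. The sketch below uses standard sbatch directives; the partition, resource values, and payload script name are invented for illustration:

```bash
#!/bin/bash
#SBATCH --job-name=reco-run        # job name (illustrative)
#SBATCH --partition=production     # target partition/queue (illustrative)
#SBATCH --ntasks=1                 # a typical single-core HEP job
#SBATCH --mem=2000                 # memory request, in MB
#SBATCH --time=24:00:00            # wall-clock limit

srun ./run_reconstruction.sh       # payload script (illustrative)
```

Submission is then simply `sbatch job.sh`, and the directives map onto the same per-job resource requests (queue, cores, memory, wall time) that a Tier-1/Tier-2 site would previously have configured in its existing batch system.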