Conveners
Track 4 Session: #1 (Middleware)
- Oliver Keeble (CERN)
Track 4 Session: #2 (Framework)
- Vincent Garonne (University of Oslo (NO))
Track 4 Session: #3 (Middleware)
- Marco Clemencic (CERN)
Track 4 Session: #4 (Application)
- Oliver Gutsche (Fermi National Accelerator Lab. (US))
Track 4 Session: #5 (Software)
- Andreas Heiss (KIT - Karlsruhe Institute of Technology (DE))
Track 4 Session: #6 (Application)
- Tony Wildish (Princeton University (US))
Description
Middleware, software development and tools, experiment frameworks, tools for distributed computing
Federico Stagni
(CERN)
4/13/15, 2:00 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
In the last few years, new types of computing infrastructures, such as IAAS (Infrastructure as a Service) and IAAC (Infrastructure as a Client), have gained popularity. Some new resources come as part of the pledged resources, while others are opportunistic. Most of these new infrastructures are based on virtualization techniques, while others are not. Meanwhile, some concepts, such as...
Tadashi Maeno
(Brookhaven National Laboratory (US))
4/13/15, 2:15 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
Experiments at the Large Hadron Collider (LHC) face unprecedented computing challenges. Heterogeneous resources are distributed worldwide at hundreds of sites, thousands of physicists analyze the data remotely, the volume of processed data is beyond the exabyte scale, while data processing requires more than a few billion hours of computing usage per year. The PanDA (Production and Distributed...
Dr
Antonio Perez-Calero Yzquierdo
(Centro de Investigaciones Energ. Medioambientales y Tecn. (ES))
4/13/15, 2:30 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
The successful exploitation of the multicore processor architectures available at the computing sites is a key element of the LHC distributed computing system in the coming era of the LHC Run 2. High-pileup complex-collision events represent a challenge for the traditional sequential programming in terms of memory and processing time budget. The CMS data production and processing framework has...
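As a rough illustration of the event-level parallelism such a framework exploits, a minimal sketch in Python follows; reconstruct_event and the in-memory event source are hypothetical stand-ins, not the CMS framework's actual API.

```python
# Minimal sketch of event-parallel processing on a multicore node.
# Illustrative only: reconstruct_event and the event source are
# hypothetical stand-ins, not the CMS framework's real interfaces.
from multiprocessing import Pool

def reconstruct_event(event):
    # Placeholder for a CPU-heavy reconstruction step.
    return sum(hit * hit for hit in event)

def main():
    events = [[float(i % 7) for i in range(1000)] for _ in range(10000)]
    with Pool(processes=8) as pool:  # one worker per core
        results = pool.map(reconstruct_event, events, chunksize=100)
    print(f"processed {len(results)} events")

if __name__ == "__main__":
    main()
```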
Nathalie Rauschmayr
(CERN)
4/13/15, 2:45 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
The main goal of a Workload Management System (WMS) is to find and allocate resources for the jobs it is handling. The more accurate the information the WMS receives about the jobs, the easier it becomes to accomplish this task, which translates directly into better utilization of resources. Traditionally, the information associated with each job, like expected runtime or memory...
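To make that point concrete, here is a minimal sketch, under invented numbers and names, of how per-job runtime and memory estimates let a scheduler pack jobs onto slots (simple first-fit); this is not the WMS described in the talk.

```python
# Sketch: why accurate per-job estimates help a WMS pack jobs.
# Jobs carry (expected runtime in hours, expected memory in GB);
# slots have fixed budgets. All numbers and names are hypothetical.
def first_fit(jobs, slots):
    """Assign each job to the first slot with enough remaining budget."""
    placement = {}
    for name, runtime, memory in jobs:
        for slot in slots:
            if slot["time_left"] >= runtime and slot["mem_free"] >= memory:
                slot["time_left"] -= runtime
                slot["mem_free"] -= memory
                placement[name] = slot["id"]
                break
        else:
            placement[name] = None  # no slot fits: keep queued
    return placement

slots = [{"id": i, "time_left": 24.0, "mem_free": 16.0} for i in range(2)]
jobs = [("reco_1", 10.0, 4.0), ("sim_2", 20.0, 2.0), ("ana_3", 6.0, 12.0)]
print(first_fit(jobs, slots))
```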
James Letts
(Univ. of California San Diego (US))
4/13/15, 3:00 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
CMS will require access to more than 125k processor cores at the beginning of Run 2 in 2015 to carry out its ambitious physics program with more and higher-complexity events. During Run 1 these resources were predominantly provided by a mix of grid sites and local batch resources. During the long shutdown, cloud infrastructures, diverse opportunistic resources and HPC supercomputing centers...
Vincent Garonne
(CERN)
4/13/15, 3:15 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
For more than 8 years, the Distributed Data Management (DDM) system of ATLAS, called DQ2, has demonstrated very large scale data management capabilities, with more than 600M files and 160 petabytes spread worldwide across 130 sites, accessed by 1,000 active users. However, the system does not scale to LHC Run 2, and a new DDM system called Rucio has been developed to be DQ2's...
Martin Barisits
(CERN)
4/13/15, 3:30 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
The ATLAS Distributed Data Management system stores more than 160PB of physics data across more than 130 sites globally. Rucio, the next-generation data management system of ATLAS has been introduced to cope with the anticipated workload of the coming decade. The previous data management system DQ2 pursued a rather simplistic approach for resource management, but with the increased data volume...
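For illustration only, a conceptual sketch of rule-based replica management of the kind described, with hypothetical site names and quotas; this is not Rucio's API.

```python
# Conceptual sketch of rule-based replica management (not Rucio's API):
# a rule declares how many copies a dataset needs; the engine picks
# sites that still have quota. Names and numbers are hypothetical.
def satisfy_rule(dataset_size_tb, copies, sites):
    chosen = []
    for site in sorted(sites, key=lambda s: s["free_tb"], reverse=True):
        if len(chosen) == copies:
            break
        if site["free_tb"] >= dataset_size_tb:
            site["free_tb"] -= dataset_size_tb
            chosen.append(site["name"])
    if len(chosen) < copies:
        raise RuntimeError("rule cannot be satisfied with current quotas")
    return chosen

sites = [{"name": "SITE_A", "free_tb": 120.0},
         {"name": "SITE_B", "free_tb": 40.0},
         {"name": "SITE_C", "free_tb": 5.0}]
print(satisfy_rule(dataset_size_tb=10.0, copies=2, sites=sites))
```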
Dr
Tony Wildish
(Princeton University (US))
4/13/15, 3:45 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
AsyncStageOut (ASO) is a new component of CRAB, the distributed data analysis system of CMS, designed for managing users' data. It addresses a major weakness of the previous model, namely that data movement was part of the job execution, resulting in inefficient use of job slots and an unacceptable failure rate at the end of the jobs.
ASO foresees the management of up to 400k files per day...
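A minimal sketch of the decoupling ASO introduces, using a plain producer/consumer queue; the file names and the sleep standing in for an actual FTS transfer are hypothetical.

```python
# Sketch of the decoupling described above: jobs finish and merely
# *enqueue* their output files; a separate transfer agent drains the
# queue, so job slots are freed immediately. Purely illustrative.
import queue
import threading
import time

transfer_queue = queue.Queue()

def job(job_id):
    # The job writes output locally and registers it for transfer,
    # instead of blocking on the WAN copy itself.
    transfer_queue.put(f"/store/user/out_{job_id}.root")

def transfer_agent():
    while True:
        lfn = transfer_queue.get()
        if lfn is None:          # sentinel: shut down
            break
        time.sleep(0.01)         # stand-in for the actual transfer
        print(f"staged out {lfn}")
        transfer_queue.task_done()

agent = threading.Thread(target=transfer_agent)
agent.start()
for i in range(5):
    job(i)                       # the job slot frees as soon as this returns
transfer_queue.join()
transfer_queue.put(None)
agent.join()
```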
David Schultz
(University of Wisconsin-Madison)
4/13/15, 4:30 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
We describe the overall structure and new features of the second generation of IceProd, a data processing and management framework. IceProd was developed by the IceCube Neutrino Observatory for processing of Monte Carlo simulations and detector data, and has been a key component of the IceCube offline computing infrastructure since it was first deployed in 2006. It runs fully in user space as...
Hideki Miyake
(KEK)
4/13/15, 4:45 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
In the Belle II experiment a large amount of physics data will be taken continuously, at a production rate equivalent to that of the LHC experiments.
Considerable computing, storage, and network resources are necessary to handle not only the recorded data but also substantial volumes of simulated data.
Therefore Belle II exploits a distributed computing system based on the DIRAC interware.
DIRAC is a general...
Federico Stagni
(CERN)
4/13/15, 5:00 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
The DIRAC workload management system used by LHCb Distributed Computing is based on Computing Resource reservation and late binding (also known as pilot jobs in the case of batch resources), which allows the serial execution of several jobs obtained from a central task queue. CPU resources can usually be reserved only for a limited duration (e.g. the batch queue time limit), and in order to optimize...
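A minimal sketch of pilot-style late binding under invented names (not DIRAC's actual interfaces): the pilot holds a reserved slot and serially pulls whatever fits in its remaining walltime from a central task queue.

```python
# Sketch of late binding: the job is matched to the slot only at the
# moment the pilot asks, and the pilot keeps pulling work until its
# remaining walltime is too short. Hypothetical names throughout.
def fetch_matching_job(task_queue, time_left):
    for job in list(task_queue):
        if job["runtime"] <= time_left:
            task_queue.remove(job)
            return job
    return None

def pilot(task_queue, walltime=8.0):
    used = 0.0
    while True:
        job = fetch_matching_job(task_queue, walltime - used)
        if job is None:
            break                      # nothing fits: release the slot
        used += job["runtime"]         # serial execution of matched jobs
        print(f"ran {job['name']} ({job['runtime']}h), {walltime - used}h left")

pilot([{"name": "mc_gen", "runtime": 3.0},
       {"name": "reco", "runtime": 4.0},
       {"name": "merge", "runtime": 2.0}])
```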
Dr
Torre Wenaus
(Brookhaven National Laboratory (US))
4/13/15, 5:15 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
The ATLAS Event Service (ES) implements a new fine grained approach to HEP event processing, designed to be agile and efficient in exploiting transient, short-lived resources such as HPC hole-filling, spot market commercial clouds, and volunteer computing. Input and output control and data flows, bookkeeping, monitoring, and data storage are all managed at the event level in an implementation...
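A minimal sketch of the event-level idea, with hypothetical names (not the Event Service implementation): work is dispatched in small event ranges and each range's output is recorded as soon as it completes, so preemption loses at most one range.

```python
# Sketch of event-level bookkeeping for preemptible resources: a
# killed worker loses at most the current small range, and only the
# unfinished ranges need re-dispatch. Illustrative only.
def process(ranges, budget):
    """Process event ranges until preempted (budget exhausted)."""
    done = []
    for first, last in ranges:
        if budget <= 0:
            break                      # node reclaimed mid-task
        done.append((first, last, f"events_{first}_{last}.out"))
        budget -= 1
    return done

ranges = [(i, min(i + 100, 1000)) for i in range(0, 1000, 100)]
finished = process(ranges, budget=4)   # preempted after 4 ranges
todo = ranges[len(finished):]          # only these need re-dispatch
print(f"{len(finished)} ranges safe, {len(todo)} to re-dispatch")
```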
Marco Mascheroni
(Universita & INFN, Milano-Bicocca (IT))
4/13/15, 5:30 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
The CMS Remote Analysis Builder (CRAB) provides the service for managing analysis tasks, isolating users from the technical details of the distributed Grid infrastructure. Throughout LHC Run 1, CRAB was successfully employed by an average of 350 distinct users every week, executing about 200,000 jobs per day.
In order to face the new challenges posed by the LHC Run 2, CRAB has been...
Sebastian Neubert
(CERN)
4/13/15, 5:45 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
Reproducibility of results is a fundamental quality of scientific research. However, as data analyses become more and more complex and research is increasingly carried out by larger and larger teams, it becomes a challenge to keep up this standard. The decomposition of complex problems into tasks that can be effectively distributed over a team in a reproducible manner becomes...
Dr
Tian Yan
(Institute of High Energy Physics, Chinese Academy of Sciences)
4/13/15, 6:00 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
For the Beijing Spectrometer III (BESIII) experiment located at the Institute of High Energy Physics (IHEP), China, the distributed computing environment (DCE) has been set up and in production since 2012. The basic framework, or middleware, is DIRAC (Distributed Infrastructure with Remote Agent Control) with BES-DIRAC extensions. About 2,000 CPU cores and 400 TB of storage contributed by...
Janusz Martyniak
(Imperial College London)
4/13/15, 6:15 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
The GridPP consortium in the UK is currently testing a multi-VO DIRAC service aimed at non-LHC VOs. These VOs are typically small (fewer than two hundred members) and generally do not have a dedicated computing support post. The majority of these represent particle physics experiments (e.g. T2K, NA62 and COMET), although the scope of the DIRAC service is not limited to this field. A few VOs...
Edgar Fajardo Hernandez
(Univ. of California San Diego (US))
4/14/15, 2:00 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
The HTCondor-CE is the next-generation gateway software for the Open Science Grid (OSG). It is responsible for providing a network service which authorizes remote users and provides a resource provisioning service (other well-known gatekeepers include Globus GRAM, CREAM, ARC-CE, and OpenStack's Nova). Based on the venerable HTCondor software, this new CE is simply a highly-specialized...
Andrej Filipcic
(Jozef Stefan Institute (SI))
4/14/15, 2:15 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
Distributed computing resources available for high-energy physics research are becoming less dedicated to one type of workflow and researchers' workloads are increasingly exploiting modern computing technologies such as parallelism. The current pilot job management model used by many experiments relies on static dedicated resources and cannot easily adapt to these changes. The model used for...
Jon Kerr Nilsen
(University of Oslo (NO))
4/14/15, 2:30 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
While current grid middlewares are quite advanced in terms of connecting jobs to resources, their client tools are generally quite minimal and features for managing large sets of jobs are left to the user to implement. The ARC Control Tower (aCT) is a very flexible job management framework that can be run on anything from a single user's laptop to a multi-server distributed setup. aCT was...
Andres Gomez Ramirez
(Johann-Wolfgang-Goethe Univ. (DE))
4/14/15, 2:45 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
Grid infrastructures allow users flexible, on-demand usage of computing resources over an Internet connection. A remarkable example of a Grid in High Energy Physics (HEP) research is the one used by the ALICE experiment at the European Organization for Nuclear Research (CERN). Physicists can submit jobs used to process the huge amount of particle collision data produced by the Large Hadron Collider (LHC) at...
Dr
Tony Wildish
(Princeton)
4/14/15, 3:00 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
The ANSE project has been working with the CMS and ATLAS experiments to bring network awareness into their middleware stacks. For CMS, this means enabling control of virtual network circuits in PhEDEx, the CMS data-transfer management system. PhEDEx orchestrates the transfer of data around the CMS experiment to the tune of 1 PB per week spread over about 70 sites.
The goal of ANSE is to...
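As a toy model of the kind of decision such network awareness enables, assuming invented rates and thresholds: request a dedicated circuit only when the queued volume justifies its setup cost.

```python
# Sketch of a circuit-request decision of the kind ANSE enables in a
# transfer manager: compare the time to drain a backlog over the
# shared path versus a dedicated circuit with setup overhead.
# Rates, thresholds and names are hypothetical.
def should_request_circuit(queued_tb, shared_rate_tb_h, circuit_rate_tb_h,
                           setup_h=0.5, min_gain_h=2.0):
    t_shared = queued_tb / shared_rate_tb_h
    t_circuit = setup_h + queued_tb / circuit_rate_tb_h
    return (t_shared - t_circuit) >= min_gain_h

# A 50 TB backlog: a 10x faster circuit wins despite its setup time.
print(should_request_circuit(queued_tb=50, shared_rate_tb_h=1.0,
                             circuit_rate_tb_h=10.0))   # True
```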
Dr
Alexei Klimentov
(Brookhaven National Laboratory (US))
4/14/15, 3:15 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
A crucial contributor to the success of the massively scaled global computing system that delivers the analysis needs of the LHC experiments is the networking infrastructure upon which the system is built. The experiments have been able to exploit excellent high-bandwidth networking in adapting their computing models for the most efficient utilization of resources.
New advanced networking...
Alessandra Forti
(University of Manchester (GB))
4/14/15, 3:30 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
After the successful first run of the LHC, data taking will restart in early 2015 with unprecedented experimental conditions, leading to increased data volumes and event complexity. In order to process the data generated in such a scenario and exploit the multicore architectures of current CPUs, the LHC experiments have developed parallelized software for data reconstruction and simulation. A...
Dr
Wenji Wu
(Fermi National Accelerator Laboratory)
4/14/15, 3:45 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
Multicore and manycore platforms have become the norm for scientific computing environments. Multicore/manycore platform architectures provide advanced capabilities and features that can be exploited to enhance data movement performance for large-scale distributed computing environments, such as the LHC. However, existing data movement tools do not take full advantage of these capabilities and features....
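As a minimal illustration of exploiting multiple cores for data movement (not the tools discussed in the talk), a chunked parallel file copy:

```python
# Sketch of multicore-aware data movement: split a file into chunks
# and let several workers copy chunks concurrently, the general idea
# behind exploiting manycore hosts for transfers. Illustrative only.
import os
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4 * 1024 * 1024  # 4 MiB

def copy_chunk(src, dst, offset, length):
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.write(fin.read(length))

def parallel_copy(src, dst, workers=4):
    size = os.path.getsize(src)
    with open(dst, "wb") as f:      # pre-allocate the destination
        f.truncate(size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for off in range(0, size, CHUNK):
            pool.submit(copy_chunk, src, dst, off, min(CHUNK, size - off))
```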
Mr
Jason Alexander Smith
(Brookhaven National Laboratory)
4/14/15, 4:30 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
Using centralized configuration management, including automation tools such as Puppet, can greatly increase provisioning speed and efficiency when configuring new systems or making changes to existing systems, reduce duplication of work, and improve automated processes. However, centralized management also brings with it a level of inherent risk: a single change in just one file can...
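One common mitigation for that risk, sketched below with a real `puppet parser validate` syntax check but otherwise hypothetical deploy and health-check stubs: validate every changed manifest, then exercise it on a canary host before fleet-wide rollout.

```python
# Sketch of a guarded rollout. `puppet parser validate` is a real
# Puppet command; the deploy and health-check functions are
# hypothetical stubs for illustration only.
import subprocess
import sys

def validate(files):
    for f in files:
        r = subprocess.run(["puppet", "parser", "validate", f])
        if r.returncode != 0:
            sys.exit(f"validation failed for {f}; aborting rollout")

def deploy(files, hosts):
    print(f"deploying {files} to {hosts}")      # stub

def health_check(host):
    return True                                  # stub

def rollout(files, fleet, canary):
    validate(files)
    deploy(files, hosts=[canary])
    if not health_check(canary):
        sys.exit("canary unhealthy; change not promoted")
    deploy(files, hosts=fleet)                   # only now touch the fleet

rollout(["site.pp"], fleet=["node01", "node02"], canary="canary01")
```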
Alessandro De Salvo
(Universita e INFN, Roma I (IT))
4/14/15, 4:45 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
The ATLAS Installation System v2 is the evolution of the original system, used since 2003. The original tool has been completely re-designed in terms of database backend and components, adding support for submission to multiple backends, including the original WMS and the new Panda modules. The database engine has been changed from plain MySQL to Galera/Percona and the table structure has been...
Dr
Giuseppe Avolio
(CERN)
4/14/15, 5:00 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
Complex Event Processing (CEP) is a methodology that combines data from different sources in order to identify events or patterns that need particular attention. It has gained a lot of momentum in the computing world in the past few years and is used in ATLAS to continuously monitor the behaviour of the data acquisition system, to trigger corrective actions and to guide the experiment's...
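A minimal sketch of the CEP idea, with an invented pattern and thresholds: correlate events arriving from a source inside a time window and act when a pattern appears.

```python
# Toy complex-event-processing detector: keep a sliding time window
# of monitoring events and fire a corrective action when a pattern
# (here: repeated errors from one source) shows up. The pattern,
# source names and thresholds are invented for illustration.
from collections import deque

WINDOW_S = 60.0

def make_detector():
    events = deque()   # (timestamp, source, kind)
    def feed(ts, source, kind):
        events.append((ts, source, kind))
        while events and ts - events[0][0] > WINDOW_S:
            events.popleft()   # expire events outside the window
        errors = [e for e in events if e[1] == source and e[2] == "ERROR"]
        if len(errors) >= 3:   # pattern: >=3 errors within the window
            print(f"corrective action for {source} at t={ts}")
    return feed

feed = make_detector()
for t in (0, 10, 20):
    feed(t, "ros-42", "ERROR")   # the third error triggers the action
```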
Mr
Tigran Mkrtchyan
(DESY)
4/14/15, 5:15 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
Over the past years, storage providers in scientific infrastructures have been facing a significant change in the usage profile of their resources. While in the past a small number of experiment frameworks accessed those resources in a coherent manner, now a large number of small groups or even individuals request access in a completely chaotic way. Moreover, scientific laboratories...
Peter Onyisi
(University of Texas (US))
4/14/15, 5:30 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
During LHC Run 1, the information flow through the offline data quality monitoring in ATLAS relied heavily on chains of processes polling each other's outputs for handshaking purposes. This resulted in a fragile architecture with many possible points of failure and an inability to monitor the overall state of the distributed system. We report on the status of a project undertaken during the...
Bruno Lange Ramos
(Univ. Federal do Rio de Janeiro (BR)),
4/14/15, 5:45 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
In order to manage a heterogeneous and worldwide collaboration, the ATLAS experiment developed web systems that range from supporting the process of publishing scientific papers to monitoring equipment radiation levels. These systems are vastly supported by Glance, a technology that was set forward in 2004 to create an abstraction layer on top of different databases; it automatically...
Andrew Hanushevsky
(STANFORD LINEAR ACCELERATOR CENTER)
4/14/15, 6:00 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
As more experiments move to a federated model of data access the environment becomes highly distributed and decentralized. In many cases this may pose obstacles in quickly resolving site issues; especially given vast time-zone differences. Spurred by ATLAS needs, Release 4 of XRootD incorporates a special mode of access to provide remote debugging capabilities. Essentially, XRootD allows a...
Dr
Maria Grazia Pia
(Universita e INFN (IT))
4/16/15, 9:00 AM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
Testable physics by design
The validation of physics calculations requires the capability to thoroughly test them. The difficulty of exposing parts of the software to adequate testing can be the source of incorrect physics functionality, which in turn may generate hard to identify systematic effects in physics observables produced by the experiments.
Starting from real-life examples...
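As a minimal illustration of designing physics code for testability (the formula and tolerances here are generic examples, not taken from the talk): isolate the calculation in a small function so it can be checked against exact reference values.

```python
# Sketch of "testable physics by design": a physics formula kept in
# a small, pure function is trivially exposed to unit testing. The
# example uses the relativistic gamma factor; tolerances are invented.
import math

def lorentz_gamma(beta):
    if not 0 <= beta < 1:
        raise ValueError("beta must be in [0, 1)")
    return 1.0 / math.sqrt(1.0 - beta * beta)

def test_lorentz_gamma():
    # reference: beta = 0.6 gives gamma = 1.25 exactly
    assert math.isclose(lorentz_gamma(0.6), 1.25, rel_tol=1e-12)
    assert math.isclose(lorentz_gamma(0.0), 1.0, rel_tol=1e-12)

test_lorentz_gamma()
```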
Elisabetta Ronchieri
(INFN)
4/16/15, 9:15 AM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
Geant4 is a widespread simulation system of "particles through matter" used in several experimental areas from high energy physics and nuclear experiments to medical studies. Some of its applications may involve critical use cases; therefore they would benefit from an objective assessment of the software quality of Geant4. The issue of maintainability is especially relevant for such a widely...
Danilo Piparo
(CERN)
4/16/15, 9:30 AM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
The sixth release cycle of ROOT is characterised by a radical modernisation of the core software technologies the toolkit relies on: language standard, interpreter, and hardware exploitation mechanisms. If, on the one hand, the change offered the opportunity of consolidating the existing codebase, in the presence of such innovations, maintaining the balance between full backward compatibility and...
Philippe Canal
(Fermi National Accelerator Lab. (US))
4/16/15, 9:45 AM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
Following the release of version 6, ROOT has entered a new era of development. It will leverage the industrial-strength compiler library shipping in ROOT 6 and its support of the C++11/14 standard to significantly simplify and harden ROOT's interfaces and to clarify and substantially improve ROOT's support for multi-threaded environments.
This talk will also recap the most important new...
Mr
Giulio Eulisse
(Fermi National Accelerator Lab. (US))
4/16/15, 10:15 AM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
In recent years the size and scale of scientific computing has grown significantly. Computing facilities have grown to the point where energy availability and costs have become important limiting factors for data-center size and density. At the same time, power density limitations in processors themselves are driving interest in more heterogeneous processor architectures. Optimizing...
Oliver Keeble
(CERN)
4/16/15, 11:00 AM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
The overall success of LHC data processing depends heavily on stable, reliable and fast data distribution. The Worldwide LHC Computing Grid (WLCG) relies on the File Transfer Service (FTS) as the data movement middleware for moving sets of files from one site to another.
This paper describes the components of FTS3 monitoring infrastructure and how they are built to satisfy the common and...
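For illustration, a sketch of the kind of monitoring consumer such an infrastructure serves; the endpoint URL and JSON layout here are hypothetical, not FTS3's actual REST schema.

```python
# Sketch of polling a transfer service for job states and aggregating
# them per state, the basic task of a transfer-monitoring consumer.
# The URL and JSON layout are hypothetical, not FTS3's real schema.
from collections import Counter
import requests

def transfer_states(base_url="https://fts3.example.org:8446"):
    resp = requests.get(f"{base_url}/jobs", params={"limit": 100},
                        timeout=10)
    resp.raise_for_status()
    # assume the (hypothetical) endpoint returns a list of job records
    return Counter(job["job_state"] for job in resp.json())

if __name__ == "__main__":
    print(transfer_states())
```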
Luca Mascetti
(CERN)
4/16/15, 11:15 AM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
CERNBox is a cloud synchronisation service for end-users: it allows them to sync and share files on all major mobile and desktop platforms (Linux, Windows, MacOSX, Android, iOS), aiming to provide offline availability for any data stored in the CERN EOS infrastructure.
The successful beta phase of the service confirmed the high demand in the community for such easily accessible cloud storage...
Parag Mhashilkar
(Fermi National Accelerator Laboratory)
4/16/15, 11:30 AM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
The FabrIc for Frontier Experiments (FIFE) program is an ambitious, major-impact initiative within the Fermilab Scientific Computing Division designed to lead the computing model development for Fermilab experiments and external projects. FIFE is a collaborative effort between physicists and computing professionals to provide computing solutions for experiments of varying scale, needs, and...
Tom Uram
(Argonne National Laboratory)
4/16/15, 11:45 AM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
HEP's demand for computing resources has grown beyond the capacity of the Grid, and these demands will accelerate with the higher energy and luminosity planned for Run II. Mira, the ten petaflops supercomputer at the Argonne Leadership Computing Facility, is a potentially significant compute resource for HEP research. Through an award of fifty million hours on Mira, we have delivered millions...
Dr
Robert Andrew Currie
(Imperial College Sci., Tech. & Med. (GB))
4/16/15, 12:00 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
The DIRAC interware was originally developed within the LHCb VO as a common interface to access distributed resources, i.e. grids, clouds and local batch systems. It has been used successfully in this context by the LHCb VO for a number of years. In April 2013 the GridPP consortium in the UK decided to offer a DIRAC service to a number of small VOs. The majority of these had been...
Dr
Andrew Norman
(Fermilab)
4/16/15, 12:15 PM
Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
oral presentation
As high energy physics experiments have grown, their operational needs and the requirements they place on computing systems have changed. These changes often require new technical solutions to meet the increased demands and functionality of the science. How do you effect sweeping change to core infrastructure without causing major interruptions to the scientific programs?
This paper explores the...