Claudio Kopper
14/10/2013, 13:30
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
The IceCube Neutrino Observatory is a cubic kilometer-scale neutrino detector built into the ice sheet at the geographic South Pole. Light propagation in glacial ice is an important component of IceCube detector simulation that requires a large number of embarrassingly parallel calculations. The IceCube collaboration recently began using GPUs in order to simulate direct propagation of...
Philippe Canal
(Fermi National Accelerator Lab. (US))
14/10/2013, 13:53
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
We will present massively parallel transport of high-energy electromagnetic particles through a finely segmented detector on a Graphics Processing Unit (GPU). Simulating events of energetic particle decay in a general-purpose high energy physics (HEP) detector requires intensive computing resources, due to the complexity of the geometry as well as physics processes applied to particles...
Qiming Lu
(Fermi National Accelerator Laboratory)
14/10/2013, 14:16
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
Synergia is a parallel, 3-dimensional space-charge particle-in-cell code that is widely used by the accelerator modeling community. We present our work of porting the pure MPI-based code to a hybrid of CPU and GPU computing kernels. The hybrid code uses the CUDA platform, in the same framework as the pure MPI solution. We have implemented a lock-free collaborative charge-deposition algorithm...
Dr Tareq AbuZayyad
(University of Utah)
14/10/2013, 14:39
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
The Telescope Array Cosmic Rays Detector located in the Western Utah Desert is used for the observation of ultra-high energy cosmic rays. The simulation of a fluorescence detector's response to cosmic-ray-initiated air showers presents many opportunities for parallelization. In this presentation we report on the Monte Carlo program used for the simulation of the Telescope Array fluorescence...
Dr Peter Elmer
(Princeton University (US))
14/10/2013, 15:45
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
Modern HEP software stacks, such as those used by the LHC experiments
at CERN, involve many millions of lines of custom code per experiment,
as well as a number of similarly sized shared packages (ROOT, Geant4,
etc.). Thousands of people have made contributions over time to these
code bases, including graduate students, postdocs, professional
researchers and software/computing...
Jim Kowalkowski
(Fermilab)
14/10/2013, 16:07
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
For nearly two decades, the C++ programming language has been the
dominant programming language for experimental HEP. The publication of
ISO/IEC 14882:2011, the current version of the international standard
for the C++ programming language, makes available a variety of language
and library facilities for improving the robustness, expressiveness, and
computational efficiency of C++ code....
Stefan Lohn
(CERN)
14/10/2013, 16:29
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
Software optimization is a complex process, where the intended improvements have different effects on different platforms, with multiple operating systems and an ongoing introduction of new hardware. In addition, several compilers produce differing object code as a result of different internal optimization procedures. To trace back the impact of the optimizations is going to become more...
Wim Lavrijsen
(Lawrence Berkeley National Lab. (US))
14/10/2013, 16:50
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
The Python programming language brings a dynamic, interactive environment to physics analysis. With PyPy, high performance can be delivered as well by making use of its tracing just-in-time (JIT) compiler and cppyy for C++ bindings, as cppyy is able to exploit common HEP coding patterns. For example, ROOT I/O with cppyy runs at speeds equal to that of optimized, hand-tuned C++.
Python does...
Pascal Costanza
(ExaScience Lab, Intel, Belgium)
14/10/2013, 17:25
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
Using Intel's SIMD architecture (SSE, AVX) to speed up operations on containers of complex class and structure objects is challenging, because it requires that the same data members of the different objects within a container be laid out next to each other, in a structure of arrays (SOA) fashion. Currently, programming languages do not provide automatic ways for arranging containers as...
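The layout distinction drawn in this abstract can be illustrated with a small sketch. It is not taken from the talk: the `(x, y, z)` records and the helper names `pack_aos`, `pack_soa`, `field_offsets` and `read_field` are hypothetical, and Python is used only to make the byte offsets visible, not to demonstrate SIMD speed-ups.

```python
import struct

DOUBLE = struct.calcsize("<d")  # size of one packed double: 8 bytes


def pack_aos(records):
    """Array of structures: x0 y0 z0 x1 y1 z1 ... (members of one object are adjacent)."""
    flat = [value for record in records for value in record]
    return struct.pack(f"<{len(flat)}d", *flat)


def pack_soa(records):
    """Structure of arrays: x0 x1 ... y0 y1 ... (the same member of all objects is adjacent)."""
    flat = [value for column in zip(*records) for value in column]
    return struct.pack(f"<{len(flat)}d", *flat)


def field_offsets(count, fields, field, layout):
    """Byte offsets touched when looping over one member of every object."""
    if layout == "aos":
        return [DOUBLE * (i * fields + field) for i in range(count)]
    return [DOUBLE * (field * count + i) for i in range(count)]


def read_field(buffer, offsets):
    """Read one double at each of the given byte offsets."""
    return [struct.unpack_from("<d", buffer, off)[0] for off in offsets]


# Hypothetical (x, y, z) records standing in for "complex class and structure objects".
particles = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)]
aos = pack_aos(particles)
soa = pack_soa(particles)

x_in_aos = field_offsets(len(particles), 3, 0, "aos")  # strided: one value every 24 bytes
x_in_soa = field_offsets(len(particles), 3, 0, "soa")  # contiguous: one value every 8 bytes
```

In the SOA buffer all `x` values sit in consecutive memory, which is the property a vector load needs; in the AOS buffer the same loop has to skip over the `y` and `z` members of every object.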
Andrei Gheata
(CERN)
14/10/2013, 17:46
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
Among the components contributing to particle transport, geometry navigation is an important consumer of CPU cycles. The tasks performed to get answers to "basic" queries like locating a point within a geometry hierarchy or accurately computing the distance to the next boundary can become very compute-intensive for complex detector setups. Among several optimization methods already in use by...
Dr Peter Elmer
(Princeton University (US))
15/10/2013, 13:30
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
In the last decade, power limitations led to the introduction of
multicore CPUs. The cores on the processors were, however, not
dramatically different from the processors just before the
multicore-era. In some sense, this was merely a tactical choice to
maximize compatibility and buy time. The same scaling problems that
led to the power limit are likely to push processors in the...
Sebastiano Schifano
(U)
15/10/2013, 13:53
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
An interesting evolution in scientific computing is represented by the
streamline introduction of co-processor boards that were originally built to
accelerate graphics rendering and that are now being used to perform
general computing tasks. A peculiarity of these boards (GPGPU, or
General-Purpose Graphics Processing Units, and many-core boards like
the Intel Xeon Phi) is that they...
Daniel Funke
(KIT - Karlsruhe Institute of Technology (DE))
15/10/2013, 14:16
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
The Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) at CERN near Geneva, Switzerland, is a general-purpose particle detector which led, among many other results, to the discovery of a Higgs-like particle in 2012. It comprises the largest silicon-based tracking system built to date, with 75 million individual readout channels and a total surface area of 205 m^2.
The...
Rolf Edward Andreassen
(University of Cincinnati (US))
15/10/2013, 14:38
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
We present a general framework for maximum-likelihood fitting, in which GPUs are used to massively parallelise the per-event probability calculation. For realistic physics fits we achieve speedups, relative to executing the same algorithm on a single CPU, of several hundred.
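The structure that makes such a fit parallelisable can be sketched in a few lines. This is not the framework described in the abstract: it assumes a one-dimensional Gaussian model with known width, runs serially on the CPU, and uses a hypothetical helper `fit_mean` based on golden-section search. The point is only that the likelihood is a map over independent events followed by a sum.

```python
import math
import random


def gaussian_pdf(x, mu, sigma):
    """Per-event probability density; each event is evaluated independently."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def negative_log_likelihood(events, mu, sigma):
    # The per-event evaluation is the embarrassingly parallel part
    # (one thread per event on a GPU); the sum is a reduction.
    return -sum(math.log(gaussian_pdf(x, mu, sigma)) for x in events)


def fit_mean(events, sigma, lo, hi, tol=1e-6):
    """Golden-section search for the mu that minimises the NLL (hypothetical helper)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if negative_log_likelihood(events, c, sigma) < negative_log_likelihood(events, d, sigma):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


# Toy data set: 2000 events drawn from a Gaussian with mean 2 and width 1.
random.seed(1)
events = [random.gauss(2.0, 1.0) for _ in range(2000)]
mu_hat = fit_mean(events, sigma=1.0, lo=-5.0, hi=5.0)
sample_mean = sum(events) / len(events)
```

For this model the maximum-likelihood estimate of the mean coincides with the sample mean, which gives a simple cross-check of the minimiser.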
Vardan Gyurjyan
(Jefferson Lab)
15/10/2013, 15:45
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
The majority of developed physics data processing (PDP) applications are single, sequential processes that start at a point in time and advance one step at a time until they are finished. In the current era of cloud computing and multi-core hardware architectures, this approach has noticeable limitations.
In this paper we present a detailed evaluation of the FBP-based CLAS12 event...
Niko Neufeld
(CERN)
15/10/2013, 16:07
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
The ARM architecture is a power-efficient design that is used in most processors in mobile devices around the world today, since it provides reasonable compute performance per watt. The current LHCb software stack is designed (and expected) to build and run on machines with the x86/x86_64 architecture. This paper outlines the process of measuring the performance of the LHCb software stack...
Mr Davide Salomoni
(INFN CNAF)
Dr Elisabetta Ronchieri
(INFN CNAF)
Mr Marco Canaparo
(INFN CNAF)
Mr Vincenzo Ciaschini
(INFN CNAF)
15/10/2013, 16:29
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
Software packages in our scientific environment are constantly growing in size, and are written by any number of developers. This implies a strong churn on the code itself, and an associated risk of bugs and stability problems. This risk is unavoidable as long as the software undergoes active evolution, as always happens with software that is still in use. However, the necessity of having...
Michal Husejko
(CERN)
15/10/2013, 16:51
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
This contribution describes how CERN has designed and integrated multiple essential tools for agile software development processes, ranging from version control (Git) to issue tracking (Jira) and documentation (Wikis).
Running such services in a large organisation like CERN requires many administrative actions both by users and the service providers, such as creating software projects,...
Fons Rademakers
(CERN)
15/10/2013, 17:25
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
The parametric function classes of ROOT (TFormula and TF1) have been improved using the capabilities of Cling/LLVM. We will present how formula expressions can now be compiled on the fly using the just-in-time capabilities of LLVM/Cling. Furthermore, using the new features of C++11, one can build complex function expressions by re-using the existing mathematical functions. We will also show the...
Danilo Piparo
(CERN)
15/10/2013, 17:47
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
During the first four years of data taking at the Large Hadron Collider (LHC), the simulation and reconstruction programs of the experiments proved to be extremely resource consuming. In particular, for complex event simulation and reconstruction applications, the impact of evaluating elementary functions on the runtime is sizeable (up to one fourth of the total), with an obvious effect on the...
Andrzej Nowak
(CERN)
17/10/2013, 11:00
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
This paper summarizes the five years of CERN openlab's efforts focused on the Intel Xeon Phi co-processor, from the time of its inception to public release. We consider the architecture of the device vis-à-vis the characteristics of HEP software and identify key opportunities for HEP processing, as well as scaling limitations. We report on improvements and speedups linked to parallelization...
Danilo Piparo
(CERN)
17/10/2013, 11:22
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
The necessity for really thread-safe experiment software has recently become very evident, largely driven by the evolution of CPU architectures towards exploiting increasing levels of parallelism. For high-energy physics this represents a real paradigm shift, as concurrent programming was previously limited to special, well-defined domains like control software or software framework...
Prof. Peter Hobson
(Brunel University (GB))
Dr Raul Lopes
(School of Design and Engineering - Brunel University, UK)
17/10/2013, 11:44
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
Variations of kd-trees represent a fundamental data structure frequently used in geometrical algorithms, Computational Statistics, and clustering. They have numerous applications, for example in track fitting, in the software of the LHC experiments and in physics in general. Computer simulations of N-body systems, for example, have seen applications in the study of dynamics of interacting...
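For readers unfamiliar with the data structure, a basic kd-tree with nearest-neighbour search can be sketched as follows. This is a textbook serial version under simple assumptions (points as tuples, median split by sorting, nodes as dictionaries); it is not the variant or the parallel implementation discussed in the talk, and the function names are illustrative.

```python
def build_kdtree(points, depth=0):
    """Recursively split the points at the median along one axis per level."""
    if not points:
        return None
    axis = depth % len(points[0])  # cycle through the coordinate axes
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, target, best=None):
    """Return the stored point closest to target."""
    if node is None:
        return best
    if best is None or squared_distance(node["point"], target) < squared_distance(best, target):
        best = node["point"]
    axis = node["axis"]
    diff = target[axis] - node["point"][axis]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, target, best)
    # Descend into the far side only if the splitting plane is closer
    # than the best match found so far; otherwise prune that subtree.
    if diff * diff < squared_distance(best, target):
        best = nearest(far, target, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
closest = nearest(tree, (9, 2))  # (8, 1)
```

The pruning test in `nearest` is what makes the search cheaper than a scan over all points: whole subtrees are skipped whenever they cannot contain a closer point.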
Dr Federico Carminati
(CERN)
17/10/2013, 12:06
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yielded limited results, and recent studies have shown that, on modern architectures, it achieves between 10% and 50% of peak performance. Although several successful attempts have been made to port selected codes...
Mario Lassnig
(CERN)
17/10/2013, 13:30
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
Rucio is the next-generation data management system supporting ATLAS physics workflows in the coming decade. The software engineering process to develop Rucio is fundamentally different to existing software development approaches in the ATLAS distributed computing community. Based on a conceptual design document, development takes place using peer-reviewed code in a test-driven environment,...
Rocco Mandrysch
(University of Iowa (US))
17/10/2013, 13:53
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
In a complex multi-developer, multi-package software environment, such as the ATLAS offline Athena framework, tracking the performance of the code can be a non-trivial task in itself. In this paper we describe improvements in the instrumentation of ATLAS offline software that have given considerable insight into the performance of the code and helped to guide optimisation.
Code can be...
Mr Giulio Eulisse
(Fermi National Accelerator Lab. (US))
17/10/2013, 14:16
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
CMS Offline Software, CMSSW, is an extremely large software project, with roughly 3 million lines of code, two hundred active developers and two to three active development branches. Given the scale of the problem, both from a technical and a human point of view, keeping such a large project on track and bug free, and delivering builds for different architectures, is a challenge in...
Mr Dennis Van Dok
(Nikhef (NL))
17/10/2013, 14:38
Software Engineering, Parallelism & Multi-Core
Oral presentation to parallel session
The LCMAPS family of grid middleware has improved in recent years by moving from a custom build system to open source community standards for building, packaging and distributing. This contribution outlines what improvements were made and the benefits they rendered.
LCMAPS, gLExec and related middleware components were developed under a series of European framework program projects,...