Access to and exploitation of large-scale computing resources, such as those offered by general-purpose
HPC centres, is one important measure for ATLAS and the other Large Hadron Collider experiments
to meet the challenge posed by the full exploitation of future data within the constraints of flat budgets.
We report on the effort of moving the Swiss WLCG Tier-2 computing,
serving ATLAS, CMS...
Based on GooFit, a GPU-friendly framework for doing maximum-likelihood fits, we have developed a tool for extracting model-independent S-wave amplitudes from three-body decays such as D+ --> h(')- h+ h+. A full amplitude analysis is done in which the magnitudes and phases of the S-wave amplitudes (or, alternatively, the real and imaginary components) are anchored at a finite number of...
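As a purely illustrative sketch of the anchoring idea (not the GooFit implementation itself), the snippet below interpolates an S-wave magnitude and phase between a handful of anchor points in the two-body invariant mass squared; the anchor positions and values are made-up placeholders for what a fit would float.

    import numpy as np

    # Hypothetical anchor points in m^2(h h) [GeV^2]; magnitudes and phases
    # are the free parameters a fit would determine (placeholder values here).
    anchors_m2   = np.array([0.4, 0.8, 1.2, 1.8, 2.5])
    anchor_mag   = np.array([1.0, 1.4, 0.9, 0.5, 0.2])
    anchor_phase = np.array([0.0, 0.6, 1.1, 1.8, 2.3])   # radians

    def swave_amplitude(m2):
        """Model-independent S-wave: interpolate magnitude and phase between
        the anchors and combine them into a complex amplitude."""
        mag = np.interp(m2, anchors_m2, anchor_mag)
        phi = np.interp(m2, anchors_m2, anchor_phase)
        return mag * np.exp(1j * phi)

    print(swave_amplitude(np.array([0.5, 1.0, 2.0])))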
The SND detector takes data at the e+e- collider VEPP-2000 in Novosibirsk. We present here
recent upgrades of the SND DAQ system, which are mainly aimed at handling the increased event
rate after the collider modernization. To maintain acceptable event-selection quality, the electronics
throughput and computational power must be increased. These goals are achieved with the new fast...
The installation of Virtual Visit services by the LHC collaborations began shortly after the first high energy collisions were provided by the CERN accelerator in 2010. The ATLAS, CMS, LHCb, and ALICE experiments have all joined in this popular and effective method to bring the excitement of scientific exploration and discovery into classrooms and other public venues around the world. Their...
CERN has been developing and operating EOS as a disk storage solution successfully for 5 years. The CERN deployment provides 135 PB and stores 1.2 billion replicas distributed over two computer centres. The deployment includes four LHC instances, a shared instance for smaller experiments and, since last year, an instance for individual user data as well. The user instance represents the backbone of...
Fifteen Chinese High Performance Computing sites, many of them on the TOP500 list of most powerful supercomputers, are integrated into a common infrastructure providing coherent access to users through a RESTful API called SCEAPI. These resources have been integrated into the ATLAS Grid production system using a bridge between ATLAS and SCEAPI which translates the...
Since the launch of HiggsHunters.org in November 2014, citizen science volunteers
have classified more than a million points of interest in images from the ATLAS experiment
at the LHC. Volunteers have been looking for displaced vertices and unusual features in images
recorded during LHC Run-1. We discuss the design of the project, its impact on the public,
and the surprising results of how...
Obtaining CPU cycles on an HPC cluster is nowadays relatively simple and sometimes even cheap for academic institutions. However, in most cases providers of HPC services do not allow changes to the configuration, the implementation of special features, or lower-level control of the computing infrastructure and networks, for example for testing new computing patterns or conducting...
The LHC will collide protons in the ATLAS detector with increasing luminosity through 2016, placing stringent operational and physics requirements on the ATLAS trigger system in order to reduce the 40 MHz collision rate to a manageable event storage rate of about 1 kHz, while not rejecting interesting physics events. The Level-1 trigger is the first rate-reducing step in the ATLAS trigger...
The instantaneous luminosity of the LHC is expected to increase at the HL-LHC so that the amount of pile-up can reach a level of 200 interactions per bunch crossing, almost a factor of 10 with respect to the luminosity reached at the end of Run 1. In addition, the experiments plan a 10-fold increase of the readout rate. This will be a challenge for the ATLAS and CMS experiments, in particular for the...
All four of the LHC experiments depend on web proxies (that is, squids) at each grid site in order to support software distribution by the CernVM FileSystem (CVMFS). CMS and ATLAS also use web proxies for conditions data distributed through the Frontier Distributed Database caching system. ATLAS and CMS each have their own methods for their grid jobs to find out which web proxy to use for...
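As an illustration of the kind of proxy-selection logic such a method needs (the proxy host names and list format below are assumptions, not the actual ATLAS or CMS mechanism), a grid job could try an ordered list of squids and fall back to a direct connection:

    import urllib.request

    # Hypothetical ordered proxy list for a site; "DIRECT" means no proxy.
    PROXIES = ["http://squid1.example.org:3128",
               "http://squid2.example.org:3128",
               "DIRECT"]

    def fetch_via_first_working_proxy(url):
        """Try each web proxy in order and fall back to a direct connection."""
        for proxy in PROXIES:
            handlers = [] if proxy == "DIRECT" else \
                       [urllib.request.ProxyHandler({"http": proxy})]
            opener = urllib.request.build_opener(*handlers)
            try:
                return opener.open(url, timeout=10).read()
            except OSError:
                continue  # proxy unreachable or timed out, try the next one
        raise RuntimeError("no proxy (or direct route) worked for " + url)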
XRootD is a distributed, scalable system for low-latency file access. It is the primary data access framework for the high-energy physics community. One of the latest developments in the project has been to incorporate metalink and segmented file transfer technologies.
We report on the implementation of metalink metadata format support within the XRootD client. This includes both the CLI and...
ALICE HLT Run2 performance overview
M. Krzewicki for the ALICE collaboration
The ALICE High Level Trigger (HLT) is an online reconstruction and data compression system used in the ALICE experiment at CERN. Unique among the LHC experiments, it extensively uses modern coprocessor technologies like general purpose graphic processing units (GPGPU) and field programmable gate arrays (FPGA) in the...
The ATLAS computing model was originally designed as static clouds (usually national or geographical groupings of sites) around the
Tier 1 centers, which confined tasks and most of the data traffic. Since those early days, the sites' network bandwidth has
increased by a factor of O(1000) and the difference in functionality between Tier 1s and Tier 2s has been reduced. After years of manual,
intermediate...
Over the past decade, high-performance, high-capacity open-source storage systems have been designed and implemented to accommodate the demanding needs of the LHC experiments. However, with the general move away from the concept of local computer centres supporting their associated communities, towards large infrastructures providing Cloud-like solutions to a large variety of different...
The Open Science Grid (OSG) is a large, robust computing grid that started primarily as a collection of sites associated with large HEP experiments such as ATLAS, CDF, CMS, and DZero, but has evolved in recent years to a much larger user and resource platform. In addition to meeting the US LHC community’s computational needs, the OSG continues to be one of the largest providers of distributed...
Radiotherapy is planned with the aim of delivering a lethal dose of radiation to a tumour, while keeping doses to nearby healthy organs at an acceptable level. Organ movements and shape changes, over a course of treatment typically lasting four to eight weeks, can result in actual doses differing from those planned. The UK-based VoxTox project aims to compute actual doses, at the level of...
ALICE HLT Cluster operation during ALICE Run 2
Johannes Lehrbach for the ALICE collaboration
ALICE (A Large Ion Collider Experiment) is one of the four major detectors located at the LHC at CERN, focusing on the study of heavy-ion collisions. The ALICE High Level Trigger (HLT) is a compute cluster which reconstructs the events and compresses the data in real-time. The data compression...
The use of up-to-date machine learning methods, including deep neural networks, running directly on raw data has significant potential in High Energy Physics for revealing patterns in detector signals and as a result improving reconstruction and the sensitivity of the final physics analyses. In this work, we describe a machine-learning analysis pipeline developed and operating at the National...
ZFS is a combination of file system, logical volume manager, and software RAID system developed by Sun Microsystems for the Solaris OS. ZFS simplifies the administration of disk storage, and on Solaris it has been well regarded for its high performance, reliability, and stability for many years. It is used successfully for enterprise storage administration around the globe, but so far on such...
The ALICE HLT uses a data transport framework based on the publisher-subscriber messaging principle, which transparently handles the communication between processing components over the network and between processing components on the same node via shared memory with a zero-copy approach.
We present an analysis of the performance in terms of maximum achievable data rates and event rates as well...
We present the new Invenio 3 digital library framework and demonstrate
its application in the field of open research data repositories. We
notably look at how the Invenio technology has been applied in two
research data services: (1) the CERN Open Data portal that provides
access to the approved open datasets and software of the ALICE, ATLAS,
CMS and LHCb collaborations; (2) the Zenodo...
LArSoft is an integrated, experiment-agnostic set of software tools for liquid argon (LAr) neutrino experiments
to perform simulation, reconstruction and analysis within the Fermilab art framework.
Along with common algorithms, the toolkit provides generic interfaces and extensibility
that accommodate the needs of detectors of very different size and configuration.
To date, LArSoft has been...
The observation of neutrino oscillation provides evidence of physics beyond the standard model, and the precise measurement of those oscillations remains an important goal for the field of particle physics. Using two finely segmented liquid scintillator detectors located 14 mrad off-axis from the NuMI muon-neutrino beam, NOvA is in a prime position to contribute to precision measurements of...
A framework for performing a simplified particle physics data analysis has been created. The project analyses a pre-selected sample from the full 2011 LHCb data. The analysis aims to measure matter antimatter asymmetries. It broadly follows the steps in a significant LHCb publication where large CP violation effects are observed in charged B meson three-body decays to charged pions and kaons....
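The headline observable in such a measurement is a raw yield asymmetry; a minimal sketch, with illustrative event counts rather than LHCb numbers, is:

    from math import sqrt

    def raw_asymmetry(n_minus, n_plus):
        """Raw charge asymmetry A = (N- - N+) / (N- + N+) together with its
        binomial statistical uncertainty sqrt((1 - A^2) / N)."""
        n_tot = n_minus + n_plus
        a = (n_minus - n_plus) / n_tot
        return a, sqrt((1.0 - a * a) / n_tot)

    print(raw_asymmetry(52400, 50100))   # made-up counts for illustration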
The LHCb software trigger underwent a paradigm shift before the start of Run-II. From being a system to select events for later offline reconstruction, it can now perform the event analysis in real-time, and subsequently decide which part of the event information is stored for later analysis.
The new strategy is only possible due to a major upgrade during the LHC long shutdown I (2012-2015)....
For a few years now, the artdaq data acquisition software toolkit has
provided numerous experiments with ready-to-use components which allow
for rapid development and deployment of DAQ systems. Developed within
the Fermilab Scientific Computing Division, artdaq provides data
transfer, event building, run control, and event analysis
functionality. This latter feature includes built-in...
In preparation for the XENON1T Dark Matter data acquisition, we have
prototyped and implemented a new computing model. The XENON signal and data processing
software is developed fully in Python 3, and makes extensive use of generic scientific data
analysis libraries, such as the SciPy stack. A certain tension between modern “Big Data”
solutions and existing HEP frameworks is typically...
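As a rough illustration of what SciPy-stack-style signal processing looks like in this context (a toy threshold-crossing pulse finder, not the actual XENON1T processor), assuming only NumPy:

    import numpy as np

    def find_pulses(waveform, baseline, threshold):
        """Return (start, stop) sample ranges where the baseline-subtracted
        waveform exceeds the threshold."""
        above = (waveform - baseline) > threshold
        edges = np.flatnonzero(np.diff(above.astype(int)))
        starts, stops = edges[::2] + 1, edges[1::2] + 1
        return list(zip(starts, stops))

    wf = np.zeros(100)                      # toy digitized waveform
    wf[40:45] = [5, 20, 35, 18, 6]          # one pulse
    print(find_pulses(wf, baseline=0.0, threshold=10.0))   # [(41, 44)]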
We previously described Lobster, a workflow management tool for exploiting volatile opportunistic computing resources for computation in HEP. We will discuss the various challenges that have been encountered while scaling up the simultaneous CPU core utilization and the software improvements required to overcome these challenges.
Categories: Workflows can now be divided into categories...
Brookhaven National Laboratory (BNL) anticipates significant growth in scientific programs with large computing and data storage needs in the near future and has recently re-organized support for scientific computing to meet these needs.
A key component is the enhanced role of the RHIC-ATLAS Computing Facility
(RACF) in support of high-throughput and high-performance computing (HTC and HPC) ...
We present the Web-Based Monitoring project of the CMS experiment at the LHC at CERN. With the growth in size and complexity of High Energy Physics experiments and the accompanying increase in the number of collaborators spread across the globe, the importance of broadly accessible monitoring has grown. The same can be said about the increasing relevance of operation and reporting web tools...
In the near future, many new experiments (JUNO, LHAASO, CEPC, etc.) with challenging data volumes are coming into operation or are planned at IHEP, China. The Jiangmen Underground Neutrino Observatory (JUNO) is a multipurpose neutrino experiment to be operational in 2019. The Large High Altitude Air Shower Observatory (LHAASO) is oriented to the study and observation of cosmic rays, which is...
The CMS experiment has collected an enormous volume of metadata about its computing operations in its monitoring systems, describing its experience in operating all of the CMS workflows on all of the Worldwide LHC Computing Grid Tiers. Data-mining efforts on all this information have rarely been undertaken, but they are of crucial importance for a better understanding of how CMS did successful...
The data acquisition system (DAQ) of the CMS experiment at the CERN Large Hadron Collider assembles events at a rate of 100 kHz, transporting event data at an aggregate throughput of 100 GByte/s to the high-level trigger (HLT) farm. The HLT farm selects and classifies interesting events for storage and offline analysis at a rate of around 1 kHz.
The DAQ system has been redesigned during the...
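A back-of-the-envelope reading of the figures quoted above (all numbers taken from the abstract itself):

    event_rate_in = 100e3     # events/s assembled by the event builder
    throughput    = 100e9     # bytes/s aggregate into the HLT farm
    hlt_output    = 1e3       # events/s accepted for storage

    avg_event_size = throughput / event_rate_in   # ~1e6 bytes, i.e. ~1 MB/event
    hlt_rejection  = event_rate_in / hlt_output   # ~100x reduction in the HLT
    print(avg_event_size, hlt_rejection)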
ROOT is one of the core software tools for physicists. For more than a decade it has held a central position in physicists' analysis code and the experiments' frameworks, thanks in part to its stability and simplicity of use. This has allowed software development for analysis and frameworks to use ROOT as a "common language" for HEP, across virtually all experiments.
Software development in...
In the present run of the LHC, CMS data reconstruction and simulation algorithms benefit greatly from being executed as multiple threads running on several processor cores. The complexity of the Run-2 events requires parallelization of the code in order to reduce the memory-per-core footprint constraining serial-execution programs, thus optimizing the exploitation of present multi-core...
The Belle II experiment at KEK is preparing for first collisions in 2017. Processing the large amounts of data that will be produced will require conditions data to be readily available to systems worldwide in a fast and efficient manner that is straightforward for both the user and maintainer.
The Belle II conditions database was designed with a straightforward goal: make it as easily...
This paper introduces the evolution of the monitoring system of the Alpha Magnetic Spectrometer (AMS) Science Operation Center (SOC) at CERN.
The AMS SOC monitoring system includes several independent tools: Network Monitor to poll the health metrics of the AMS local computing farm, Production Monitor to show the production status, Frame Monitor to record the flight data arrival status, and...
The INFN CNAF Tier-1 computing center is composed of two main rooms containing IT resources and four additional locations that host the technology infrastructure providing electrical power and cooling to the facility. The power supply and its continuity are ensured by a dedicated room with three 15,000 V to 400 V transformers in a separate part of the principal building...
ROOT version 6 comes with cling, a C++-compliant interpreter. Cling needs to know everything about the code in libraries to be able to interact with them.
This translates into increased memory usage with respect to previous versions of
ROOT.
During the runtime automatic library loading process, ROOT 6 re-parses a
set of header files which describe the library and enters "recursive"...
Support for Online Calibration in the ALICE HLT Framework
Mikolaj Krzewicki, for the ALICE collaboration
ALICE (A Large Ion Collider Experiment) is one of the four major experiments at the Large Hadron Collider (LHC) at CERN. The High Level Trigger (HLT) is an online compute farm, which reconstructs events measured by the ALICE detector in real-time. The HLT uses a custom online...
Since 2014, the RAL Tier 1 has been working on deploying a Ceph backed object store. The aim is to replace Castor for disk storage. This new service must be scalable to meet the data demands of the LHC to 2020 and beyond. As well as offering access protocols the LHC experiments currently use, it must also provide industry standard access protocols. In order to keep costs down the service...
Dependability, resilience, adaptability, and efficiency: growing requirements call for tailored storage services and novel solutions. Unprecedented volumes of data coming from the detectors need to be quickly available in a highly scalable way for large-scale processing and data distribution, while in parallel they are routed to tape for long-term archival. These activities are critical for the...
The need for processing the ever-increasing amount of data generated by the LHC experiments in a more efficient way has motivated ROOT to further develop its support for parallelism. This support is being developed for both shared-memory and distributed-memory environments.
The incarnations of this parallelism are multi-threading, multi-processing and cluster-wide execution. In...
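As a generic illustration of the multi-process flavour of parallelism mentioned above (this is not ROOT's actual interface; the event-range splitting and merge step are schematic):

    from multiprocessing import Pool

    def process_range(event_range):
        """Stand-in for per-range event processing (e.g. filling a histogram)."""
        start, stop = event_range
        return sum(e % 7 for e in range(start, stop))   # dummy partial result

    def split(n_events, n_workers):
        """Divide [0, n_events) into contiguous ranges, one per worker."""
        step = (n_events + n_workers - 1) // n_workers
        return [(i, min(i + step, n_events)) for i in range(0, n_events, step)]

    if __name__ == "__main__":
        with Pool(4) as pool:
            partials = pool.map(process_range, split(1000000, 4))
        print(sum(partials))   # merge step, analogous to combining histograms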
Since 2014 the ATLAS and CMS experiments have shared a common vision for the Conditions Database infrastructure required to handle the non-event data for the forthcoming LHC runs. The large commonality in the use cases has allowed agreement on a common overall design meeting the requirements of both experiments. A first prototype implementing this design was completed in 2015 and was...
The GridPP project in the UK has a long-standing policy of supporting non-LHC VOs with 10% of the provided resources. Until recently this had only been taken up by a very limited set of VOs, mainly due to a combination of the (perceived) large overhead of getting started, the limited computing support within non-LHC VOs and the ability to fulfil their computing requirements on local batch...
LHCb has introduced a novel real-time detector alignment and calibration strategy for LHC Run 2. Data collected at the start of the fill are processed in a few minutes and used to update the alignment parameters, while the calibration constants are evaluated for each run. This procedure improves the quality of the online reconstruction. For example, the vertex locator is retracted and...
The CERN Control and Monitoring Platform (C2MON) is a modular, clusterable framework designed to meet a wide range of monitoring, control, acquisition, scalability and availability requirements. It is based on modern Java technologies and has support for several industry-standard communication protocols. C2MON has been reliably utilised for several years as the basis of multiple monitoring...
This work will present the status of Ceph-related operations and development within the CERN IT Storage Group: we summarise significant production experience at the petabyte scale as well as strategic developments to integrate with our core storage services. As our primary back-end for OpenStack Cinder and Glance, Ceph has provided reliable storage to thousands of VMs for more than 3 years;...
Conditions data (for example: alignment, calibration, data quality) are used extensively in the processing of real and simulated data in ATLAS. The volume and variety of the conditions data needed by different types of processing are quite diverse, so optimizing its access requires a careful understanding of conditions usage patterns. These patterns can be quantified by mining representative...
The exploitation of the full physics potential of the LHC experiments requires fast and efficient processing of the largest possible dataset with the most refined understanding of the detector conditions. To face this challenge, the CMS collaboration has setup an infrastructure for the continuous unattended computation of the alignment and calibration constants, allowing for a refined...
Notebooks represent an exciting new approach that will considerably facilitate collaborative physics analysis.
They are a modern and widely-adopted tool to express computational narratives comprising, among other elements, rich text, code and data visualisations. Several notebook flavours exist, although one of them has been particularly successful: the Jupyter open source project.
In this...
CRAB3 is a workload management tool used by more than 500 CMS physicists every month to analyze data acquired by the Compact Muon Solenoid (CMS) detector at the CERN Large Hadron Collider (LHC). CRAB3 allows users to analyze a large collection of input files (datasets), splitting the input into multiple Grid jobs depending on parameters provided by users.
The process of manually specifying...
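A simplified sketch of file-based splitting driven by a user parameter (function and parameter names are illustrative, not the CRAB3 API):

    def split_dataset(files, files_per_job):
        """Group the input dataset's files into grid jobs,
        files_per_job files at a time."""
        return [files[i:i + files_per_job]
                for i in range(0, len(files), files_per_job)]

    dataset = ["file_%04d.root" % i for i in range(10)]   # placeholder names
    for n, job_files in enumerate(split_dataset(dataset, files_per_job=3)):
        print("job", n, "->", job_files)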
The ATLAS EventIndex System has amassed a set of key quantities for a large number of ATLAS events into a Hadoop based infrastructure for the purpose of providing the experiment with a number of event-wise services. Collecting this data in one place provides the opportunity to investigate various storage formats and technologies and assess which best serve the various use cases as well as...
CEPH is a cutting edge, open source, self-healing distributed data storage technology which is exciting both the enterprise and academic worlds. CEPH delivers an object storage layer (RADOS), block storage layer, and file system storage in a single unified system. CEPH object and block storage implementations are widely used in a broad spectrum of enterprise contexts, from dynamic provision of...
The WLCG Tier-1 center GridKa is developed and operated by the Steinbuch Centre for Computing (SCC)
at the Karlsruhe Institute of Technology (KIT). It was the origin of further Big Data research activities and
infrastructures at SCC, e.g. the Large Scale Data Facility (LSDF), providing petabyte scale data storage
for various non-HEP research communities.
Several ideas and plans...
ROOT provides advanced statistical methods needed by the LHC experiments to analyze their data. These include machine learning tools for classification, regression and clustering. TMVA, a toolkit for multi-variate analysis in ROOT, provides these machine learning methods.
We will present new developments in TMVA, including parallelisation, deep-learning neural networks, new features and...
AsyncStageOut (ASO) is the component of the CMS distributed data analysis system (CRAB3) that manages users’ transfers in a centrally controlled way using the File Transfer System (FTS3) at CERN. It addresses a major weakness of the previous, decentralized model, namely that the transfer of the user's output data to a single remote site was part of the job execution, resulting in inefficient...
ROOT provides an extremely flexible format used throughout the HEP community. The number of use cases – from an archival data format to end-stage analysis – has required a number of tradeoffs to be exposed to the user. For example, a high “compression level” in the traditional DEFLATE algorithm will result in a smaller file (saving disk space) at the cost of slower decompression (costing CPU...
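The size-versus-CPU trade-off can be demonstrated with plain DEFLATE outside of ROOT (a minimal sketch; the payload is synthetic and real ROOT baskets will behave differently):

    import time
    import zlib

    payload = b"event-like payload " * 50000   # stand-in for basket data

    for level in (1, 6, 9):                    # DEFLATE compression levels
        t0 = time.time()
        blob = zlib.compress(payload, level)
        print("level", level,
              "ratio %.2f" % (len(payload) / len(blob)),
              "time %.3fs" % (time.time() - t0))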
The ATLAS High Level Trigger Farm consists of around 30,000 CPU cores which filter events at up to 100 kHz input rate.
A costing framework is built into the high-level trigger; this enables detailed monitoring of the system and allows data-driven predictions to be made
using specialist datasets. This talk will present an overview of how ATLAS collects in-situ monitoring data on both...
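The numbers above imply a rough average CPU budget per event (a simple estimate, ignoring farm inefficiencies and rate variations):

    n_cores    = 30000      # HLT farm CPU cores
    input_rate = 100e3      # maximum events per second entering the HLT

    budget_s = n_cores / input_rate    # ~0.3 s of CPU per event on average
    print("average HLT budget per event: %.0f ms" % (budget_s * 1e3))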
We will report on the first year of the OSiRIS project (NSF Award #1541335, UM, IU, MSU and WSU) which is targeting the creation of a distributed Ceph storage infrastructure coupled together with software-defined networking to provide high-performance access for well-connected locations on any participating campus. The project’s goal is to provide a single scalable, distributed storage...
We present rootJS, an interface making it possible to seamlessly integrate ROOT 6 into applications written for Node.js, the JavaScript runtime platform increasingly commonly used to create high-performance Web applications. ROOT features can be called both directly from Node.js code and by JIT-compiling C++ macros. All rootJS methods are invoked asynchronously and support callback functions,...
At the RAL Tier-1 we have been deploying production services on both bare metal and a variety of virtualisation platforms for many years. Despite the significant simplification of configuration and deployment of services due to the use of a configuration management system, maintaining services still requires a lot of effort. Also, the current approach of running services on static machines...
HEP applications perform an excessive number of allocations/deallocations within short time intervals, which results in memory churn, poor locality and performance degradation. These issues have been known for a decade, but due to the complexity of software frameworks and the large number of allocations (of the order of billions for a single job), until recently no efficient...
The Geant4 Collaboration released a new generation of the Geant4 simulation toolkit (version 10) in December 2013 and reported its new features at CHEP 2015. Since then, the Collaboration continues to improve its physics and computing performance and usability. This presentation will survey the major improvements made since version 10.0. On the physics side, it includes fully revised multiple...
We present a system deployed in the summer of 2015 for the automatic assignment of production and reprocessing workflows for simulation and detector data in the frame of the Computing Operation of the CMS experiment at the CERN LHC. Processing requests involves a number of steps in the daily operation, including transferring input datasets where relevant and monitoring them, assigning work to...
The ATLAS experiment at CERN is planning a second phase of upgrades to prepare for the "High Luminosity LHC", a 4th major run due to start in 2026. In order to deliver an order of magnitude more data than previous runs, 14 TeV protons will collide with an instantaneous luminosity of 7.5 × 10^34 cm^-2 s^-1, resulting in much higher pileup and data rates than the current experiment was designed to...
The recent progress in parallel hardware architectures with deeper
vector pipelines or many-cores technologies brings opportunities for
HEP experiments to take advantage of SIMD and SIMT computing models.
Launched in 2013, the GeantV project studies performance gains in
propagating multiple particles in parallel, improving instruction
throughput and data locality in HEP event simulation....
A status of recent developments of the DELPHES C++ fast detector simulation framework will be given. New detector cards for the LHCb detector and prototypes for future e+ e- (ILC, FCC-ee) and p-p colliders at 100 TeV (FCC-hh) have been designed. The particle-flow algorithm has been optimised for high multiplicity environments such as high luminosity and boosted regimes. In addition, several...
The HEP prototypical systems at the Supercomputing conferences each year have served to illustrate the ongoing state of the art developments in high throughput, software-defined networked systems important for future data operations at the LHC and for other data intensive programs. The Supercomputing 2015 SDN demonstration revolved around an OpenFlow ring connecting 7 different booths and the...
The CMS Global Pool, based on HTCondor and glideinWMS, is the main computing resource provisioning system for all CMS workflows, including analysis, Monte Carlo production, and detector data reprocessing activities. Total resources at Tier-1 and Tier-2 sites pledged to CMS exceed 100,000 CPU cores, and another 50,000-100,000 CPU cores are available opportunistically, pushing the needs of the...
After the Phase-I upgrade and onward, the Front-End Link eXchange (FELIX) system will be the interface between the data handling system and the detector front-end electronics and trigger electronics at the ATLAS experiment. FELIX will function as a router between custom serial links and a commodity switch network which will use standard technologies (Ethernet or Infiniband) to communicate with...
As the ATLAS Experiment prepares to move to a multi-threaded framework
(AthenaMT) for Run3, we are faced with the problem of how to migrate 4
million lines of C++ source code. This code has been written over the
past 15 years and has often been adapted, re-written or extended to meet
the changing requirements and circumstances of LHC data taking. The
code was developed by different authors, many of...
In today's world of distributed scientific collaborations, there are many challenges to providing reliable inter-domain network infrastructure. Network operators use a combination of
active monitoring and trouble tickets to detect problems, but these are often ineffective at identifying issues that impact wide-area network users. Additionally, these approaches do not scale to wide area...
The LHCb detector will be upgraded for the LHC Run 3 and will be readout at 40 MHz, with major implications on the software-only trigger and offline computing. If the current computing model is kept, the data storage capacity and computing power required to process data at this rate, and to generate and reconstruct equivalent samples of simulated events, will exceed the current capacity by a...
The FabrIc for Frontier Experiments (FIFE) project is a major initiative within the Fermilab Scientific Computing Division charged with leading the computing model for Fermilab experiments. Work within the FIFE project creates close collaboration between experimenters and computing professionals to serve high-energy physics experiments of differing size, scope, and physics area. The FIFE...
Some data analysis methods typically used in econometric studies and in ecology have been evaluated and applied in physics software environments. They concern the evolution of observables through objective identification of change points and trends, and measurements of inequality, diversity and evenness across a data set. Within each one of these analysis areas, several statistical tests and...
The Open Science Grid (OSG) relies upon the network as a critical part of the distributed infrastructures it enables. In 2012 OSG added a new focus area in networking with a goal of becoming the primary source of network information for its members and collaborators. This includes gathering, organizing and providing network metrics to guarantee effective network usage and prompt detection and...
The SciDAC-Data project is a DOE funded initiative to analyze and exploit two decades of information and analytics that have been collected, by the Fermilab Data Center, on the organization, movement, and consumption of High Energy Physics data. The project is designed to analyze the analysis patterns and data organization that have been used by the CDF, DØ, NO𝜈A, Minos, Minerva and other...
The fraction of internet traffic carried over IPv6 continues to grow rapidly. IPv6 support from network hardware vendors and carriers is pervasive and becoming mature. A network infrastructure upgrade often offers sites an excellent window of opportunity to configure and enable IPv6.
There is a significant overhead when setting up and maintaining dual stack machines, so where possible...
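A minimal sketch of the client-side behaviour a dual-stack node should provide, preferring IPv6 and falling back to IPv4 (generic Python sockets, not any experiment-specific code):

    import socket

    def connect_dual_stack(host, port, timeout=5.0):
        """Resolve both AAAA and A records, try IPv6 first, fall back to IPv4."""
        infos = socket.getaddrinfo(host, port,
                                   socket.AF_UNSPEC, socket.SOCK_STREAM)
        infos.sort(key=lambda info: 0 if info[0] == socket.AF_INET6 else 1)
        last_error = None
        for family, socktype, proto, _, sockaddr in infos:
            try:
                sock = socket.socket(family, socktype, proto)
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return sock
            except OSError as err:
                last_error = err
        raise last_error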
The ALICE Collaboration and the ALICE O2 project have carried out detailed studies for a new online computing facility planned to be deployed for Run 3 of the Large Hadron Collider (LHC) at CERN. Some of the main aspects of the data handling concept are partial reconstruction of raw data organized in so-called time frames and, based on that information, reduction of the data rate without...
High Energy Physics experiments have long had to deal with huge amounts of data. Other fields of study are now being faced with comparable volumes of experimental data and have similar requirements to organize access by a distributed community of researchers. Fermilab is partnering with the Simons Foundation Autism Research Initiative (SFARI) to adapt Fermilab’s custom HEP data management...
The second generation of the ATLAS production system called ProdSys2 is a
distributed workload manager that runs daily hundreds of thousands of jobs,
from dozens of different ATLAS-specific workflows, across more than
a hundred heterogeneous sites. It achieves high utilization by combining
dynamic job definition based on many criteria, such as input and output
size, memory requirements and...
The Compressed Baryonic Matter experiment (CBM) is a next-generation heavy-ion experiment to be operated at the FAIR facility, currently under construction in Darmstadt, Germany. A key feature of CBM is its very high interaction rate, exceeding those of contemporary nuclear collision experiments by several orders of magnitude. Such interaction rates forbid a conventional, hardware-triggered...
The Electron-Ion Collider (EIC) is envisioned as the
next-generation U.S. facility to study quarks and gluons in
strongly interacting matter. Developing the physics program for
the EIC, and designing the detectors needed to realize it,
requires a plethora of software tools and multifaceted analysis
efforts. Many of these tools have yet to be developed or need to
...
The LHCb experiment will undergo a major upgrade during the second long shutdown (2018 - 2019). The upgrade will concern both the detector and the Data Acquisition (DAQ) system, which will be rebuilt in order to optimally exploit the foreseen higher event rate. The Event Builder (EB) is the key component of the DAQ system, gathering data from the sub-detectors and building up the whole event. The EB...
The goal of this comparison is to summarize state-of-the-art deep-learning techniques, which are boosted by modern GPUs. Deep learning, also known as deep structured learning or hierarchical learning, is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using multiple processing layers composed of multiple...
We report the current status of the CMS full simulation. For Run II, CMS is using Geant4 10.0p02 built in sequential mode; about 8 billion events were produced in 2015. In 2016 any additional production will be done using the same production version. For development, Geant4 10.0p03 with CMS private patches, built in multi-threaded mode, has been established. We plan to use the newest Geant4 10.2 for 2017...
We present an implementation of the ATLAS High Level Trigger that provides parallel execution of trigger algorithms within the ATLAS multithreaded software framework, AthenaMT. This development will enable the ATLAS High Level Trigger to meet future challenges due to the evolution of computing hardware and upgrades of the Large Hadron Collider, LHC, and ATLAS Detector. During the LHC...
The LHC is the world's most powerful particle accelerator, colliding protons at a centre-of-mass energy of 13 TeV. As the
energy and frequency of collisions have grown in the search for new physics, so too has the demand for computing resources needed for
event reconstruction. We will report on the evolution of resource usage in terms of CPU and RAM in key ATLAS offline
reconstruction workflows at...
With the increased load and pressure on required computing power brought by the higher luminosity in LHC during Run2, there is a need to utilize opportunistic resources not currently dedicated to the Compact Muon Solenoid (CMS) collaboration. Furthermore, these additional resources might be needed on demand. The Caltech group together with the Argonne Leadership Computing Facility (ALCF) are...
RapidIO (http://rapidio.org/) technology is a packet-switched high-performance fabric, which has been under active development since 1997. Originally meant to be a front side bus, it developed into a system level interconnect which is today used in all 4G/LTE base stations world wide. RapidIO is often used in embedded systems that require high reliability, low latency and scalability in a...
The ATLAS EventIndex has been running in production since mid-2015,
reliably collecting information worldwide about all produced events and storing
it in a central Hadoop infrastructure at CERN. A subset of this information
is copied to an Oracle relational database for fast access.
The system design and its optimization serve event picking, from requests of
a few events up to scales of...
HPC network technologies like Infiniband, TrueScale or OmniPath provide low-
latency and high-throughput communication between hosts, which makes them
attractive options for data-acquisition systems in large-scale high-energy
physics experiments. Like HPC networks, data acquisition networks are local
and include a well specified number of systems. Unfortunately traditional...
The LHCb experiment stores around 10^11 collision events per year. A typical physics analysis deals with a final sample of up to 10^7 events. Event preselection algorithms (lines) are used for data reduction. They are run centrally and check whether an event is useful for a particular physics analysis. The lines are grouped into streams. An event is copied to all the streams to which its accepting lines belong,...
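A toy sketch of the routing rule described above, i.e. copying an event to every stream containing at least one line that accepted it (line and stream names are invented):

    # Hypothetical grouping of preselection lines into streams
    STREAMS = {
        "Charm":  {"D2KKPiLine", "D2PiPiPiLine"},
        "Dimuon": {"JpsiMuMuLine"},
    }

    def streams_for_event(fired_lines):
        """Return every stream with at least one line that accepted the event."""
        return [stream for stream, lines in STREAMS.items()
                if lines & fired_lines]

    print(streams_for_event({"D2KKPiLine", "JpsiMuMuLine"}))  # both streams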
The ATLAS Simulation infrastructure has been used to produce upwards of 50 billion proton-proton collision events for analyses
ranging from detailed Standard Model measurements to searches for exotic new phenomena. In the last several years, the
infrastructure has been heavily revised to allow intuitive multithreading and significantly improved maintainability. Such a
massive update of a...
In the midst of the multi- and many-core era, the computing models employed by
HEP experiments are evolving to embrace the trends of new hardware technologies.
As the computing needs of present and future HEP experiments -particularly those
at the Large Hadron Collider- grow, adoption of many-core architectures and
highly-parallel programming models is essential to prevent degradation...
The ATLAS experiment at the high-luminosity LHC will face a five-fold
increase in the number of interactions per collision relative to the ongoing
Run 2. This will require a proportional improvement in rejection power at
the earliest levels of the detector trigger system, while preserving good signal efficiency.
One critical aspect of this improvement will be the implementation of
precise...
Changes in the trigger menu, the online algorithmic event selection of the ATLAS experiment at the LHC, made in response to luminosity and detector changes, are followed by adjustments in the monitoring system. This is done to ensure that the collected data are useful and can be properly reconstructed at Tier-0, the first level of the computing grid. During Run 1, ATLAS deployed monitoring updates...
PanDA, the Production and Distributed Analysis workload management system, has been developed to address the data processing and analysis challenges of the ATLAS experiment at the LHC. Recently PanDA has been extended to run HEP scientific applications on Leadership Class Facilities and supercomputers. The success of the projects using PanDA beyond HEP and the Grid has drawn attention from other compute-intensive...
The Canadian Advanced Network For Astronomical Research (CANFAR)
is a digital infrastructure that has been operational for the last
six years.
The platform allows astronomers to store, collaborate, distribute and
analyze large astronomical datasets. We have implemented multi-site storage and
in collaboration with an HEP group at University of Victoria, multi-cloud processing.
CANFAR is deeply...
In recent years there has been increasing use of HPC facilities for HEP experiments. This has initially focussed on less I/O intensive workloads such as generator-level or detector simulation. We now demonstrate the efficient running of I/O-heavy ‘analysis’ workloads for the ATLAS and ALICE collaborations on HPC facilities at NERSC, as well as astronomical image analysis for DESI.
To do...
Software for the next generation of experiments at the Future Circular Collider (FCC) should by design efficiently exploit the available computing resources; support for parallel execution is therefore a particular requirement. The simulation package of the FCC Common Software Framework (FCCSW) makes use of the Gaudi parallel data processing framework and external packages commonly used in...
MonALISA, which stands for Monitoring Agents using a Large Integrated Services Architecture, has been developed over the last fourteen years by Caltech and its partners with the support of the CMS software and computing program. The framework is based on Dynamic Distributed Service Architecture and is able to provide complete monitoring, control and global optimization services for complex...
The High Luminosity LHC (HL-LHC) will deliver luminosities of up to 5x10^34 cm^-2 s^-1, with an average of about 140-200 overlapping proton-proton collisions per bunch crossing. These extreme pileup conditions can significantly degrade the ability of trigger systems to cope with the resulting event rates. A key component of the HL-LHC upgrade of the CMS experiment is a Level-1 (L1) track...
Southeast University Science Operation Center (SEUSOC) is one of the computing centers of the Alpha Magnetic Spectrometer (AMS-02) experiment. It provides 2000 CPU cores for AMS scientific computing and a dedicated 1Gbps Long Fat Network (LFN) for AMS data transmission between SEU and CERN. In this paper, the workflows of SEUSOC Monte Carlo (MC) production are discussed in...
For some physics processes studied with the ATLAS detector, a more
accurate simulation in some respects can be achieved by including real
data in simulated events, with substantial potential improvements in the CPU,
disk space, and memory usage of the standard simulation configuration,
at the cost of significant database and networking challenges.
Real proton-proton background events can be...
Exascale computing resources are roughly a decade away and will be capable of 100 times more computing than current supercomputers. In the last year, Energy Frontier experiments crossed a milestone of 100 million core-hours used at the Argonne Leadership Computing Facility, Oak Ridge Leadership Computing Facility, and NERSC. The Fortran-based leading-order parton generator called Alpgen was...
The ATLAS Distributed Data Management (DDM) system has evolved drastically in the last two years with the Rucio software fully
replacing the previous system before the start of LHC Run-2. The ATLAS DDM system now manages more than 200 petabytes spread over 130
storage sites and can handle file transfer rates of up to 30 Hz. In this talk, we discuss our experience acquired in...
Physics analysis at the Compact Muon Solenoid (CMS) requires both a vast production of simulated events and an extensive processing of the data collected by the experiment.
Since the end of LHC Run I in 2012, CMS has produced over 20 billion simulated events, from 75 thousand processing requests organised in one hundred different campaigns which emulate different configurations of...
Micropattern gaseous detector (MPGD) technologies, such as GEMs or MicroMegas, are particularly suitable for precision tracking and triggering in high rate environments. Given their relatively low production costs, MPGDs are an exemplary candidate for the next generation of particle detectors. Having acknowledged these advantages, both the ATLAS and CMS collaborations at the LHC are exploiting...
The long-standing problem of reconciling the cosmological evidence for the existence of dark matter with the lack of any clear experimental observation of it has recently revived the idea that the new particles are not directly connected with the Standard Model gauge fields, but only through mediator fields or ''portals'' connecting our world with new ''secluded'' or ''hidden'' sectors. One...
With processor architecture evolution, the HPC market has undergone a paradigm shift. The adoption of low-cost, Linux-based clusters extended HPC’s reach from its roots in modeling and simulation of complex physical systems to a broad range of industries, from biotechnology, cloud computing, computer analytics and big data challenges to manufacturing sectors. In this perspective, the near...
The ATLAS Event Service (ES) has been designed and implemented for efficient
running of ATLAS production workflows on a variety of computing platforms, ranging
from conventional Grid sites to opportunistic, often short-lived resources, such
as spot market commercial clouds, supercomputers and volunteer computing.
The Event Service architecture allows real time delivery of fine grained...
Distributed data processing in High Energy and Nuclear Physics (HENP) is a prominent example of big data analysis. Having petabytes of data being processed at tens of computational sites with thousands of CPUs, standard job scheduling approaches either do not address well the problem complexity or are dedicated to one specific aspect of the problem only (CPU, network or storage). As a result, ...
Over the past two years, the operations at INFN-CNAF have undergone significant changes.
The adoption of configuration management tools such as Puppet, together with the constant increase of dynamic and cloud infrastructures, has led us to investigate a new monitoring approach.
Our aim is the centralization of the monitoring service at CNAF through a scalable and highly configurable monitoring...
Many accelerator-based experiments are actively running at the High Energy Accelerator Research Organization (KEK) in Japan, using the SuperKEKB and J-PARC accelerators. The computing demand at KEK from the various experiments for data processing, analysis and MC simulation is steadily increasing. This is not only the case for high-energy...
This contribution reports on the remote evaluation of pre-production Intel Omni-Path (OPA) interconnect hardware and software performed by the RHIC & ATLAS Computing Facility (RACF) at BNL in the December 2015 - February 2016 period, using a 32-node "Diamond" cluster with a single Omni-Path Host Fabric Interface (HFI) installed in each node and a single 48-port Omni-Path switch with the non-blocking...
IceProd is a data processing and management framework developed by the IceCube Neutrino Observatory for processing of Monte Carlo simulations, detector data, and analysis levels. It runs as a separate layer on top of grid and batch systems. This is accomplished by a set of daemons which process job workflow, maintaining configuration and status information on the job before, during, and after...
The low flux of the ultra-high energy cosmic rays (UHECR) at the highest energies provides a challenge to answer the long standing question about their origin and nature. Even lower fluxes of neutrinos with energies above 10^22 eV are predicted in certain Grand-Unifying-Theories (GUTs) and e.g. models for super-heavy dark matter (SHDM). The significant increase in detector volume required to...
Many physics and performance studies with the ATLAS detector at the Large Hadron Collider require very large samples of simulated events, and producing these using the full GEANT4 detector simulation is highly CPU intensive.
Often, a very detailed detector simulation is not needed, and in these cases fast simulation tools can be used
to reduce the calorimeter simulation time by a few orders...
The ALICE experiment at CERN was designed to study the properties of the strongly-interacting hot and dense matter created in heavy-ion collisions at the LHC energies. The computing model of the experiment currently relies on the hierarchical Tier-based structure, with a top-level Grid site at CERN (Tier-0, also extended to Wigner) and several globally distributed datacenters at national and...
In the ideal limit of infinite resources, multi-tenant applications are able to scale in/out on a Cloud driven only by their functional requirements. A large Public Cloud may be a reasonable approximation of this condition, where tenants are normally charged a posteriori for their resource consumption. On the other hand, small scientific computing centres usually work in a saturated regime...
The Computing Center of the Institute of Physics (CC IoP) of the Czech Academy of Sciences serves a broad spectrum of users with various computing needs. It runs WLCG Tier-2 center for the ALICE and the ATLAS experiments; the same group of services is used by astroparticle physics projects the Pierre Auger Observatory (PAO) and the Cherenkov Telescope Array (CTA). OSG stack is installed for...
The complex geometry of the whole detector of the ATLAS experiment at LHC is currently stored only in custom online databases, from which it is built on-the-fly on request. Accessing the online geometry guarantees accessing the latest version of the detector description, but requires the setup of the full ATLAS software framework "Athena", which provides the online services and the tools to...
The INFN Section of Turin hosts a middle-size multi-tenant cloud infrastructure optimized for scientific computing.
A new approach exploiting the features of VMDIRAC and aiming to allow for dynamic automatic instantiation and destruction of Virtual Machines from different tenants, in order to maximize the global computing efficiency of the infrastructure, has been designed, implemented and...
The ATLAS software infrastructure facilitates efforts of more than 1000
developers working on the code base of 2200 packages with 4 million C++
and 1.4 million Python lines. The ATLAS offline code management system is
a powerful, flexible framework for processing new package version
requests, probing code changes in the Nightly Build System, migration to
new platforms and compilers,...
ATLAS is a high energy physics experiment at the Large Hadron Collider
located at CERN.
During the so-called Long Shutdown 2 period scheduled for late 2018,
ATLAS will undergo several modifications and upgrades of its data
acquisition system in order to cope with the higher luminosity
requirements. As part of these activities, a new read-out chain will be built
for the New Small Wheel muon...
Distributed computing infrastructures require automatic tools to strengthen, monitor and analyze the security behavior of computing devices. These tools should inspect monitoring data such as resource usage, log entries, traces and even processes' system calls. They should also detect anomalies that could indicate the presence of a cyber-attack. Moreover, they should react to attacks without...
The engineering design of a particle detector is usually performed in a
Computer Aided Design (CAD) program, and simulation of the detector's performance
can be done with a Geant4-based program. However, transferring the detector
design from the CAD program to Geant4 can be laborious and error-prone.
SW2GDML is a tool that reads a design in the popular SolidWorks CAD
program and...
The Compact Muon Solenoid (CMS) experiment makes a vast use of alignment and calibration measurements in several data processing workflows: in the High Level Trigger, in the processing of the recorded collisions and in the production of simulated events for data analysis and studies of detector upgrades. A complete alignment and calibration scenario is factored in approximately three-hundred...
The Trigger and Data Acquisition system of the ATLAS detector at the Large Hadron
Collider at CERN is composed of a large number of distributed hardware and software
components (about 3000 machines and more than 25000 applications) which, in a coordinated
manner, provide the data-taking functionality of the overall system.
During data taking runs, a huge flow of operational data is produced...
Volunteer computing has the potential to provide significant additional computing capacity for the LHC experiments.
One of the challenges with exploiting volunteer computing is to support a global community of volunteers that provides heterogeneous resources.
However, HEP applications require more data input and output than the CPU intensive applications that are typically used by other...
As demand for widely accessible storage capacity increases and usage is on the rise, steady I/O performance is desired but tends to suffer within multi-user environments. Typical deployments use standard hard drives, as the cost per GB is quite low. On the other hand, HDD-based storage solutions are not known to scale well with process concurrency, and soon enough a high rate of IOPS creates a...
GooFit, a GPU-friendly framework for doing maximum-likelihood fits, has been extended in functionality to do a full amplitude analysis of scalar mesons decaying into four final states via various combinations of intermediate resonances. Recurring resonances in different amplitudes are recognized and only calculated once, to save memory and execution time. As an example, this tool can be used...
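The caching of recurring resonances can be illustrated with simple memoization (a sketch of the idea only; GooFit's internal bookkeeping and lineshapes are more involved):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def resonance_lineshape(name, m2):
        """Placeholder for an expensive lineshape evaluation; cached so a
        resonance shared by several amplitudes is computed only once per
        phase-space point."""
        print("computing", name, "at", m2)    # shows when work really happens
        return complex(1.0 / (m2 - 0.775 ** 2), 0.1)

    a1 = resonance_lineshape("rho770", 0.60)  # computed
    a2 = resonance_lineshape("rho770", 0.60)  # cache hit, no recomputation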
The AMS data production uses different programming modules for job submission, execution and management, as well as for validation of produced data. The modules communicate with each other using a CORBA interface. The main module is the AMS production server, a scalable distributed service which links all modules together, starting from the job submission request and ending with writing data to disk...
Efficient administration of computing centres requires advanced tools for the monitoring and front-end interface of their infrastructure. The large-scale distributed grid systems, like the Worldwide LHC Computing Grid (WLCG) and ATLAS computing, offer many existing web pages and information sources indicating the status of the services, systems, requests and user jobs at grid sites. These...
The pilot model employed by the ATLAS production system has been in use for many years. The model has proven to be a success, with many
advantages over push models. However one of the negative side-effects of using a pilot model is the presence of 'empty pilots' running
on sites, consuming a small amount of walltime and not running a useful payload job. The impact on a site can be significant,...
A new analysis category based on g4tools was added in Geant4 release 9.5 with the aim of providing users with a lightweight analysis tool available as part of the Geant4 installation without the need to link to an external analysis package. It has progressively replaced the usage of external tools based on AIDA (Abstract Interfaces for Data Analysis) in all Geant4 examples. Frequent questions...
Simulation of particle-matter interactions in complex geometries is one of
the main tasks in high energy physics (HEP) research. Geant4 is the most
commonly used tool to accomplish it.
An essential aspect of the task is an accurate and efficient handling
of particle transport and crossing volume boundaries within a
predefined (3D) geometry.
At the core of the Geant4 simulation toolkit,...
The distributed computing system at the Institute of High Energy Physics (IHEP), China, is based on DIRAC middleware. It integrates about 2000 CPU cores and 500 TB of storage contributed by 16 distributed sites. These sites are of various types, such as cluster, grid, cloud and volunteer computing. This system went into production in 2012. Now it supports multiple VOs and serves three HEP...
Previous research has shown that it is relatively easy to apply a simple shim to conventional WLCG storage interfaces, in order to add Erasure coded distributed resilience to data.
One issue with simple EC models is that, while they can recover from losses without needing additional full copies of the data, recovery often involves reading all of the distributed chunks of the file (and their...
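A toy single-parity example of why reconstruction touches the surviving chunks (real erasure codes such as Reed-Solomon differ in detail, but the read pattern is the point):

    def xor_chunks(chunks):
        """Bytewise XOR of equal-length chunks."""
        out = bytearray(len(chunks[0]))
        for chunk in chunks:
            for i, b in enumerate(chunk):
                out[i] ^= b
        return bytes(out)

    data_chunks = [b"AAAA", b"BBBB", b"CCCC"]   # striped data
    parity = xor_chunks(data_chunks)            # single parity chunk

    # Recovering one lost chunk requires reading *all* surviving chunks + parity
    recovered = xor_chunks([data_chunks[0], data_chunks[2], parity])
    assert recovered == data_chunks[1]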
Maintainability is a critical issue for large scale, widely used software systems, characterized by a long life cycle. It is of paramount importance for a software toolkit, such as Geant4, which is a key instrument for research and industrial applications in many fields, not limited to high energy physics.
Maintainability is related to a number of objective metrics associated with...
Consolidation towards more computing at flat budgets, beyond what pure chip technology
can offer, is a requirement for the full scientific exploitation of the future data from the
Large Hadron Collider. One consolidation measure is to exploit cloud infrastructures whenever
they are financially competitive. We report on the technical solutions used and the performance
achieved running...
The ATLAS Experiment at the LHC has been recording data from proton-proton collisions at 13 TeV
center-of-mass energy since spring 2015. The ATLAS collaboration has set up, updated
and optimized a fast physics monitoring framework (TADA) to automatically perform a broad
range of validation and to scan for signatures of new physics in the rapidly growing data.
TADA is designed to provide fast...
The ATLAS Metadata Interface (AMI) is a mature application with more than 15 years of existence.
Mainly used by the ATLAS experiment at CERN, it consists of a very generic tool ecosystem for
metadata aggregation and cataloguing. We briefly describe the architecture, the main services
and the benefits of using AMI in big collaborations, especially for high energy physics.
We focus on the...
The ATLAS experiment explores new hardware and software platforms that, in the future,
may be more suited to its data intensive workloads. One such alternative hardware platform
is the ARM architecture, which is designed to be extremely power efficient and is found
in most smartphones and tablets.
CERN openlab recently installed a small cluster of ARM 64-bit evaluation prototype servers....
The LHCb Vertex Locator (VELO) is a silicon strip semiconductor detector operating at a distance of just 8 mm from the LHC beams. Its 172,000 strips are read out at a frequency of 1 MHz and processed by off-detector FPGAs followed by a PC cluster that reduces the event rate to about 10 kHz. During the second run of the LHC, which lasts from 2015 until 2018, the detector performance will undergo continued...
The exploitation of volunteer computing resources has become a popular practice in the HEP computing community because of the huge amount of potential computing power it provides. In recent HEP experiments, grid middleware has been used to organize the services and the resources; however, it relies heavily on X.509 authentication, which is at odds with the untrusted nature of volunteer...
In this paper we explain how the C++ code quality is managed in ATLAS using a range of tools from compile-time through to run time testing and reflect on the substantial progress made in the last two years largely through the use of static analysis tools such as Coverity®, an industry-standard tool which enables quality comparison with general open source C++ code. Other available code...
CBM is a heavy-ion experiment at the future FAIR facility in
Darmstadt, Germany. Featuring self-triggered front-end electronics and
free-streaming read-out, event selection will be done exclusively by
the First Level Event Selector (FLES). Designed as an HPC cluster,
its task is the online analysis and selection of
the physics data at a total input data rate exceeding 1 TByte/s. To
allow...
The CMS experiment collects and analyzes large amounts of data coming from high energy particle collisions produced by the Large Hadron Collider (LHC) at CERN. This involves a huge amount of real and simulated data processing that needs to be handled in batch-oriented platforms. The CMS Global Pool of computing resources provides more than 100K dedicated CPU cores and another 50K to 100K CPU cores from...
CMS deployed a prototype infrastructure based on Elasticsearch that stores all ClassAds from the global pool. This includes detailed information on IO, CPU, datasets, etc. for all analysis as well as production jobs. We will present initial results from analyzing this wealth of data, describe lessons learned, and plans for the future to derive operational benefits from analyzing this...
One of the primary objectives of the research on GEMs at CERN is the testing and simulation of prototypes, manufacturing of large-scale GEM detectors and installation into CMS detector sections at the outer layer, where only highly energetic muons are detected. When a muon traverses a GEM detector, it ionizes the gas molecules, generating a freely moving electron that starts...
One of the difficulties experimenters encounter when using a modular event-processing framework is determining the appropriate configuration for the workflow they intend to execute. A typical solution is to provide documentation external to the C++ code source that explains how a given component of the workflow is to be configured. This solution is fragile, because the documentation and the...
Throughout the first year of LHC Run 2, ATLAS Cloud Computing has undergone
a period of consolidation, characterized by building upon previously established systems,
with the aim of reducing operational effort, improving robustness, and reaching higher scale.
This paper describes the current state of ATLAS Cloud Computing.
Cloud activities are converging on a common contextualization...
The Belle II experiment is the upgrade of the highly successful Belle experiment located at the KEKB asymmetric-energy e+e- collider at KEK in Tsukuba, Japan. The Belle experiment collected e+e- collision data at or near the centre-of-mass energies corresponding to $\Upsilon(nS)$ ($n\leq 5$) resonances between 1999 and 2010 with a total integrated luminosity of 1 ab$^{-1}$. The data...
The LHC has planned a series of upgrades culminating in the High Luminosity LHC (HL-LHC) which will have
an average luminosity 5-7 times larger than the nominal Run-2 value. The ATLAS Tile Calorimeter (TileCal) will
undergo an upgrade to accommodate the HL-LHC parameters. The TileCal read-out electronics will be redesigned,
introducing a new read-out strategy.
The photomultiplier signals...
CERN has been archiving data on tapes in its Computer Center for decades and its archive system is now holding more than 135 PB of HEP data in its premises on high density tapes.
For the last 20 years, tape areal bit density has been doubling every 30 months, closely following HEP data growth trends. During this period, bits on the tape magnetic substrate have been shrinking exponentially;...
Data Flow Simulation of the ALICE Computing System with OMNET++
Rifki Sadikin, Furqon Hensan Muttaqien, Iosif Legrand, Pierre Vande Vyvre for the ALICE Collaboration
The ALICE computing system will be entirely upgraded for Run 3 to address the major challenge of sampling the full 50 kHz Pb-Pb interaction rate, increasing the present limit by a factor of 100. We present, in this...
This contribution reports on the feasibility of executing data intensive workflows on Cloud infrastructures. In order to assess this, the metric ETC = Events/Time/Cost is formed, which quantifies the different workflow and infrastructure configurations that are tested against each other.
In these tests ATLAS reconstruction jobs are run, examining the effects of overcommitting (more parallel...
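For clarity, the metric can be evaluated as in the toy computation below; the numbers are made up and are not results from these tests.

    // Illustrative only: evaluating ETC = Events/Time/Cost with hypothetical numbers.
    #include <cstdio>

    int main()
    {
      const double events    = 50000.0;   // hypothetical number of processed events
      const double hours     = 10.0;      // hypothetical wall-clock time
      const double costUnits = 25.0;      // hypothetical billed cost

      std::printf("ETC = %.1f events per hour per cost unit\n",
                  events / hours / costUnits);   // 50000/10/25 = 200
      return 0;
    }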
We review and demonstrate the design of efficient data transfer nodes (DTNs), from the perspectives of the highest throughput over both local and wide area networks, as well as the highest performance per unit cost. A careful system-level design is required for the hardware, firmware, OS and software components. Furthermore, additional tuning of these components, and the identification and...
The ATLAS Metadata Interface (AMI) is a mature application with more than 15 years of existence.
Mainly used by the ATLAS experiment at CERN, it consists of a very generic tool ecosystem
for metadata aggregation and cataloguing. AMI is used by the ATLAS production system,
therefore the service must guarantee a high level of availability. We describe our monitoring system
and the...
With many parts of the world having run out of IPv4 address space and the Internet Engineering Task Force (IETF) deprecating IPv4, the use of and migration to IPv6 is becoming a pressing issue. A significant amount of effort has already been expended by the HEPiX IPv6 Working Group (http://hepix-ipv6.web.cern.ch/) on testing dual-stacked hosts and IPv6-only CPU resources. The Queen Mary grid...
Nowadays, high energy physics experiments produce large amounts of data. These data are stored in massive storage systems, which need to balance cost, performance and manageability. HEP is a typical data-intensive application, processing large volumes of data to achieve scientific discoveries. A hybrid storage system including SSD (Solid-State Drive) and HDD (Hard Disk Drive) layers...
Monte Carlo (MC) simulation production plays an important part in the physics analysis of the Alpha Magnetic Spectrometer (AMS-02) experiment. To facilitate metadata retrieval for data analysis among the millions of database records, we developed a monitoring tool to analyze and visualize the production status and progress. In this paper, we discuss the workflow of the...
ALICE (A Large Ion Collider Experiment) is the heavy-ion detector designed to study the physics of strongly interacting matter and the quark-gluon plasma at the CERN Large Hadron Collider (LHC). A major upgrade of the experiment is planned for 2020. In order to cope with a data rate 100 times higher and with the continuous readout of the Time Projection Chamber (TPC), it is necessary to...
The growing use of private and public clouds, and volunteer computing are driving significant changes in the way large parts of the distributed computing for our communities are carried out. Traditionally HEP workloads within WLCG were almost exclusively run via grid computing at sites where site administrators are responsible for and have full sight of the execution environment. The...
The long-standing problem of reconciling the cosmological evidence for the existence of dark matter with the lack of any clear experimental observation of it has recently revived the idea that the new particles are not directly connected with the Standard Model gauge fields, but only through mediator fields or ''portals'', connecting our world with new ''secluded'' or ''hidden'' sectors. One...
The trigger system of the ATLAS detector at the LHC is a combination of hardware, firmware and software, associated to various sub-detectors that must seamlessly cooperate in order to select 1 collision of interest out of every 40,000 delivered by the LHC every millisecond. This talk will discuss the challenges, workflow and organization of the ongoing trigger software development, validation...
The new generation of high energy physics (HEP) experiments produces gigantic amounts of data. Storing and accessing those data with high performance challenges the availability, scalability, and I/O performance of the underlying massive storage system. At the same time, research focusing on big data has become more and more active, and the research about metadata...
Requests for computing resources from LHC experiments are constantly
mounting, and so is their peak usage. Since dimensioning
a site to handle the peak usage times is impractical due to
constraints on resources that many publicly-owned computing centres
have, opportunistic usage of resources from external, even commercial
cloud providers is becoming more and more interesting, and is even...
The CMS experiment at LHC relies on HTCondor and glideinWMS as its primary batch and pilot-based Grid provisioning systems. Given the scale of the global queue in CMS, the operators found it increasingly difficult to monitor the pool to find problems and fix them. The operators had to rely on several different web pages, with several different levels of information, and sifting tirelessly...
CRAB3 is a tool used by more than 500 users all over the world for distributed Grid analysis of CMS data. Users can submit sets of Grid jobs with similar requirements (tasks) with a single user request. CRAB3 uses a client-server architecture, where a lightweight client, a server, and ancillary services work together and are maintained by CMS operators at CERN.
As with most complex...
The computing infrastructures serving the LHC experiments have been
designed to cope at most with the average amount of data recorded. The
usage peaks, as already observed in Run-I, may however generate large
backlogs, thus delaying the completion of the data reconstruction and
ultimately the data availability for physics analysis. In order to
cope with the production peaks, the LHC...
The use of opportunistic cloud resources by HEP experiments has significantly increased over the past few years. Clouds that are owned or managed by the HEP community are connected to the LHCONE network or the research network with global access to HEP computing resources. Private clouds, such as those supported by non-HEP research funds are generally connected to the international...
The algorithms and infrastructure of the CMS offline software are under continuous change in order to adapt to a changing accelerator, detector and computing environment. In this presentation, we discuss the most important technical aspects of this evolution, the corresponding gains in performance and capability, and the prospects for continued software improvement in the face of challenges...
Ceph based storage solutions and especially object storage systems based on it are now well recognized and widely used across the HEP/NP community. Both object storage and block storage layers of Ceph are now supporting production ready services for HEP/NP experiments at many research organizations across the globe, including CERN and Brookhaven National Laboratory (BNL), and even the Ceph...
The researchers at the Google Brain team released their second-generation deep learning library, TensorFlow, as an open-source package under the Apache 2.0 license in November 2015. Google had already deployed the first-generation library, DistBelief, in various systems such as Google Search, advertising systems, speech recognition systems, Google Images, Google Maps, Street View, Google...
The High Luminosity LHC (HL-LHC) is a project to increase the luminosity of the Large Hadron Collider to 5*10^34 cm-2 s-1. The CMS experiment is planning a major upgrade in order to cope with an expected average number of overlapping collisions per bunch crossing of 140. The dataset sizes will increase by several orders of magnitude, and so will the demand for larger computing...
We present a new experiment management system for the SND detector at the VEPP-2000 collider (Novosibirsk). An important part of it is operator access to the experimental databases (configuration, conditions and metadata).
The system is designed in a client-server architecture. A user interacts with it via a web interface. The server side includes several logical layers: user...
The BESIII experiment located in Beijing is an electron-positron collision experiment to study Tau-Charm physics. Now in its middle age, BESIII has accumulated more than 1 PB of raw data, and a distributed computing system based on DIRAC has been built up and in production since 2012 to deal with peak demands. Nowadays clouds have become a popular way to provide resources among BESIII...
The high precision experiment PANDA is specifically designed to shed new light on the structure and properties of hadrons. PANDA is a fixed-target antiproton-proton experiment and will be part of the Facility for Antiproton and Ion Research (FAIR) in Darmstadt, Germany. When measuring total cross sections or determining the properties of intermediate states very precisely, e.g. via the energy...
Processing of the large amount of data produced by the ATLAS experiment requires fast and reliable access to what we call Auxiliary Data Files (ADF). These files, produced by Combined Performance, Trigger and Physics groups, contain conditions, calibrations, and other derived data used by the ATLAS software. In ATLAS this data has, thus far for historical reasons, been collected and accessed...
Accurate simulation of calorimeter response for high energy electromagnetic
particles is essential for the LHC experiments. Detailed simulation of the
electromagnetic showers using Geant4 is however very CPU intensive and
various fast simulation methods have been proposed instead. The frozen shower
simulation replaces the full propagation of showers with energies
below 1 GeV by showers...
The current tier-0 processing at CERN is done on two managed sites, the CERN computer centre and the Wigner computer centre. With the proliferation of public cloud resources at increasingly competitive prices, we have been investigating how to transparently increase our compute capacity to include these providers. The approach taken has been to integrate these resources using our existing...
Throughout the last decade the Open Science Grid (OSG) has been fielding requests from user communities, resource owners, and funding agencies to provide information about utilization of OSG resources. Requested data include traditional “accounting” - core-hours utilized - as well as user’s certificate Distinguished Name, their affiliations, and field of science. The OSG accounting service,...
It is well known that submitting jobs to the grid and transferring the
resulting data are not trivial tasks, especially when users are required
to manage their own X.509 certificates. Asking users to manage their
own certificates means that they need to keep the certificates secure,
remember to renew them periodically, frequently create proxy
certificates, and make them available to...
grid-control is an open source job submission tool that supports common HEP workflows.
Since 2007 it has been used by a number of HEP analyses to process tasks which routinely reach the order of tens of thousands of jobs.
The tool is very easy to deploy, either from its repository or from the Python Package Index (PyPI). The project aims at being lightweight and portable. It can run in...
The Belle II experiment at the SuperKEKB e+e- accelerator is preparing to take first collision data next year. For the success of the experiment it is essential to have information about varying conditions available in the simulation, reconstruction, and analysis code.
The interface to the conditions data in the client code was designed to make life for developers as easy as possible....
Argonne provides a broad portfolio of computing resources to researchers. Since 2011 we have been providing a cloud computing resource to researchers, primarily using Openstack. Over the last year we’ve been working to better support containers in the context of HPC. Several of our operating environments now leverage a combination of the three technologies which provides infrastructure...
The Scientific Computing Department of the STFC runs a cloud service for internal users and various user communities. The SCD Cloud is configured using a Configuration Management System called Aquilon. Many of the virtual machine images are also created/configured using Aquilon. These are not unusual; however, our integrations also allow Aquilon to be altered by the Cloud. For instance, creation...
IPv4 network addresses are running out and the deployment of IPv6 networking in many places is now well underway. Following the work of the HEPiX IPv6 Working Group, a growing number of sites in the Worldwide Large Hadron Collider Computing Grid (WLCG) have deployed dual-stack IPv6/IPv4 services. The aim of this is to support the use of IPv6-only clients, i.e. worker nodes, virtual machines or...
Hybrid systems are emerging as an efficient solution in the HPC arena, with an abundance of approaches for integration of accelerators into the system (i.e. GPU, FPGA). In this context, one of the most important features is the chance of being able to address the accelerators, whether they be local or off-node, on an equal footing. Correct balancing and high performance in how the network...
We present an overview of Data Processing and Data Quality (DQ) Monitoring for the ATLAS Tile Hadronic
Calorimeter. Calibration runs are monitored from a data quality perspective and used as a cross-check for physics
runs. Data quality in physics runs is monitored extensively and continuously. Any problems are reported and
immediately investigated. The DQ efficiency achieved was 99.6% in 2012...
The SDN Next Generation Integrated Architecture (SDN-NGenIA) program addresses some of the key challenges facing the present and next generations of science programs in HEP, astrophysics, and other fields whose potential discoveries depend on their ability to distribute, process and analyze globally distributed petascale to exascale datasets.
The SDN-NGenIA system under development by the...
A large part of the programs of hadron physics experiments deals with the search for new conventional and exotic hadronic states like e.g. hybrids and glueballs. In a majority of analyses a Partial Wave Analysis (PWA) is needed to identify possible exotic states and to classify known states. Of special interest is the comparison or combination of data from multiple experiments. Therefore, a...
This paper describes GridPP's Vacuum Platform for managing virtual machines (VMs), which has been used to run production workloads for WLCG, other HEP experiments, and some astronomy projects. The platform provides a uniform interface between VMs and the sites they run at, whether the site is organised as an Infrastructure-as-a-Service cloud system such as OpenStack with a push model, or an...
Over the past several years, rapid growth of data has affected many fields of science. This has often resulted in the need for overhauling or exchanging the tools and approaches in the disciplines’ data life cycles, allowing the application of new data analysis methods and facilitating improved data sharing.
The project Large-Scale Data Management and Analysis (LSDMA) of the German Helmholtz...
The LHCb collaboration is one of the four major experiments at the Large Hadron Collider at CERN. Petabytes of data are generated by the detectors and Monte-Carlo simulations. The LHCb Grid interware LHCbDIRAC is used to make data available to all collaboration members around the world. The data is replicated to the Grid sites in different locations. However, disk storage on the Grid is...
Performing efficient resource provisioning is a fundamental aspect for any resource provider. Local Resource Management Systems (LRMS) have been used in data centers for decades in order to obtain the best usage of the resources, providing their fair usage and partitioning for the users. In contrast, current cloud schedulers are normally based on the immediate allocation of resources on a...
Limits on power dissipation have pushed CPUs to grow in parallel processing capabilities rather than clock rate, leading to the rise of "manycore" or GPU-like processors. In order to achieve the best performance, applications must be able to take full advantage of vector units across multiple cores, or some analogous arrangement on an accelerator card. Such parallel performance is becoming a...
The Cherenkov Telescope Array (CTA) – an array of many tens of Imaging Atmospheric Cherenkov Telescopes deployed on an unprecedented scale – is the next-generation instrument in the field of very high energy gamma-ray astronomy. An average data stream of about 0.9 GB/s for about 1300 hours of observation per year is expected, therefore resulting in 4 PB of raw data per year and a total of 27...
The INFN project KM3NeT-Italy, supported with Italian PON (National Operative Programs) funding, has designed a distributed Cherenkov neutrino telescope for collecting photons emitted along the path of the charged particles produced in neutrino interactions. The detector consists of 8 vertical structures, called towers, instrumented with a total number of 672 Optical Modules (OMs) and its...
The reconstruction of charged particles trajectories is a crucial task for most particle physics
experiments. The high instantaneous luminosity achieved at the LHC leads to a high number
of proton-proton collisions per bunch crossing, which has put the track reconstruction
software of the LHC experiments through a thorough test. Preserving track reconstruction
performance under...
The axion is a dark matter candidate and is believed to offer a solution to the strong CP problem in QCD [1]. CULTASK (CAPP Ultra-Low Temperature Axion Search in Korea) is an axion search experiment being performed at the Center for Axion and Precision Physics Research (CAPP), Institute for Basic Science (IBS) in Korea. Based on Sikivie’s method [2], CULTASK uses a resonant cavity...
More than one thousand physicists analyse data collected by the ATLAS experiment at the Large Hadron Collider (LHC) at CERN through 150 computing facilities around the world. Efficient distributed analysis requires optimal resource usage and the interplay of several
factors: robust grid and software infrastructures, and system capability to adapt to different workloads. The continuous...
Over the last seven years the software stack of the next generation B factory experiment Belle II has grown to over 400,000 lines of C++ and python code, counting only the part included in offline software releases. There are several thousand commits to the central repository by about 100 individual developers per year. To keep a coherent software stack of high quality such that it can be...
SWAN is a novel service to perform interactive data analysis in the cloud. SWAN allows users to write and run their data analyses with only a web browser, leveraging the widely-adopted Jupyter notebook interface. The user code, executions and data live entirely in the cloud. SWAN makes it easier to produce and share results and scientific code, access scientific software, produce tutorials and...
Open City Platform (OCP) is an industrial research project funded by the Italian Ministry of University and Research, started in 2014. It aims to research, develop and test new open, interoperable and on-demand usable technological solutions in the field of Cloud Computing, along with new sustainable organizational models for the public administration, in order to innovate, with scientific results,...
In particle physics, workflow management systems are primarily used as tailored solutions in dedicated areas such as Monte Carlo production. However, physicists performing data analyses are usually required to steer their individual workflows manually which is time-consuming and often leads to undocumented relations between particular workloads.
We present a generic analysis design pattern...
As a new approach to resource management, virtualization technology is more and more widely applied in the high-energy physics field. A virtual computing cluster based on OpenStack was built at IHEP, with HTCondor as the job queue management system. An accounting system which records in detail the resource usage of the different experiment groups was also developed. There are two types of the...
The Deep Underground Neutrino Experiment (DUNE) will employ a uniquely large (40 kt) Liquid Argon Time Projection Chamber as the main component of its Far Detector. In order to validate this design and characterize the detector performance, an ambitious experimental program (called "protoDUNE") has been created, which includes a beam test of a large-scale DUNE prototype at CERN. The amount of...
With the LHC Run2, end user analyses are increasingly challenging for both users and resource providers.
On the one hand, boosted data rates and more complex analyses favor and require larger data volumes to be processed.
On the other hand, efficient analyses and resource provisioning require fast turnaround cycles.
This puts the scalability of analysis infrastructures to new...
The reconstruction and identification of charmed hadron decays provides an important tool for the study of heavy quark behavior in the Quark Gluon Plasma. Such measurements require high resolution to topologically identify decay daughters at vertices displaced <100 microns from the primary collision vertex, placing stringent demands on track reconstruction software. To enable these...
The LArIAT Liquid Argon Time Projection Chamber (TPC) in a Test Beam experiment explores the interaction of charged particles such as pions, kaons, electrons, muons and protons within the active liquid argon volume of the TPC detector. The LArIAT experiment started data collection at the Fermilab Test Beam Facility (FTBF) in April 2015 and continues to run in 2016. LArIAT provides important...
The VecGeom geometry library is a relatively recent effort aiming to provide
a modern and high performance geometry service for particle-detector simulation
in hierarchical detector geometries common to HEP experiments.
One of its principal targets is the effective use of vector SIMD hardware
instructions to accelerate geometry calculations for single-track as well
as multiple-track...
Over the past few years, Grid Computing technologies have reached a high
level of maturity. One key aspect of this success has been the development and adoption of newer Compute Elements to interface the external Grid users with local batch systems. These new Compute Elements allow for better handling of jobs requirements and a more precise management of diverse local resources.
However,...
The IceCube Neutrino Observatory is a cubic kilometer neutrino telescope located at the Geographic South Pole. IceCube collects 1 TB of data every day. An online filtering farm processes this data in real time and selects 10% to be sent via satellite to the main data center at the University of Wisconsin-Madison. IceCube has two year-round on-site operators. New operators are hired every year,...
One of the large challenges of future particle physics experiments is the trend to run without a first-level hardware trigger. The typical data rates easily exceed hundreds of GBytes/s, which is far too much to be stored permanently for offline analysis. Therefore a strong data reduction has to be performed by selecting only those data which are physically interesting. This implies that all...
Clouds and virtualization are typically used in computing centers to satisfy diverse needs: different operating systems, software releases or fast server/service delivery. On the other hand, solutions relying on Linux kernel capabilities such as Docker are well suited for application isolation and software development. In our previous work (Docker experience at INFN-Pisa Grid Data Center*) we...
ATLAS track reconstruction code is continuously evolving to match the demands from the increasing instantaneous luminosity of LHC, as well as the increased centre-of-mass energy. With the increase in energy, events with dense environments, e.g. the cores of jets or boosted tau leptons, become much more abundant. These environments are characterised by charged particle separations on the order...
The Toolkit for Multivariate Analysis (TMVA) is a component of the ROOT data analysis framework and is widely used for classification problems. For example, TMVA might be used for the binary classification problem of distinguishing signal from background events.
The classification methods included in TMVA are standard, well-known machine learning techniques which can be implemented in other...
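For readers unfamiliar with the toolkit, a minimal TMVA training for such a signal-versus-background problem could be set up as in the sketch below; the input trees, the variables x and y and the booking options are placeholders and are not taken from any particular analysis (ROOT 6.x Factory/DataLoader API).

    // Minimal sketch of a TMVA binary classification training (ROOT 6.x API).
    // sigTree/bkgTree and the variables "x", "y" are hypothetical placeholders.
    #include "TFile.h"
    #include "TTree.h"
    #include "TMVA/DataLoader.h"
    #include "TMVA/Factory.h"
    #include "TMVA/Types.h"

    void trainClassifier(TTree* sigTree, TTree* bkgTree)
    {
      TFile out("tmva_output.root", "RECREATE");
      TMVA::Factory    factory("TMVAClassification", &out, "AnalysisType=Classification");
      TMVA::DataLoader loader("dataset");

      loader.AddVariable("x", 'F');            // discriminating variables
      loader.AddVariable("y", 'F');
      loader.AddSignalTree(sigTree, 1.0);      // unit global event weights
      loader.AddBackgroundTree(bkgTree, 1.0);

      factory.BookMethod(&loader, TMVA::Types::kBDT, "BDT",
                         "NTrees=400:MaxDepth=3");   // one example method: a BDT
      factory.TrainAllMethods();
      factory.TestAllMethods();
      factory.EvaluateAllMethods();
    }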
One of the STAR experiment's modular Messaging Interface and Reliable Architecture framework (MIRA) integration goals is to provide seamless and automatic connections with the existing control systems. After an initial proof of concept and operation of the MIRA system as a parallel data collection system for online use and real-time monitoring, the STAR Software and Computing group is now...
The Cloud Area Padovana has been running for almost two years. This is an OpenStack-based scientific cloud, spread across two different sites: the INFN Padova Unit and the INFN Legnaro National Labs.
The hardware resources have been scaled horizontally and vertically, by upgrading some hypervisors and by adding new ones: currently it provides about 1100 cores.
Some in-house developments were...
Containers remain a hot topic in computing, with new use cases and tools appearing every day. Basic functionality such as spawning containers seems to have settled, but topics like volume support or networking are still evolving. Solutions like Docker Swarm, Kubernetes or Mesos provide similar functionality but target different use cases, exposing distinct interfaces and APIs.
The CERN...
This contribution reports on solutions, experiences and recent developments with the dynamic, on-demand provisioning of remote computing resources for analysis and simulation workflows. Local resources of a physics institute are extended by private and commercial cloud sites, ranging from the inclusion of desktop clusters over institute clusters to HPC centers.
Rather than relying on...
Gravitational wave (GW) events can have several possible progenitors, including binary black hole mergers, cosmic string cusps, core-collapse supernovae, black hole-neutron star mergers, and neutron star-neutron star mergers. The latter three are expected to produce an electromagnetic signature that would be detectable by optical and infrared
telescopes. To that end, the LIGO-Virgo...
We investigate the application of a combination of Monte Carlo Tree Search, hierarchical space decomposition, Hough Transform techniques and
parallel computing to the problem of line detection and shape recognition in general.
Paul Hough introduced in 1962 a method for detecting lines in binary images. Extended in the 1970s to the detection of space forms, what
came to be known as the Hough Transform...
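For reference, the voting step of the classic line Hough Transform can be sketched as follows; this is the generic textbook formulation (the hit container, binning and parameter ranges are illustrative), not the accelerated variant investigated here.

    // Generic sketch of the Hough Transform voting step for lines: each hit (x, y)
    // votes for every (theta, rho) bin satisfying rho = x*cos(theta) + y*sin(theta);
    // peaks in the accumulator correspond to line candidates.
    #include <cmath>
    #include <utility>
    #include <vector>

    std::vector<std::vector<int>>
    houghVote(const std::vector<std::pair<double, double>>& hits,
              int nTheta, int nRho, double rhoMax)
    {
      const double pi = std::acos(-1.0);
      std::vector<std::vector<int>> acc(nTheta, std::vector<int>(nRho, 0));
      for (const auto& hit : hits) {
        for (int t = 0; t < nTheta; ++t) {
          const double theta = pi * t / nTheta;   // theta in [0, pi)
          const double rho   = hit.first * std::cos(theta) + hit.second * std::sin(theta);
          const int    r     = static_cast<int>((rho + rhoMax) / (2.0 * rhoMax) * nRho);
          if (r >= 0 && r < nRho) ++acc[t][r];    // cast one vote
        }
      }
      return acc;   // peak finding over acc yields the detected lines
    }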
The all-silicon design of the tracking system of the CMS experiment provides excellent resolution for charged tracks and an efficient tagging of jets. As the CMS tracker, and in particular its pixel detector, underwent repairs and experienced changed conditions with the start of the LHC Run-II in 2015, the position and orientation of each of the 15148 silicon strip and 1440 silicon pixel...
In order to face the LHC luminosity increase planned for the next years, new high-throughput network mechanisms interfacing the detector readout to the software trigger computing nodes are being developed in several CERN experiments.
Adopting many-core computing architectures such as Graphics Processing Units (GPUs) or the Many Integrated Core (MIC) architecture would allow a drastic reduction of the size...
The development of scientific computing is increasingly moving to web and mobile applications. All these clients need high-quality implementations of accessing heterogeneous computing resources provided by clusters, grid computing or cloud computing. We present a web service called SCEAPI and describe how it can abstract away many details and complexities involved in the use of scientific...
With the imminent upgrades to the LHC and the consequent increase of the amount and complexity of data collected by the experiments, CERN's computing infrastructures will be facing a large and challenging demand of computing resources. Within this scope, the adoption of cloud computing at CERN has been evaluated and has opened the doors for procuring external cloud services from providers,...
Events visualisation in ALICE - current status and strategy for Run 3
Jeremi Niedziela for the ALICE Collaboration
A Large Ion Collider Experiment (ALICE) is one of the four big experiments running at the Large Hadron Collider (LHC), which focuses on the study of the Quark-Gluon Plasma (QGP) being produced in heavy-ion collisions.
The ALICE Event Visualisation Environment (AliEVE) is...
General purpose Graphics Processor Units (GPGPU) are being evaluated for possible future inclusion in an upgraded ATLAS High Level Trigger farm. We have developed a demonstrator including GPGPU implementations of Inner Detector and Muon tracking and Calorimeter clustering within the ATLAS software framework. ATLAS is a general purpose particle physics experiment located on the LHC collider at...
The exponentially increasing need for high speed data transfer is driven by big data, cloud computing together with the needs of data intensive science, High Performance Computing (HPC), defense, the oil and gas industry etc. We report on the Zettar ZX software that has been developed since 2013 to meet these growing needs by providing high performance data transfer and encryption in a...
DAMPE is a powerful space telescope launched in December 2015, able to detect electrons and photons in a wide energy range (5 GeV to 10 TeV) and with unprecedented energy resolution. The silicon tracker is a crucial component of the detector, able to determine the direction of detected particles and trace the origin of incoming gamma rays. This contribution covers the reconstruction software of...
In this presentation, the data preparation workflows for Run 2 are
presented. Online data quality uses a new hybrid software release
that incorporates the latest offline data quality monitoring software
for the online environment. This is used to provide fast feedback in
the control room during a data acquisition (DAQ) run, via a
histogram-based monitoring framework as well as the online...
ALICE (A Large Ion Collider Experiment) is one of the four major experiments at the Large Hadron Collider (LHC) at CERN.
The High Level Trigger (HLT) is an online compute farm which reconstructs events measured by the ALICE detector in real-time.
The most compute-intensive part is the reconstruction of particle trajectories called tracking and the most important detector for tracking is the...
INDIGO-DataCloud (INDIGO for short, https://www.indigo-datacloud.eu) is a project started in April 2015, funded under the EC Horizon 2020 framework program. It includes 26 European partners located in 11 countries and addresses the challenge of developing open source software, deployable in the form of a data/computing platform, aimed to scientific communities and designed to be deployed on...
The CERN Computer Security Team is assisting teams and individuals at CERN who want to address security concerns related to their computing endeavours. For projects in the early stages, we help incorporate security in system architecture and design. For software that is already implemented, we do penetration testing. For particularly sensitive components, we perform code reviews. Finally, for...
In 2019 the Large Hadron Collider will undergo upgrades in order to increase the luminosity by a factor of two compared to today's nominal luminosity. Currently, the CMS software parallelization strategy is oriented at scheduling one event per thread. However, tracking timing performance depends on the factorial of the pileup, leading the current approach to increased latency. When designing a HEP...
At the beginning, HEP experiments made use of photographic images both to record and store experimental data and to illustrate their findings. Then the experiments evolved and needed to find ways to visualize their data. With the availability of computer graphics, software packages to display event data and the detector geometry started to be developed. Here a brief history of event displays...
The main goal of the project is to demonstrate the ability to use HTTP data
federations in a manner analogous to today's AAA infrastructure used by
the CMS experiment. An initial testbed at Caltech has been built and
changes in the CMS software (CMSSW) are being implemented in order to
improve HTTP support. A set of machines is already set up at the Caltech
Tier2 in order to improve the...
The INDIGO-DataCloud project's ultimate goal is to provide a sustainable European software infrastructure for science, spanning multiple computer centers and existing public clouds.
The participating sites form a set of heterogeneous infrastructures, some running OpenNebula, some running OpenStack. There was the need to find a common denominator for the deployment of both the required PaaS...
Modern web browsers are powerful and sophisticated applications that support an ever-wider range of uses. One such use is rendering high-quality, GPU-accelerated, interactive 2D and 3D graphics in an HTML canvas. This can be done via WebGL, a JavaScript API based on OpenGL ES. Applications delivered via the browser have several distinct benefits for the developer and user. For example, they...
Data federations have become an increasingly common tool for large collaborations such as CMS and ATLAS to efficiently distribute large data files. Unfortunately, these typically come with weak namespace semantics and a non-POSIX API. On the other hand, CVMFS has provided a POSIX-compliant read-only interface for use cases with a small working set size (such as software distribution). The...
For over a decade, X.509 proxy certificates have been used in High Energy Physics (HEP) to authenticate users and guarantee their membership in Virtual Organizations, on which subsequent authorization, e.g. for data access, is based. Although the established infrastructure has worked well and provided sufficient security, the implementation of procedures and the underlying software is often seen as...
The increase in instantaneous luminosity, number of interactions per bunch crossing and detector granularity will pose an interesting challenge for the event reconstruction and the High Level Trigger system in the CMS experiment at the High Luminosity LHC (HL-LHC), as the amount of information to be handled will increase by 2 orders of magnitude. In order to reconstruct the Calorimetric...
In the competitive 'market' for large-scale storage solutions, EOS has been showing its excellence in the multi-Petabyte high-concurrency regime. It has also shown a disruptive potential in powering the CERNBox service in providing sync&share capabilities and in supporting innovative analysis environments along the storage of LHC data. EOS has also generated interest as generic storage...
JUNO (Jiangmen Underground Neutrino Observatory) is a multi-purpose neutrino experiment designed to measure the neutrino mass hierarchy and mixing parameters. JUNO is estimated to be in operation in 2019 with 2PB/year raw data rate. The IHEP computing center plans to build up virtualization infrastructure to manage computing resources in the coming years and JUNO is selected to be one of the...
Efficient and precise reconstruction of the primary vertex in
an LHC collision is essential in both the reconstruction of the full
kinematic properties of a hard-scatter event and of soft interactions as a
measure of the amount of pile-up. The reconstruction of primary vertices in
the busy, high pile-up environment of Run-2 of the LHC is a challenging
task. New methods have been developed by...
In view of Run 3 (2020) the LHCb experiment is planning a major upgrade to fully read out events at the 40 MHz collision rate. This is in order to greatly increase the statistics of the collected samples and to go further in precision beyond Run 2. An unprecedented amount of data will be produced, which will be fully reconstructed in real time to perform fast selection and categorization of interesting events....
ParaView [1] is a high performance visualization application not widely used in HEP. It is a long-standing open source project led by Kitware [2], involves several DOE and DOD laboratories, and has been adopted by many DOE supercomputing centers and other sites. ParaView is unique in speed and efficiency by using state-of-the-art techniques developed by the academic visualization community...
When first looking at converting a part of our site’s grid infrastructure into a cloud-based system in late 2013, we needed to ensure the continued accessibility of all of our resources during a potentially lengthy transition period.
Moving a limited number of nodes to the cloud proved ineffective as users expected a significant number of cloud resources to be available to justify the effort...
Randomly restoring files from tapes degrades the read performance primarily due to frequent tape mounts. The high latency of the time-consuming tape mount and dismount operations is a major issue when accessing massive amounts of data from tape storage. BNL's mass storage system currently holds more than 80 PB of data on tapes, managed by HPSS. To restore files from HPSS, we make use of a scheduler...
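The basic idea behind such scheduling can be illustrated with the toy sketch below, which groups pending restore requests by tape so that each tape needs to be mounted only once; the data structures are assumed for illustration, and this is not the production HPSS scheduler.

    // Toy illustration (not the production scheduler): batch restore requests
    // per tape so that each tape is mounted only once; within a batch the files
    // would additionally be ordered by their position on the tape before reading.
    #include <map>
    #include <string>
    #include <vector>

    struct RestoreRequest { std::string file; std::string tapeId; };

    std::map<std::string, std::vector<std::string>>
    batchByTape(const std::vector<RestoreRequest>& requests)
    {
      std::map<std::string, std::vector<std::string>> batches;
      for (const auto& r : requests)
        batches[r.tapeId].push_back(r.file);   // one batch per tape
      return batches;
    }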
Reproducibility is a fundamental piece of the scientific method and increasingly complex problems demand ever wider collaboration between scientists. To make research fully reproducible and accessible to collaborators a researcher has to take care of several aspects: research protocol description, data access, preservation of the execution environment, workflow pipeline, and analysis script...
This talk will present the results of recent developments to support new users from the Large Synoptic Survey Telescope (LSST) group on the GridPP DIRAC instance. I will describe a workflow used for galaxy shape identification analyses whilst highlighting specific challenges as well as the solutions currently being explored. The result of this work allows this community to make best use of...
The 2020 upgrade of the LHCb detector will vastly increase the rate of collisions the Online system needs to process in software, in order to filter events in real time. 30 million collisions per second will pass through a selection chain, where each step is executed conditional to its prior acceptance.
The Kalman Filter is a fit applied to all reconstructed tracks which, due to its time...
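As background, one predict/update step of a scalar Kalman filter is sketched below; this is the textbook formulation with a trivial constant state model, not the vectorized track fit discussed in this contribution.

    // Textbook scalar Kalman filter step (illustration only): predict with
    // process noise q, then update with a measurement m of variance r.
    struct KalmanState { double x; double P; };   // state estimate and its variance

    KalmanState kalmanStep(KalmanState s, double m, double q, double r)
    {
      s.P += q;                           // predict: constant model, variance grows by q
      const double K = s.P / (s.P + r);   // Kalman gain
      s.x += K * (m - s.x);               // pull the estimate towards the measurement
      s.P *= (1.0 - K);                   // updated (reduced) variance
      return s;
    }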
The LHCb detector at the LHC is a general purpose detector in the forward region with a focus on reconstructing decays of c- and b-hadrons. For Run II of the LHC, a new trigger strategy with a real-time reconstruction, alignment and calibration was developed and employed. This was made possible by implementing an offline-like track reconstruction in the high level trigger. However, the ever...
The Pacific Research Platform is an initiative to interconnect Science DMZs between campuses across the West Coast of the United States over a 100 Gbps network. The LHC @ UC is a proof of concept pilot project that focuses on interconnecting 6 University of California campuses. It is spearheaded by computing specialists from the UCSD Tier 2 Center in collaboration with the San Diego...
The Durham High Energy Physics Database (HEPData) has been built up over the past four decades as a unique open-access repository for scattering data from experimental particle physics. It comprises data points from plots and tables underlying over eight thousand publications, some of which are from the Large Hadron Collider (LHC) at CERN.
HEPData has been rewritten from the ground up...
The CRAYFIS experiment proposes the use of private mobile phones as a ground detector for Ultra High Energy Cosmic Rays. Interacting with the Earth's atmosphere, these produce extensive particle showers which can be detected by cameras on mobile phones. A typical shower contains minimally-ionizing particles such as muons. As these interact with the CMOS detector, they leave low-energy tracks that sometimes...
Experimental Particle Physics has been at the forefront of analyzing the world’s largest datasets for decades. The HEP community was the first to develop suitable software and computing tools for this task. In recent times, new toolkits and systems collectively called “Big Data” technologies have emerged to support the analysis of Petabyte and Exabyte datasets in industry. While the principles...
Precise modelling of detectors in simulations is the key to the understanding of their performance, which, in turn, is a prerequisite for the proper design choice and, later, for the achievement of valid physics results. In this report,
we describe the implementation of the Silicon Tracking System (STS), the main tracking device of the CBM experiment, in the CBM software environment. The STS...
Reproducibility is an essential component of the scientific process.
It is often necessary to check whether multiple runs of the same software
produce the same result. This may be done to validate whether a new machine
produces correct results on old software, whether new software produces
correct results on an old machine, or to compare the equality of two different approaches to the...
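In its most basic form, such a check is a bit-for-bit comparison of the outputs of two runs; the sketch below illustrates only that baseline (the file names are placeholders), independent of the validation machinery described here.

    // Minimal baseline check (illustration only): are two run outputs
    // bit-for-bit identical? The file names are placeholders.
    #include <algorithm>
    #include <fstream>
    #include <iostream>
    #include <iterator>

    bool sameOutput(const char* fileA, const char* fileB)
    {
      std::ifstream a(fileA, std::ios::binary);
      std::ifstream b(fileB, std::ios::binary);
      return a && b &&
             std::equal(std::istreambuf_iterator<char>(a), std::istreambuf_iterator<char>(),
                        std::istreambuf_iterator<char>(b), std::istreambuf_iterator<char>());
    }

    int main()
    {
      std::cout << (sameOutput("run1.out", "run2.out") ? "identical" : "differ") << "\n";
      return 0;
    }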
We describe the development and deployment of a distributed campus computing infrastructure consisting of a single job submission portal linked to multiple local campus resources, as well as the wider computational fabric of the Open Science Grid (OSG). Campus resources consist of existing OSG-enabled clusters and clusters with no previous interface to the OSG. Users accessing the single...
Following the European Strategy for Particle Physics update 2013, the study explores different designs of circular colliders for the post-LHC era. Reaching unprecedented energies and luminosities requires understanding system reliability behaviour from the concept phase onwards and designing for availability and sustainable operation. The study explores industrial approaches to model and simulate the...
The Global Science experimental Data hub Center (GSDC) at the Korea Institute of Science and Technology Information (KISTI), located at Daejeon in South Korea, is the unique data center in the country that supports fundamental research fields dealing with large-scale data by providing its computing resources. For historical reasons it has run the Torque batch system, while recently it has started running HTCondor for...
Moore’s Law has defied our expectations and remained relevant in the semiconductor industry in the past 50 years, but many believe it is only a matter of time before an insurmountable technical barrier brings about its eventual demise. Many in the computing industry are now developing post-Moore’s Law processing solutions based on new and novel architectures. An example is the Micron...
The Large Hadron Collider beauty (LHCb) experiment at CERN specializes in investigating the slight differences between matter and antimatter by studying the decays of beauty or bottom (B) and charm (D) hadrons. The detector has been recording data from proton-proton collisions since 2010. The data preservation (DP) project at LHCb ensures preservation of the experimental and simulated (Monte...
High-energy particle physics (HEP) has advanced greatly over recent years and current plans for the future foresee even more ambitious targets and challenges that have to be coped with. Amongst the many computer technology R&D areas, simulation of particle detectors stands out as the most time consuming part of HEP computing. An intensive R&D and programming effort is required to exploit the...
The CMS experiment has implemented a computing model where distributed monitoring infrastructures are collecting any kind of data and metadata about the performance of the computing operations. This data can be probed further by harnessing Big Data analytics approaches and discovering patterns and correlations that can improve the throughput and the efficiency of the computing model.
CMS...
ALICE (A Large Ion Collider Experiment) is a detector system
optimized for the study of heavy-ion collisions at the
CERN LHC. The ALICE High Level Trigger (HLT) is a computing
cluster dedicated to the online reconstruction, analysis and
compression of experimental data. The High-Level Trigger receives
detector data via serial optical links into custom PCI-Express
based FPGA...
The Belle II experiment will generate very large data samples. In order to reduce the time for data analyses, loose selection criteria will be used to create files rich in samples of particular interest for a specific data analysis (data skims). Even so, many of the resultant skims will be very large, particularly for highly inclusive analyses. The Belle II collaboration is investigating the...
HEP software today is a rich and diverse domain in itself and exists within the mushrooming world of open source software. As HEP software developers and users we can be more productive and effective if our work and our choices are informed by a good knowledge of what others in our community have created or found useful. The HEP Software and Computing Knowledge Base, hepsoftware.org, was...