Conveners
Track 6: Infrastructures: 6.1
- Catherine Biscarat (LPSC Grenoble, IN2P3/CNRS)
Track 6: Infrastructures: 6.2
- Olof Barring (CERN)
Track 6: Infrastructures: 6.3
- Francesco Prelz (Università degli Studi e INFN Milano (IT))
Track 6: Infrastructures: 6.4
- Francesco Prelz (Università degli Studi e INFN Milano (IT))
Track 6: Infrastructures: 6.5
- Olof Barring (CERN)
Track 6: Infrastructures: 6.6
- Catherine Biscarat (LPSC Grenoble, IN2P3/CNRS)
Track 6: Infrastructures: 6.7
- Catherine Biscarat (LPSC Grenoble, IN2P3/CNRS)
Access to and exploitation of large-scale computing resources, such as those offered by general-purpose
HPC centres, is one important measure for ATLAS and the other Large Hadron Collider experiments
to meet the challenge posed by the full exploitation of future data within the constraints of flat budgets.
We report on the effort of moving the Swiss WLCG Tier-2 computing,
serving ATLAS, CMS...
Fifteen Chinese High Performance Computing sites, many of them on the TOP500 list of the most powerful supercomputers, are integrated into a common infrastructure that provides users coherent access through a RESTful interface called SCEAPI. These resources have been integrated into the ATLAS Grid production system using a bridge between ATLAS and SCEAPI which translates the...
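A minimal sketch of how such a bridge might talk to a RESTful gateway, assuming hypothetical endpoints, payload fields and token handling (the actual SCEAPI specification may differ):

```python
# Hypothetical sketch of a bridge submitting a job to a RESTful HPC gateway
# such as SCEAPI; the base URL, payload fields and response fields are
# assumptions for illustration, not the real SCEAPI specification.
import requests

GATEWAY = "https://sceapi.example.cn/api"   # placeholder base URL
TOKEN = "..."                                # placeholder auth token
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

def submit_job(executable, arguments, cores):
    """Translate a simplified job description into a REST submission."""
    payload = {"executable": executable, "arguments": arguments, "cores": cores}
    resp = requests.post(f"{GATEWAY}/jobs", json=payload,
                         headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json()["job_id"]          # assumed response field

def poll_status(job_id):
    """Ask the gateway for the current state of a submitted job."""
    resp = requests.get(f"{GATEWAY}/jobs/{job_id}", headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json()["status"]          # assumed response field
```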
Obtaining CPU cycles on an HPC cluster is nowadays relatively simple and sometimes even cheap for academic institutions. However, in most cases providers of HPC services do not allow changes to the configuration, the implementation of special features, or lower-level control over the computing infrastructure and networks, for example for testing new computing patterns or conducting...
The Open Science Grid (OSG) is a large, robust computing grid that started primarily as a collection of sites associated with large HEP experiments such as ATLAS, CDF, CMS, and DZero, but has evolved in recent years to a much larger user and resource platform. In addition to meeting the US LHC community’s computational needs, the OSG continues to be one of the largest providers of distributed...
ALICE HLT Cluster operation during ALICE Run 2
(Johannes Lehrbach) for the ALICE collaboration
ALICE (A Large Ion Collider Experiment) is one of the four major detectors located at the LHC at CERN, focusing on the study of heavy-ion collisions. The ALICE High Level Trigger (HLT) is a compute cluster which reconstructs the events and compresses the data in real-time. The data compression...
Over the past years an increasing number of CMS computing resources have been offered as clouds, bringing the flexibility of having virtualised compute resources and centralised management of the Virtual Machines (VMs). CMS has adapted its job submission infrastructure from a traditional Grid site to operation using a cloud service and can meanwhile run all types of offline workflows. The cloud...
Brookhaven National Laboratory (BNL) anticipates significant growth in scientific programs with large computing and data storage needs in the near future and has recently re-organized support for scientific computing to meet these needs.
A key component is the enhanced role of the RHIC-ATLAS Computing Facility
(RACF) in support of high-throughput and high-performance computing (HTC and HPC) ...
The Worldwide LHC Computing Grid (WLCG) infrastructure
allows the use of resources from more than 150 sites.
Until recently the setup of the resources and the middleware at a site
were typically dictated by the partner grid project (EGI, OSG, NorduGrid)
to which the site is affiliated.
In recent years, however, changes in hardware, software, funding and
experiment computing requirements have...
The INFN CNAF Tier-1 computing center is composed of two main rooms containing IT resources and four additional locations hosting the technological infrastructure that provides electrical power and cooling to the facility. Power supply and continuity are ensured by a dedicated room with three 15,000 V to 400 V transformers in a separate part of the main building...
1. Statement
OpenCloudMesh has a very simple goal: to be an open and vendor agnostic standard for private cloud interoperability.
To address the YetAnotherDataSilo problem, a working group under the umbrella of the GÉANT Association has been created with the goal of ensuring neutrality and a clear context for this project.
All leading partners of the OpenCloudMesh project - GÉANT,...
The Tier-1 at CNAF is the main INFN computing facility, offering computing and storage resources to more than 30 different scientific collaborations including the 4 experiments at the LHC. A huge increase in computing needs is also foreseen in the following years, mainly driven by the experiments at the LHC (especially from the start of Run 3 in 2021) but also by other upcoming experiments...
The WLCG Tier-1 center GridKa is developed and operated by the Steinbuch Centre for Computing (SCC)
at the Karlsruhe Institute of Technology (KIT). It was the origin of further Big Data research activities and
infrastructures at SCC, e.g. the Large Scale Data Facility (LSDF), providing petabyte scale data storage
for various non-HEP research communities.
Several ideas and plans...
The KEK central computer system (KEKCC) supports various activities at KEK, such as the Belle / Belle II and J-PARC experiments. The system is currently being replaced and will be put into production in September 2016. The computing resources, CPU and storage, of the next system are much enhanced to match the recent increase in demand. We will have 10,000 CPU cores, 13 PB of disk storage,...
At the RAL Tier-1 we have been deploying production services on both bare metal and a variety of virtualisation platforms for many years. Despite the significant simplification of configuration and deployment of services due to the use of a configuration management system, maintaining services still requires a lot of effort. Also, the current approach of running services on static machines...
The HEP prototypical systems at the Supercomputing conferences each year have served to illustrate the ongoing state of the art developments in high throughput, software-defined networked systems important for future data operations at the LHC and for other data intensive programs. The Supercomputing 2015 SDN demonstration revolved around an OpenFlow ring connecting 7 different booths and the...
In today's world of distributed scientific collaborations, there are many challenges to providing reliable inter-domain network infrastructure. Network operators use a combination of
active monitoring and trouble tickets to detect problems, but these are often ineffective at identifying issues that impact wide-area network users. Additionally, these approaches do not scale to wide area...
The Open Science Grid (OSG) relies upon the network as a critical part of the distributed infrastructures it enables. In 2012 OSG added a new focus area in networking with a goal of becoming the primary source of network information for its members and collaborators. This includes gathering, organizing and providing network metrics to guarantee effective network usage and prompt detection and...
The fraction of internet traffic carried over IPv6 continues to grow rapidly. IPv6 support from network hardware vendors and carriers is pervasive and becoming mature. A network infrastructure upgrade often offers sites an excellent window of opportunity to configure and enable IPv6.
There is a significant overhead when setting up and maintaining dual stack machines, so where possible...
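As a concrete illustration of dual-stack behaviour, the following minimal Python sketch resolves a host with getaddrinfo and tries the returned IPv6 and IPv4 addresses in order; the hostname is a placeholder:

```python
# Minimal illustration of dual-stack client behaviour: on a dual-stack host,
# socket.getaddrinfo returns both IPv6 (AF_INET6) and IPv4 (AF_INET)
# addresses, and the client simply tries them in the order given.
import socket

def connect_dual_stack(host, port):
    last_err = None
    for family, socktype, proto, _, addr in socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM):
        try:
            s = socket.socket(family, socktype, proto)
            s.connect(addr)
            return s                      # first working address wins
        except OSError as err:
            last_err = err
    if last_err is None:
        raise OSError("no addresses returned for host")
    raise last_err

# sock = connect_dual_stack("www.example.org", 80)   # placeholder host
```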
Over the last few years, the number of mobile devices connected to the CERN internal network has increased from a handful in 2006 to more than 10,000 in 2015. Wireless access is no longer a “nice to have” or just for conference and meeting rooms, now support for mobility is expected by most, if not all, of the CERN community. In this context, a full renewal of the CERN Wi-Fi network has been...
RapidIO (http://rapidio.org/) technology is a packet-switched high-performance fabric, which has been under active development since 1997. Originally meant to be a front side bus, it developed into a system-level interconnect which is today used in all 4G/LTE base stations worldwide. RapidIO is often used in embedded systems that require high reliability, low latency and scalability in a...
HPC network technologies like Infiniband, TrueScale or OmniPath provide
low-latency and high-throughput communication between hosts, which makes them
attractive options for data-acquisition systems in large-scale high-energy
physics experiments. Like HPC networks, data acquisition networks are local
and include a well specified number of systems. Unfortunately traditional...
In recent years there has been increasing use of HPC facilities for HEP experiments. This has initially focussed on less I/O intensive workloads such as generator-level or detector simulation. We now demonstrate the efficient running of I/O-heavy ‘analysis’ workloads for the ATLAS and ALICE collaborations on HPC facilities at NERSC, as well as astronomical image analysis for DESI.
To do...
Southeast University Science Operation Center (SEUSOC) is one of the computing centers of the Alpha Magnetic Spectrometer (AMS-02) experiment. It provides 2000 CPU cores for AMS scientific computing and a dedicated 1 Gbps Long Fat Network (LFN) for AMS data transmission between SEU and CERN. In this paper, the workflows of SEUSOC Monte Carlo (MC) production are discussed in...
With processor architecture evolution, the HPC market has undergone a paradigm shift. The adoption of low-cost, Linux-based clusters extended HPC’s reach from its roots in modeling and simulation of complex physical systems to a broad range of industries, from biotechnology, cloud computing, computer analytics and big data challenges to manufacturing sectors. In this perspective, the near...
This contribution reports on the remote evaluation of pre-production Intel Omni-Path (OPA) interconnect hardware and software performed by the RHIC & ATLAS Computing Facility (RACF) at BNL between December 2015 and February 2016, using a 32-node “Diamond” cluster with a single Omni-Path Host Fabric Interface (HFI) installed in each node and a single 48-port Omni-Path switch with the non-blocking...
Over the past several years, rapid growth of data has affected many fields of science. This has often resulted in the need for overhauling or exchanging the tools and approaches in the disciplines’ data life cycles, allowing the application of new data analysis methods and facilitating improved data sharing.
The project Large-Scale Data Management and Analysis (LSDMA) of the German Helmholtz...
SWAN is a novel service to perform interactive data analysis in the cloud. SWAN allows users to write and run their data analyses with only a web browser, leveraging the widely-adopted Jupyter notebook interface. The user code, executions and data live entirely in the cloud. SWAN makes it easier to produce and share results and scientific code, access scientific software, produce tutorials and...
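A minimal sketch of the kind of notebook cell a SWAN user might run, assuming a PyROOT-based analysis; which software stacks are actually available depends on the LCG release selected in SWAN:

```python
# The kind of cell a user might execute in a SWAN Jupyter notebook: a small
# PyROOT example that fills and draws a histogram. The available packages
# depend on the software stack chosen when the SWAN session is started.
import ROOT

h = ROOT.TH1F("h", "Gaussian toy;x;entries", 100, -4, 4)
h.FillRandom("gaus", 10000)   # fill with 10k values sampled from a Gaussian

c = ROOT.TCanvas("c")
h.Draw()
c.Draw()                      # in a notebook the canvas is rendered inline
```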
Open City Platform (OCP) is an industrial research project funded by the Italian Ministry of University and Research, started in 2014. It aims to research, develop and test new open, interoperable, on-demand technological solutions in the field of Cloud Computing, along with new sustainable organizational models for public administration, in order to innovate, with scientific results,...
Apache Mesos is a resource management system for large data centres, initially developed by UC Berkeley and now maintained under the Apache Foundation umbrella. It is widely used in industry by companies like Apple, Twitter, and Airbnb, and is known to scale to tens of thousands of nodes. Together with other tools of its ecosystem, like Mesosphere Marathon or Chronos, it provides an end-to-end...
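A hedged sketch of how a long-running service could be registered with Marathon through its REST API; the Marathon URL and the application definition below are illustrative placeholders, not a recommended production setup:

```python
# Illustrative sketch of registering a long-running service with Marathon
# via its REST API (POST /v2/apps); endpoint and app definition are
# placeholders chosen for the example.
import requests

MARATHON = "http://marathon.example.org:8080"   # placeholder endpoint

app = {
    "id": "/demo/web-service",                  # app id in Marathon's namespace
    "cmd": "python3 -m http.server $PORT0",     # command Marathon keeps running
    "cpus": 0.25,                               # fraction of a CPU per instance
    "mem": 128,                                 # MB per instance
    "instances": 2,                             # Marathon maintains two copies
}

resp = requests.post(f"{MARATHON}/v2/apps", json=app, timeout=30)
resp.raise_for_status()
print(resp.json().get("deployments"))
```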
Clouds and virtualization are typically used in computing centers to satisfy diverse needs: different operating systems, software releases or fast delivery of servers and services. On the other hand, solutions relying on Linux kernel capabilities such as Docker are well suited for application isolation and software development. In our previous work (Docker experience at INFN-Pisa Grid Data Center*) we...
Bringing HEP computing to HPC can be difficult. Software stacks are often very complicated with numerous dependencies that are difficult to get installed on an HPC system. To address this issue, amongst others, NERSC has created Shifter, a framework that delivers Docker-like functionality to HPC. It works by extracting images from native formats (such as a Docker image) and converting them to...
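A rough sketch of how a job might drive Shifter, assuming the shifterimg/shifter command-line tools; the exact options can differ between Shifter versions and site installations, so treat the invocations as illustrative:

```python
# Rough sketch of driving Shifter from a Python wrapper: pull a Docker image
# into Shifter's image gateway, then run a command inside it. The exact
# command-line interface may vary between Shifter versions and sites, so
# these invocations should be checked against local documentation.
import subprocess

IMAGE = "docker:python:3.6"   # placeholder image name

# Convert the Docker image into Shifter's flattened format (assumed CLI).
subprocess.run(["shifterimg", "pull", IMAGE], check=True)

# Run a command inside the converted image (assumed CLI).
subprocess.run(["shifter", f"--image={IMAGE}", "python", "--version"],
               check=True)
```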
COTS HPC has evolved for two decades to become an undeniable mainstream computing solution. It represents a major shift away from yesterday’s proprietary, vector-based processors and architectures to modern supercomputing clusters built on open industry standard hardware. This shift enabled the Industry with a cost-effective path to high-performance, scalable and flexible supercomputers (from...
With the imminent upgrades to the LHC and the consequent increase in the amount and complexity of data collected by the experiments, CERN's computing infrastructures will be facing a large and challenging demand for computing resources. Within this scope, the adoption of cloud computing at CERN has been evaluated and has opened the door to procuring external cloud services from providers,...
INDIGO-DataCloud (INDIGO for short, https://www.indigo-datacloud.eu) is a project started in April 2015, funded under the EC Horizon 2020 framework program. It includes 26 European partners located in 11 countries and addresses the challenge of developing open source software, deployable in the form of a data/computing platform, aimed to scientific communities and designed to be deployed on...
The INDIGO-DataCloud project's ultimate goal is to provide a sustainable European software infrastructure for science, spanning multiple computer centers and existing public clouds.
The participating sites form a set of heterogeneous infrastructures, some running OpenNebula, some running OpenStack. There was the need to find a common denominator for the deployment of both the required PaaS...
JUNO (Jiangmen Underground Neutrino Observatory) is a multi-purpose neutrino experiment designed to measure the neutrino mass hierarchy and mixing parameters. JUNO is expected to start operation in 2019 with a raw data rate of 2 PB/year. The IHEP computing center plans to build up a virtualization infrastructure to manage computing resources in the coming years, and JUNO has been selected to be one of the...
When first looking at converting a part of our site’s grid infrastructure into a cloud based system in late 2013 we needed to ensure the continued accessibility of all of our resources during a potentially lengthy transition period.
Moving a limited number of nodes to the cloud proved ineffective as users expected a significant number of cloud resources to be available to justify the effort...
Randomly restoring files from tape degrades read performance, primarily due to frequent tape mounts. The high latency of the time-consuming tape mounts and dismounts is a major issue when accessing massive amounts of data from tape storage. BNL's mass storage system currently holds more than 80 PB of data on tape, managed by HPSS. To restore files from HPSS, we make use of a scheduler...
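A simplified sketch of the ordering idea behind such a scheduler, with an illustrative data structure rather than the actual HPSS or scheduler interface:

```python
# Simplified illustration of ordering tape restores: group requested files
# by tape and sort them by position on the tape, so each tape is mounted
# once and read sequentially. The FileRequest structure is illustrative,
# not the actual HPSS or scheduler interface.
from collections import defaultdict
from typing import NamedTuple, List

class FileRequest(NamedTuple):
    path: str
    tape_id: str
    position: int     # file's ordinal/offset on the tape

def order_restores(requests: List[FileRequest]) -> List[FileRequest]:
    by_tape = defaultdict(list)
    for req in requests:
        by_tape[req.tape_id].append(req)
    ordered = []
    for tape_id in sorted(by_tape):              # one mount per tape
        ordered.extend(sorted(by_tape[tape_id],  # sequential reads per tape
                              key=lambda r: r.position))
    return ordered
```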
The Pacific Research Platform is an initiative to interconnect Science DMZs between campuses across the West Coast of the United States over a 100 Gbps network. The LHC @ UC is a proof-of-concept pilot project that focuses on interconnecting 6 University of California campuses. It is spearheaded by computing specialists from the UCSD Tier 2 Center in collaboration with the San Diego...
We describe the development and deployment of a distributed campus computing infrastructure consisting of a single job submission portal linked to multiple local campus resources, as well as the wider computational fabric of the Open Science Grid (OSG). Campus resources consist of existing OSG-enabled clusters and clusters with no previous interface to the OSG. Users accessing the single...
The Global Science experimental Data hub Center (GSDC) at the Korea Institute of Science and Technology Information (KISTI), located in Daejeon, South Korea, is the only data center in the country that provides computing resources to help fundamental research fields deal with large-scale data. For historical reasons it has run the Torque batch system, while recently it has started running HTCondor for...
We present the consolidated batch system at DESY. As one of the largest resource centres, DESY has to support differing workflows of HEP experiments in WLCG or Belle II as well as local users. By abandoning specific worker-node setups in favour of generic flat nodes with middleware resources provided via CVMFS, we gain the flexibility to subsume different use cases in a homogeneous environment. ...
Traditionally, the RHIC/ATLAS Computing Facility (RACF) at Brookhaven National Laboratory has only maintained High Throughput Computing (HTC) resources for our HEP/NP user community. We've been using HTCondor as our batch system for many years, as this software is particularly well suited for managing HTC processor farm resources. Recently, the RACF has also begun to design/administrate some...
In order to estimate the capabilities of a computing slot with limited processing time, it is necessary to know its “power” with rather good precision. This allows, for example, a pilot job to match a task for which the required CPU work is known, or to define the number of events to be processed knowing the CPU work per event. Otherwise one always runs the risk that the task is aborted because...
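A worked example of this arithmetic, with illustrative units and an assumed safety margin:

```python
# Worked example of the arithmetic described above: from the remaining time
# of a slot, its benchmarked power and the CPU work needed per event, derive
# how many events can safely be processed. Units and the safety margin are
# illustrative choices, not a prescription from the contribution.
def events_that_fit(time_left_s, slot_power, work_per_event, margin=0.8):
    """
    time_left_s    : remaining wall-clock time of the slot, in seconds
    slot_power     : benchmark units of CPU work delivered per second
    work_per_event : benchmark units of CPU work needed for one event
    margin         : safety margin so the job is not killed at the limit
    """
    total_work = time_left_s * slot_power * margin
    return int(total_work // work_per_event)

# e.g. a 24 h slot of power 10, with 250 work-units per event:
# events_that_fit(24 * 3600, 10, 250) -> 2764
```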