10–14 Oct 2016
San Francisco Marriott Marquis
America/Los_Angeles timezone

Session

Track 6: Infrastructures

T6
10 Oct 2016, 11:00
Sierra B (San Francisco Marriott Marquis)

Conveners

Track 6: Infrastructures: 6.1

  • Catherine Biscarat (LPSC Grenoble, IN2P3/CNRS)

Track 6: Infrastructures: 6.2

  • Olof Barring (CERN)

Track 6: Infrastructures: 6.3

  • Francesco Prelz (Università degli Studi e INFN Milano (IT))

Track 6: Infrastructures: 6.4

  • Francesco Prelz (Università degli Studi e INFN Milano (IT))

Track 6: Infrastructures: 6.5

  • Olof Barring (CERN)

Track 6: Infrastructures: 6.6

  • Catherine Biscarat (LPSC Grenoble, IN2P3/CNRS)

Track 6: Infrastructures: 6.7

  • Catherine Biscarat (LPSC Grenoble, IN2P3/CNRS)

  1. Francesco Giovanni Sciacca (Universitaet Bern (CH))
    10/10/2016, 11:00
    Track 6: Infrastructures
    Oral

    Access to and exploitation of large-scale computing resources, such as those offered by general-purpose HPC centres, is one important measure for ATLAS and the other Large Hadron Collider experiments to meet the challenge posed by the full exploitation of future data within the constraints of flat budgets. We report on the effort of moving the Swiss WLCG T2 computing, serving ATLAS, CMS...

  2. Andrej Filipcic (Jozef Stefan Institute (SI))
    10/10/2016, 11:15
    Track 6: Infrastructures
    Oral

    Fifteen Chinese High Performance Computing sites, many of them on the TOP500 list of the most powerful supercomputers, are integrated into a common infrastructure providing coherent access to users through a RESTful interface called SCEAPI. These resources have been integrated into the ATLAS Grid production system using a bridge between ATLAS and SCEAPI which translates the...

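    A minimal sketch of the kind of submission call such an ATLAS-to-SCEAPI bridge might issue, in Python; the endpoint, payload fields and token scheme below are illustrative assumptions, not the actual SCEAPI specification:

        import requests

        SCEAPI_URL = "https://sceapi.example.cn/v1/jobs"  # hypothetical endpoint

        def submit_job(token, job_description):
            """POST one job description, translated from the ATLAS side, to SCEAPI."""
            resp = requests.post(
                SCEAPI_URL,
                json=job_description,                      # assumed JSON job payload
                headers={"Authorization": "Bearer " + token},
                timeout=30,
            )
            resp.raise_for_status()
            return resp.json().get("jobId")                # hypothetical response field
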
  3. Stefano Bagnasco (I.N.F.N. TORINO)
    10/10/2016, 11:30
    Track 6: Infrastructures
    Oral

    Obtaining CPU cycles on an HPC cluster is nowadays relatively simple, and sometimes even cheap, for academic institutions. However, in most cases providers of HPC services do not allow changes to the configuration, implementation of special features, or lower-level control of the computing infrastructure and networks, for example for testing new computing patterns or conducting...

  4. Dr Bo Jayatilaka (Fermi National Accelerator Lab. (US))
    10/10/2016, 11:45
    Track 6: Infrastructures
    Oral

    The Open Science Grid (OSG) is a large, robust computing grid that started primarily as a collection of sites associated with large HEP experiments such as ATLAS, CDF, CMS, and DZero, but has evolved in recent years to a much larger user and resource platform. In addition to meeting the US LHC community’s computational needs, the OSG continues to be one of the largest providers of distributed...

  5. Johannes Lehrbach (Johann-Wolfgang-Goethe Univ. (DE))
    10/10/2016, 12:00
    Track 6: Infrastructures
    Oral

    ALICE HLT Cluster operation during ALICE Run 2 (Johannes Lehrbach for the ALICE collaboration)

    ALICE (A Large Ion Collider Experiment) is one of the four major detectors located at the LHC at CERN, focusing on the study of heavy-ion collisions. The ALICE High Level Trigger (HLT) is a compute cluster which reconstructs the events and compresses the data in real time. The data compression...

  6. Marc Dobson (CERN)
    10/10/2016, 12:15
    Track 6: Infrastructures
    Oral

    Over the past years an increasing number of CMS computing resources have been offered as clouds, bringing the flexibility of virtualised compute resources and centralised management of the Virtual Machines (VMs). CMS has adapted its job submission infrastructure from a traditional Grid site to operation using a cloud service and can meanwhile run all types of offline workflows. The cloud...

  7. Tony Wong (Brookhaven National Laboratory)
    10/10/2016, 14:00
    Track 6: Infrastructures
    Oral

    Brookhaven National Laboratory (BNL) anticipates significant growth in scientific programs with large computing and data storage needs in the near future and has recently re-organized support for scientific computing to meet these needs. A key component is the enhanced role of the RHIC-ATLAS Computing Facility (RACF) in support of high-throughput and high-performance computing (HTC and HPC) ...

  8. Maarten Litmaath (CERN)
    10/10/2016, 14:15
    Track 6: Infrastructures
    Oral

    The Worldwide LHC Computing Grid (WLCG) infrastructure allows the use of resources from more than 150 sites. Until recently the setup of the resources and the middleware at a site were typically dictated by the partner grid project (EGI, OSG, NorduGrid) to which the site is affiliated. In recent years, however, changes in hardware, software, funding and experiment computing requirements have...

  9. Pier Paolo Ricci (INFN CNAF)
    10/10/2016, 14:30
    Track 6: Infrastructures
    Oral

    The INFN CNAF Tier-1 computing center comprises two main rooms containing IT resources and four additional locations hosting the technology infrastructure that provides electrical power and cooling to the facility. Power supply and continuity are ensured by a dedicated room with three 15,000 V to 400 V transformers in a separate part of the main building...

  10. Jakub Moscicki (CERN)
    10/10/2016, 14:45
    Track 6: Infrastructures
    Oral

    1. Statement

    OpenCloudMesh has a very simple goal: to be an open and vendor-agnostic standard for private cloud interoperability. To address the YetAnotherDataSilo problem, a working group under the umbrella of the GÉANT Association has been created with the goal of ensuring neutrality and a clear context for this project.

    All leading partners of the OpenCloudMesh project - GÉANT,...

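    A hedged sketch of what creating a federated share between two providers could look like, loosely modeled on the draft OpenCloudMesh REST API; the endpoint and field names are assumptions and may differ from the final specification:

        import requests

        OCM_ENDPOINT = "https://cloud.example.org/ocm/shares"   # receiving provider (assumed)

        share = {
            "shareWith": "alice@cloud.example.org",              # recipient on the remote cloud
            "name": "dataset.tar",
            "providerId": "42",
            "owner": "bob@cloud.example.net",
            "protocol": {"name": "webdav",
                         "options": {"sharedSecret": "..."}},    # access details for the data
        }

        resp = requests.post(OCM_ENDPOINT, json=share, timeout=30)
        resp.raise_for_status()
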
  11. Luca dell'Agnello (INFN-CNAF)
    10/10/2016, 15:00
    Track 6: Infrastructures
    Oral

    The Tier-1 at CNAF is the main INFN computing facility, offering computing and storage resources to more than 30 different scientific collaborations, including the 4 experiments at the LHC. A huge increase in computing needs is also foreseen in the following years, mainly driven by the experiments at the LHC (especially starting with Run 3 from 2021) but also by other upcoming experiments...

  12. Andreas Heiss (KIT - Karlsruhe Institute of Technology (DE))
    10/10/2016, 15:15
    Track 6: Infrastructures
    Oral

    The WLCG Tier-1 center GridKa is developed and operated by the Steinbuch Centre for Computing (SCC) at the Karlsruhe Institute of Technology (KIT). It was the origin of further Big Data research activities and infrastructures at SCC, e.g. the Large Scale Data Facility (LSDF), providing petabyte-scale data storage for various non-HEP research communities. Several ideas and plans...

  13. Koichi Murakami
    10/10/2016, 15:30
    Track 6: Infrastructures
    Oral

    The KEK central computer system (KEKCC) supports various activities at KEK, such as the Belle / Belle II and J-PARC experiments. The system is now being replaced and will be put into production in September 2016. The computing resources, CPU and storage, of the next system are much enhanced to match the recent increase in demand for computing resources. We will have 10,000 CPU cores, 13 PB of disk storage,...

  14. Andrew David Lahiff (STFC - Rutherford Appleton Lab. (GB))
    10/10/2016, 15:45
    Track 6: Infrastructures
    Oral

    At the RAL Tier-1 we have been deploying production services on both bare metal and a variety of virtualisation platforms for many years. Despite the significant simplification of configuration and deployment of services due to the use of a configuration management system, maintaining services still requires a lot of effort. Also, the current approach of running services on static machines...

  15. Azher Mughal (California Institute of Technology (US))
    11/10/2016, 11:15
    Track 6: Infrastructures
    Oral

    The HEP prototypical systems at the Supercomputing conferences each year have served to illustrate the ongoing state of the art developments in high throughput, software-defined networked systems important for future data operations at the LHC and for other data intensive programs. The Supercomputing 2015 SDN demonstration revolved around an OpenFlow ring connecting 7 different booths and the...

  16. Shawn Mc Kee (University of Michigan (US))
    11/10/2016, 11:30
    Track 6: Infrastructures
    Oral

    In today's world of distributed scientific collaborations, there are many challenges to providing reliable inter-domain network infrastructure. Network operators use a combination of active monitoring and trouble tickets to detect problems, but these are often ineffective at identifying issues that impact wide-area network users. Additionally, these approaches do not scale to wide area...

  17. Robert Quick (Indiana University)
    11/10/2016, 11:45
    Track 6: Infrastructures
    Oral

    The Open Science Grid (OSG) relies upon the network as a critical part of the distributed infrastructures it enables. In 2012 OSG added a new focus area in networking with a goal of becoming the primary source of network information for its members and collaborators. This includes gathering, organizing and providing network metrics to guarantee effective network usage and prompt detection and...

  18. Alastair Dewhurst (STFC - Rutherford Appleton Lab. (GB))
    11/10/2016, 12:00
    Track 6: Infrastructures
    Oral

    The fraction of internet traffic carried over IPv6 continues to grow rapidly. IPv6 support from network hardware vendors and carriers is pervasive and becoming mature. A network infrastructure upgrade often offers sites an excellent window of opportunity to configure and enable IPv6.

    There is a significant overhead when setting up and maintaining dual stack machines, so where possible...

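    As a small illustration of dual-stack client behaviour, a Python sketch that tries IPv6 first and falls back to IPv4 (simplified; production code would interleave attempts in the style of Happy Eyeballs):

        import socket

        def connect_prefer_ipv6(host, port):
            """Connect to host:port, trying IPv6 addresses before IPv4."""
            infos = socket.getaddrinfo(host, port,
                                       socket.AF_UNSPEC, socket.SOCK_STREAM)
            # Try AF_INET6 results before AF_INET ones.
            infos.sort(key=lambda info: 0 if info[0] == socket.AF_INET6 else 1)
            last_err = None
            for family, socktype, proto, _, addr in infos:
                try:
                    sock = socket.socket(family, socktype, proto)
                    sock.connect(addr)
                    return sock
                except OSError as err:
                    last_err = err
            raise last_err
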
  19. Vincent Ducret (CERN)
    11/10/2016, 12:15
    Track 6: Infrastructures
    Oral

    Over the last few years, the number of mobile devices connected to the CERN internal network has increased from a handful in 2006 to more than 10,000 in 2015. Wireless access is no longer a “nice to have” or just for conference and meeting rooms; support for mobility is now expected by most, if not all, of the CERN community. In this context, a full renewal of the CERN Wi-Fi network has been...

  20. Simaolhoda Baymani (CERN)
    11/10/2016, 14:00
    Track 6: Infrastructures
    Oral

    RapidIO (http://rapidio.org/) technology is a packet-switched high-performance fabric which has been under active development since 1997. Originally meant to be a front-side bus, it developed into a system-level interconnect which is today used in all 4G/LTE base stations worldwide. RapidIO is often used in embedded systems that require high reliability, low latency and scalability in a...

  21. Jorn Schumacher (University of Paderborn (DE))
    11/10/2016, 14:15
    Track 6: Infrastructures
    Oral

    HPC network technologies like Infiniband, TrueScale or OmniPath provide low-latency and high-throughput communication between hosts, which makes them attractive options for data-acquisition systems in large-scale high-energy physics experiments. Like HPC networks, data-acquisition networks are local and include a well-specified number of systems. Unfortunately traditional...

  22. Dr Wahid Bhimji (Lawrence Berkeley National Lab. (US))
    11/10/2016, 14:30
    Track 6: Infrastructures
    Oral

    In recent years there has been increasing use of HPC facilities for HEP experiments. This has initially focussed on less I/O intensive workloads such as generator-level or detector simulation. We now demonstrate the efficient running of I/O-heavy ‘analysis’ workloads for the ATLAS and ALICE collaborations on HPC facilities at NERSC, as well as astronomical image analysis for DESI.

    To do...

  23. Jinghui Zhang (Southeast University (CN))
    11/10/2016, 14:45
    Track 6: Infrastructures
    Oral

    Southeast University Science Operation Center (SEUSOC) is one of the computing centers of the Alpha Magnetic Spectrometer (AMS-02) experiment. It provides 2000 CPU cores for AMS scientific computing and a dedicated 1 Gbps Long Fat Network (LFN) for AMS data transmission between SEU and CERN. In this paper, the workflows of SEUSOC Monte Carlo (MC) production are discussed in...

  24. Piero Vicini (Universita e INFN, Roma I (IT))
    11/10/2016, 15:00
    Track 6: Infrastructures
    Oral

    With processor architecture evolution, the HPC market has undergone a paradigm shift. The adoption of low-cost, Linux-based clusters extended HPC’s reach from its roots in modeling and simulation of complex physical systems to a broad range of industries, from biotechnology, cloud computing, computer analytics and big data challenges to manufacturing sectors. In this perspective, the near...

  25. Alexandr Zaytsev (Brookhaven National Laboratory (US))
    11/10/2016, 15:15
    Track 6: Infrastructures
    Oral

    This contribution gives a report on the remote evaluation of pre-production Intel Omni-Path (OPA) interconnect hardware and software performed by the RHIC & ATLAS Computing Facility (RACF) at BNL in the Dec 2015 - Feb 2016 period, using a 32-node “Diamond” cluster with a single Omni-Path Host Fabric Interface (HFI) installed on each node and a single 48-port Omni-Path switch with the non-blocking...

  26. Max Fischer (KIT - Karlsruhe Institute of Technology (DE))
    12/10/2016, 11:15
    Track 6: Infrastructures
    Oral

    Over the past several years, rapid growth of data has affected many fields of science. This has often resulted in the need for overhauling or exchanging the tools and approaches in the disciplines’ data life cycles, allowing the application of new data analysis methods and facilitating improved data sharing.

    The project Large-Scale Data Management and Analysis (LSDMA) of the German Helmholtz...

  27. Enric Tejedor Saavedra (CERN)
    12/10/2016, 11:30
    Track 6: Infrastructures
    Oral

    SWAN is a novel service to perform interactive data analysis in the cloud. SWAN allows users to write and run their data analyses with only a web browser, leveraging the widely-adopted Jupyter notebook interface. The user code, executions and data live entirely in the cloud. SWAN makes it easier to produce and share results and scientific code, access scientific software, produce tutorials and...

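    For illustration, a few lines a SWAN user might run in a notebook cell, assuming ROOT is available in the session's software stack (the exact environment is configurable):

        import ROOT

        # Fill a histogram with Gaussian random numbers and draw it;
        # in a notebook the canvas renders inline.
        h = ROOT.TH1F("h", "Example;x;entries", 100, -3, 3)
        rng = ROOT.TRandom3(0)
        for _ in range(10000):
            h.Fill(rng.Gaus(0, 1))

        c = ROOT.TCanvas()
        h.Draw()
        c.Draw()
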
  28. Diego MICHELOTTO (INFN - CNAF)
    12/10/2016, 11:45
    Track 6: Infrastructures
    Oral

    Open City Platform (OCP) is an industrial research project funded by the Italian Ministry of University and Research, started in 2014. It intends to research, develop and test new technological solutions that are open, interoperable and usable on demand in the field of cloud computing, along with new sustainable organizational models for the public administration, to innovate, with scientific results,...

  29. Dario Berzano (CERN)
    12/10/2016, 12:00
    Track 6: Infrastructures
    Oral

    Apache Mesos is a resource management system for large data centres, initially developed by UC Berkeley and now maintained under the Apache Foundation umbrella. It is widely used in industry by companies like Apple, Twitter and Airbnb, and is known to scale to tens of thousands of nodes. Together with other tools of its ecosystem, like Mesosphere Marathon or Chronos, it provides an end-to-end...

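    To make the ecosystem concrete, a minimal sketch of launching a long-running task through Marathon's REST API; the Marathon URL is an assumption, while the /v2/apps endpoint and fields follow Marathon's documented app format:

        import requests

        MARATHON = "http://marathon.example.org:8080"   # assumed Marathon master

        app = {
            "id": "/demo/sleeper",   # application id in Marathon's namespace
            "cmd": "sleep 3600",     # command run in each task instance
            "cpus": 0.1,
            "mem": 64,               # MB per instance
            "instances": 2,
        }

        resp = requests.post(MARATHON + "/v2/apps", json=app, timeout=30)
        resp.raise_for_status()
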
  30. Enrico Mazzoni (INFN-Pisa)
    12/10/2016, 12:15
    Track 6: Infrastructures
    Oral

    Clouds and virtualization are typically used in computing centers to satisfy diverse needs: different operating systems, software releases or fast delivery of servers/services. On the other hand, solutions relying on Linux kernel capabilities, such as Docker, are well suited for application isolation and software development. In our previous work (Docker experience at INFN-Pisa Grid Data Center*) we...

  31. Lisa Gerhardt (LBNL)
    12/10/2016, 12:30
    Track 6: Infrastructures
    Oral

    Bringing HEP computing to HPC can be difficult. Software stacks are often very complicated with numerous dependencies that are difficult to get installed on an HPC system. To address this issue, amongst others, NERSC has created Shifter, a framework that delivers Docker-like functionality to HPC. It works by extracting images from native formats (such as a Docker image) and converting them to...

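    A hedged sketch of driving Shifter from a job script in Python; the command names follow NERSC's public Shifter documentation, but treat the exact flags and the image name as assumptions:

        import subprocess

        IMAGE = "docker:python:2.7"   # illustrative image name

        # Pull the Docker image and convert it to Shifter's format on the system...
        subprocess.check_call(["shifterimg", "pull", IMAGE])

        # ...then run a command inside the converted image.
        subprocess.check_call(["shifter", "--image=" + IMAGE,
                               "python", "--version"])
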
  32. Dr Leng Tau (Supermicro)
    12/10/2016, 12:45
    Track 6: Infrastructures
    Oral

    COTS HPC has evolved for two decades to become an undeniable mainstream computing solution. It represents a major shift away from yesterday’s proprietary, vector-based processors and architectures to modern supercomputing clusters built on open industry standard hardware. This shift enabled the Industry with a cost-effective path to high-performance, scalable and flexible supercomputers (from...

  33. Cristovao Cordeiro (CERN)
    13/10/2016, 11:00
    Track 6: Infrastructures
    Oral

    With the imminent upgrades to the LHC and the consequent increase of the amount and complexity of data collected by the experiments, CERN's computing infrastructures will be facing a large and challenging demand of computing resources. Within this scope, the adoption of cloud computing at CERN has been evaluated and has opened the doors for procuring external cloud services from providers,...

  34. Patrick Fuhrmann (Deutsches Elektronen-Synchrotron (DE))
    13/10/2016, 11:15
    Track 6: Infrastructures
    Oral

    INDIGO-DataCloud (INDIGO for short, https://www.indigo-datacloud.eu) is a project started in April 2015, funded under the EC Horizon 2020 framework program. It includes 26 European partners located in 11 countries and addresses the challenge of developing open source software, deployable in the form of a data/computing platform, aimed to scientific communities and designed to be deployed on...

  35. Ricardo Brito Da Rocha (CERN)
    13/10/2016, 11:30
    Track 6: Infrastructures
    Oral

    The INDIGO-DataCloud project's ultimate goal is to provide a sustainable European software infrastructure for science, spanning multiple computer centers and existing public clouds.
    The participating sites form a set of heterogeneous infrastructures, some running OpenNebula, some running OpenStack. There was the need to find a common denominator for the deployment of both the required PaaS...

  36. Wenjing Wu (Computer Center, IHEP, CAS)
    13/10/2016, 11:45
    Track 6: Infrastructures
    Oral

    JUNO (Jiangmen Underground Neutrino Observatory) is a multi-purpose neutrino experiment designed to measure the neutrino mass hierarchy and mixing parameters. JUNO is expected to be in operation in 2019 with a 2 PB/year raw data rate. The IHEP computing center plans to build up a virtualization infrastructure to manage computing resources in the coming years, and JUNO has been selected as one of the...

  37. Daniela Bauer (Imperial College Sci., Tech. & Med. (GB))
    13/10/2016, 12:00
    Track 6: Infrastructures
    Oral

    When first looking at converting a part of our site’s grid infrastructure into a cloud based system in late 2013 we needed to ensure the continued accessibility of all of our resources during a potentially lengthy transition period.
    Moving a limited number of nodes to the cloud proved ineffective as users expected a significant number of cloud resources to be available to justify the effort...

  38. David Yu (Brookhaven National Laboratory (US))
    13/10/2016, 12:15
    Track 6: Infrastructures
    Oral

    Randomly restoring files from tape degrades read performance, primarily due to frequent tape mounts. The high latency of time-consuming tape mounts and dismounts is a major issue when accessing massive amounts of data from tape storage. BNL's mass storage system currently holds more than 80 PB of data on tape, managed by HPSS. To restore files from HPSS, we make use of a scheduler...

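    The abstract does not spell out the scheduling policy, but the core idea of reducing tape mounts can be sketched as grouping requests by tape and reading each tape sequentially:

        from collections import defaultdict

        def order_restores(requests_):
            """requests_ is a list of (tape_id, position_on_tape, file_path).

            Group requests so each tape is mounted once, and read files in
            the order they sit on the tape to avoid seeking back and forth.
            """
            by_tape = defaultdict(list)
            for tape_id, position, path in requests_:
                by_tape[tape_id].append((position, path))
            schedule = []
            for tape_id, entries in sorted(by_tape.items()):
                for position, path in sorted(entries):
                    schedule.append((tape_id, position, path))
            return schedule
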
  39. Jeffrey Michael Dost (Univ. of California San Diego (US))
    13/10/2016, 14:00
    Track 6: Infrastructures
    Oral

    The Pacific Research Platform is an initiative to interconnect Science DMZs between campuses across the West Coast of the United States over a 100 Gbps network. The LHC @ UC is a proof-of-concept pilot project that focuses on interconnecting 6 University of California campuses. It is spearheaded by computing specialists from the UCSD Tier 2 Center in collaboration with the San Diego...

  40. Christoph Paus (Massachusetts Inst. of Technology (US))
    13/10/2016, 14:15
    Track 6: Infrastructures
    Oral

    We describe the development and deployment of a distributed campus computing infrastructure consisting of a single job submission portal linked to multiple local campus resources, as well as the wider computational fabric of the Open Science Grid (OSG). Campus resources consist of existing OSG-enabled clusters and clusters with no previous interface to the OSG. Users accessing the single...

  41. Sang Un Ahn (KiSTi Korea Institute of Science & Technology Information (KR))
    13/10/2016, 14:30
    Track 6: Infrastructures
    Oral

    The Global Science experimental Data hub Center (GSDC) at the Korea Institute of Science and Technology Information (KISTI), located in Daejeon, South Korea, is the unique data center in the country that supports, with its computing resources, fundamental research fields dealing with large-scale data. For historical reasons it has run the Torque batch system, while recently it has started running HTCondor for...

  42. Andreas Gellrich (DESY)
    13/10/2016, 14:45
    Track 6: Infrastructures
    Oral

    We present the consolidated batch system at DESY. As one of the largest resource centres DESY has to support differing work flows by HEP experiments in WLCG or Belle II as well as local users. By abandoning specific worker node setups in favour of generic flat nodes with middleware resources provided via CVMFS, we gain flexibility to subsume different use cases in a homogeneous environment. ...

  43. Christopher Hollowell (Brookhaven National Laboratory)
    13/10/2016, 15:00
    Track 6: Infrastructures
    Oral

    Traditionally, the RHIC/ATLAS Computing Facility (RACF) at Brookhaven National Laboratory has only maintained High Throughput Computing (HTC) resources for our HEP/NP user community. We've been using HTCondor as our batch system for many years, as this software is particularly well suited for managing HTC processor farm resources. Recently, the RACF has also begun to design/administrate some...

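    For reference, a minimal job submission with HTCondor's Python bindings, in the style current at the time; the executable and resource values are placeholders:

        import htcondor

        sub = htcondor.Submit({
            "executable": "/bin/sleep",
            "arguments": "60",
            "request_cpus": "1",
            "request_memory": "128MB",
        })

        schedd = htcondor.Schedd()                 # local schedd
        with schedd.transaction() as txn:
            cluster_id = sub.queue(txn)            # returns the new cluster id
        print("submitted cluster", cluster_id)
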
  44. Philippe Charpentier (CERN)
    13/10/2016, 15:15
    Track 6: Infrastructures
    Oral

    In order to estimate the capabilities of a computing slot with limited processing time, it is necessary to know its “power” with rather good precision. This allows, for example, a pilot job to match a task for which the required CPU work is known, or to define the number of events to be processed knowing the CPU work per event. Otherwise one always runs the risk that the task is aborted because...

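    The arithmetic behind such a match can be sketched in a few lines; the 10% safety margin and the HS06 benchmark unit are illustrative choices, not values from the contribution:

        import math

        def max_events(slot_seconds, slot_power, work_per_event, safety=0.10):
            """Events that fit in a slot of known duration and benchmark power.

            slot_seconds   -- wall-clock limit of the slot
            slot_power     -- benchmark power of the slot (e.g. HS06 per core)
            work_per_event -- CPU work per event, in benchmark-unit x seconds
            safety         -- fraction of the budget reserved against overruns
            """
            budget = slot_seconds * slot_power * (1.0 - safety)
            return int(math.floor(budget / work_per_event))

        # A 48 h slot of power 10 HS06, with 600 HS06.s per event:
        # max_events(48 * 3600, 10.0, 600.0) -> 2592 events
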