HEPiX Autumn 2019 Workshop

Europe/Amsterdam
Turingzaal (Amsterdam Science Park Congress Centre)

Science Park 123 1098 XG Amsterdam The Netherlands 52°21'23"N, 4°57'7"E
Helge Meinhard (CERN), Tony Wong (Brookhaven National Laboratory)
Description

HEPiX Autumn 2019 at Nikhef, Amsterdam, The Netherlands

The HEPiX forum brings together worldwide Information Technology staff, including system administrators, system engineers, and managers from High Energy Physics and Nuclear Physics laboratories and institutes, to foster a learning and sharing experience between sites facing scientific computing and data challenges.

Participating sites include BNL, CERN, DESY, FNAL, IHEP, IN2P3, INFN, IRFU, JLAB, KEK, LBNL, NDGF, NIKHEF, PIC, RAL, SLAC, TRIUMF, many other research labs and numerous universities from all over the world.

The workshop will be hosted by Nikhef, the Dutch National Institute for Subatomic Physics (formerly written NIKHEF), located at Science Park in Amsterdam, The Netherlands. It will take place at the neighbouring Amsterdam Science Park Congress Centre.

Co-located events

Although not part of the HEPiX workshop, two co-located events have been organised: a workshop on ARC on Friday 18 October in the afternoon, and a workshop of the WLCG Security Operations Centre working group from Monday 21 to Wednesday 23 October.

Videoconference Rooms

HEPiX_Workshop: Vidyo virtual room for the HEPiX workshops (extension 10637013, owner Dennis Van Dok)

    • 08:00 09:00
      Registration 1h Turingzaal

    • 09:00 09:30
      Miscellaneous Turingzaal

      Conveners: Helge Meinhard (CERN), Tony Wong (Brookhaven National Laboratory)
    • 09:30 10:30
      Site Reports Turingzaal

      Conveners: Michele Michelotto (Università e INFN, Padova (IT)), Dr Sebastien Gadrat (CCIN2P3 - Centre de Calcul (FR))
    • 10:30 11:00
      Coffee break 30m Eulerzaal

    • 11:00 12:30
      Site Reports Turingzaal

      Conveners: Dr Sebastien Gadrat (CCIN2P3 - Centre de Calcul (FR)), Michele Michelotto (Università e INFN, Padova (IT))
    • 12:30 14:00
      Lunch break 1h 30m Turingzaal

    • 14:00 15:15
      End-user Services, Operating Systems Turingzaal

      Conveners: Andreas Haupt (Deutsches Elektronen-Synchrotron (DESY)), Georg Rath (Lawrence Berkeley National Laboratory)
      • 14:00
        CERN Fixed Telephony Service Development 25m

        The MALT project is a unique opportunity to consolidate all the current CERN telephony services (commercial PBX-based analogue/IP and proprietary IP) into a single cost-effective IP-based service, built on top of existing open source components and local developments, adapted to CERN users' needs, well integrated into the local environment and truly multiplatform. This presentation describes the main technical aspects of this project as well as the features offered by this new service as its pilot phase begins.

        Speaker: Thomas Baron (CERN)
      • 14:25
        Unified Home folders migration or chronicles of a complex data migration 25m

        Moving users’ data is never an easy process. When you migrate to a different file system and want to meet users’ needs on Windows, Linux and Mac for more than 15 000 accounts, the magic recipe becomes as difficult as the one for making perfect macaroons!
        In this presentation, you will learn key facts about how we handle this complex data migration.

        Speaker: Vincent Nicolas Bippus (CERN)
      • 14:50
        Challenges and opportunities when migrating CERN e-mail system to open source 25m

        The e-mail service is considered a critical collaboration system. I will share our experience at CERN with the technical and organizational challenges of migrating 40 000 mailboxes from Microsoft Exchange to a free and open source software solution: Kopano.

        Speaker: Thomas Baron (CERN)
    • 15:15 15:45
      Coffee break 30m Turingzaal

    • 15:45 16:10
      End-user Services, Operating Systems Turingzaal

      Conveners: Georg Rath (Lawrence Berkeley National Laboratory), Andreas Haupt (Deutsches Elektronen-Synchrotron (DESY))
      • 15:45
        Self-service web hosting made easy with containers and Kubernetes Operators 25m

        CERN Web Services are in the process of consolidating web site and web application hosting services using container orchestration.
        The Kubernetes Operator pattern has gained a lot of traction recently. It applies Kubernetes principles to custom applications.
        I will present how we leverage the Operator pattern in container-based web hosting services to automate the provisioning and management of web sites, web applications and the container infrastructure itself.

        Speaker: Alexandre Lossent (CERN)
    • 16:30 19:30
      Bus Transfer, Heineken Experience, Welcome Reception 3h
    • 08:30 09:00
      Registration 30m Turingzaal

    • 09:00 10:45
      Site Reports Turingzaal

      Conveners: Michele Michelotto (Università e INFN, Padova (IT)), Dr Sebastien Gadrat (CCIN2P3 - Centre de Calcul (FR))
      • 09:00
        GSI site report 15m

        News from GSI IT

        Speaker: Mr Christopher Huhn
      • 09:15
        AGLT2 Site Report Fall 2019 15m

        We will present an update on our site since the Spring 2019 report, covering our changes in software, tools and operations.

        Some of the details to cover include our use of backfilling jobs via BOINC with cgroups, work with our ELK stack at AGLT2, updates on Bro/MISP at the UM site and information about our newest hardware purchases and deployed middleware.

        We conclude with a summary of what has worked and what problems we encountered and indicate directions for future work.

        Speaker: Shawn Mc Kee (University of Michigan (US))
      • 09:30
        Short presentation of Strasbourg's IN2P3/CNRS T2 15m

        A short, standard site presentation so that people can get to know us a bit, as we will be hosting the autumn 2020 HEPiX meeting.
        If there is no room left in the schedule, that is fine; a slot of just 2-5 minutes would be fine too.

        Speaker: Yannick Patois (Centre National de la Recherche Scientifique (FR))
      • 09:45
        KEK Site Report 15m

        Updates since the last HEPiX workshop on the KEK projects, including SuperKEKB and J-PARC, as well as on the KEK Computing Research Center, will be presented.

        Speaker: Tomoaki Nakamura (High Energy Accelerator Research Organization (JP))
      • 10:00
        NERSC Site Report 15m

        This will be a quick update on what is happening at NERSC.

        Speaker: Cary Whitney (LBNL)
      • 10:15
        IHEP Site Status 15m

        The IHEP computing center has been supporting several HEP experiments for many years. LHCb is the new experiment we started supporting this year. We have just upgraded AFS, HTCondor and EOS at IHEP. The presentation covers the current status and future plans.

        Speaker: Jingyan Shi (IHEP)
      • 10:30
        INFN-T1 Site report 15m

        An update on what's going on at the INFN-T1 site.

        Speaker: Mr Andrea Chierici (Universita e INFN, Bologna (IT))
    • 10:45 11:15
      Coffee break 30m Turingzaal

    • 11:15 12:05
      End-user Services, Operating Systems Turingzaal

      Conveners: Georg Rath (Lawrence Berkeley National Laboratory), Andreas Haupt (Deutsches Elektronen-Synchrotron (DESY))
      • 11:15
        Zero-touch Windows 10 migration: dream or reality? 25m

        In the past, migrating from one Windows version to the latest one needed a full reinstallation of every single workstation, with all the inconveniences this represents for both users and IT staff.
        For Windows 10, Microsoft claimed that the in-place upgrade works fine. How true is this statement?
        This presentation will cover real-life feedback.

        Speaker: Vincent Nicolas Bippus (CERN)
      • 11:40
        CERN Linux services status update 25m

        An update on the Linux distributions and services supported at CERN.
        An update on the CentOS community and CERN involvement will be given.
        We will discuss updates from the Software Collections, Virtualization and OpenStack SIGs.
        Future plans regarding alternative architectures (ARM for SoCs, etc.) and CentOS 8 will also be covered.

        Speaker: Daniel Abad (CERN)
    • 12:05 12:30
      Computing and Batch Services Turingzaal

      Conveners: Manfred Alef (Karlsruhe Institute of Technology (KIT)), Michel Jouvin (Centre National de la Recherche Scientifique (FR))
      • 12:05
        Report from HTCondor workshop in September 25m

        The successful series of HTCondor workshops in Europe, which started in 2014, continued in 2019 with a workshop held from 24 to 27 September at the European Commission's Joint Research Centre in Ispra, Lombardy, Italy. We will give a short report of this workshop.

        Speaker: Helge Meinhard (CERN)
    • 12:30 12:40
      Photo session 10m Turingzaal

    • 12:40 14:00
      Lunch break 1h 20m Turingzaal

    • 14:00 15:40
      Computing and Batch Services Turingzaal

      Conveners: Manfred Alef (Karlsruhe Institute of Technology (KIT)), Michel Jouvin (Centre National de la Recherche Scientifique (FR))
      • 14:00
        Running an HTC cluster with fully containerised jobs using HTCondor, Singularity, CephFS and CVMFS 25m

        In this talk we present an HTC cluster which has been set up
        at Bonn University in 2017/2018. On this fully-puppetised cluster all jobs
        are run inside Singularity containers. Job management is handled
        by HTCondor which nicely shields the container setup from the users.
        The users only have to choose the desired OS via a job parameter from an
        offered collection of container images. The container images along with
        various software packages are provided by a CernVM filesystem (CVMFS).
        The data to be analysed is stored on a CephFS file system.

        The presentation describes how the various components are set up and
        provides some operational experience with this cluster.

        Speaker: Peter Wienemann (University of Bonn (DE))
      • 14:25
        What's new in HTCondor? What's coming up? 25m

        The goal of the HTCondor team is to develop, implement, deploy, and evaluate mechanisms and policies that support High Throughput Computing (HTC) on large collections of distributively owned computing resources. Increasingly, the work performed by the HTCondor developers is being driven by its partnership with the High Energy Physics (HEP) community.

        This talk will present recent changes and enhancements to HTCondor, including details on some of the enhancements created in recent releases, changes created on behalf of the HEP community, and the upcoming HTCondor development roadmap. We seek to solicit feedback on the roadmap from HEPiX attendees.

        Speaker: Todd Tannenbaum (University of Wisconsin Madison (US))
      • 14:50
        DESY Implementation and Usage of the HTCondor Batch System 25m

        The talk provides an overview of the DESY configurations for HTCondor. It
        focuses on features we need for user registry integration, node
        maintenance operations and fair share / quota handling. We are working on
        Docker, Jupyter and GPU integration into our smooth and transparent
        operating model setup.

        Speakers: Thomas Finnern (DESY), Christoph Beyer (DESY)
      • 15:15
        Testing the limits of transfer using HTCondor 25m

        In this talk we will provide details about the scalability limits of the HTCondor transfer mechanism. We will show how it depends on latency and finish rate, and how it compares with pure HTTP transfer.

        Speaker: Edgar Fajardo Hernandez (Univ. of California San Diego (US))
    • 15:40 16:10
      Coffee break 30m Turingzaal

    • 16:10 17:25
      Computing and Batch Services Turingzaal

      Conveners: Michel Jouvin (Centre National de la Recherche Scientifique (FR)), Manfred Alef (Karlsruhe Institute of Technology (KIT))
      • 16:10
        Experience in Running BEIJING-LCG2 WLCG Tier 2 Grid Site 25m

        BEIJING-LCG2 is one of the WLCG Tier 2 grid sites. In this talk I will describe how to run a Tier 2 grid site, including deployment, configuration, monitoring, security, troubleshooting, and VO support.

        Speaker: Xiaofei Yan (Institute of High Energy Physics)
      • 16:35
        HEPiX CPU Benchmarking WG: status update 25m

        The benchmarking and accounting of compute resources in WLCG needs to be revised in view of the adoption by the LHC experiments of heterogeneous computing resources based on x86 CPUs, GPUs and FPGAs.
        After evaluating several alternatives for the replacement of HS06, the HEPiX benchmarking WG has chosen to focus on the development of a HEP-specific suite based on actual software workloads of the LHC experiments, rather than on a standard industrial benchmark like the new SPEC CPU 2017 suite.

        This presentation will describe the motivation and implementation of this new benchmark suite, which is based on container technologies to ensure portability and reproducibility. This approach is designed to provide a better correlation between the new benchmark and the actual production workloads of the experiments. It also offers the possibility to separately explore and describe the independent architectural features of different computing resource types, which is expected to be increasingly important with the growing heterogeneity of the HEP computing landscape. In particular, an overview of the initial developments to address the benchmarking of non-traditional computing resources such as HPCs and GPUs will also be provided.

        Speakers: Domenico Giordano (CERN), Christopher Henry Hollowell (Brookhaven National Laboratory (US))
      • 17:00
        HEP Workload Benchmarks: Design/Development 25m

        In this presentation we'll discuss the design architecture of the HEP Workload benchmark containers, and the proposed replacement for HEPSPEC06, which is based on these containers. We'll also highlight the development efforts which have been completed thus far, and the tooling being used by the project. Finally, we'll detail our plan for extending the existing container benchmark suite to include support for GPU benchmarking.

        Speaker: Christopher Henry Hollowell (Brookhaven National Laboratory (US))
    • 17:45 19:15
      HEPiX Board Meeting 1h 30m N3.28 (Nikhef)

      N3.28

      Nikhef

      Science Park 105 1098 XG Amsterdam The Netherlands

      by invitation

    • 08:30 09:00
      Registration 30m Turingzaal

    • 09:00 10:40
      Networking and Security Turingzaal

      Conveners: David Kelsey (Science and Technology Facilities Council STFC (GB)), Shawn Mc Kee (University of Michigan (US))
      • 09:00
        WLCG/OSG Network Activities, Status and Plans 25m

        WLCG relies on the network as a critical part of its infrastructure and therefore needs to guarantee effective network usage and prompt detection and resolution of any network issues, including connection failures, congestion and traffic routing. The OSG Networking Area is a partner of the WLCG effort and is focused on being the primary source of networking information for its partners and constituents. We will report on the changes and updates that have occurred since the last HEPiX meeting.

        The primary areas to cover include the status of and plans for the WLCG/OSG perfSONAR infrastructure, the WLCG Throughput Working Group and the activities in the IRIS-HEP and SAND projects.

        Speaker: Marian Babik (CERN)
      • 09:25
        Analysing perfSONAR data with Elasticsearch 25m

        to be filled soon

        Speaker: Rolf Seuster (University of Victoria (CA))
      • 09:50
        Harnessing the power of threat intelligence for WLCG cybersecurity 25m

        The information security threats currently faced by WLCG sites are both sophisticated and highly profitable for the actors involved. Evidence suggests that targeted organisations take on average more than six months to detect a cyber attack, with more sophisticated attacks being more likely to pass undetected.

        An important way to mount an appropriate response is through the use of a Security Operations Centre (SOC). A SOC can provide detailed traceability information along with the capability to quickly detect malicious activity. The core building blocks of such a SOC are an Intrusion Detection System and a threat intelligence component, required to identify potential cybersecurity threats as part of a trusted community. The WLCG Security Operations Centre Working Group has produced a reference design for a minimally viable Security Operations Centre, applicable at a range of WLCG sites. In addition, another important factor in the sharing of threat intelligence is the formation of appropriate trust groups.

        We present the status and progress of the working group so far, including both a discussion of the reference SOC design and the approach of the working group to facilitating the collaboration necessary to form these groups, including both technological and social aspects. Threat intelligence and the formation of trust groups in our community will be the focus of the WLCG SOC WG workshop that will take place immediately following HEPiX, on 21-23 October 2019. We emphasise the importance of collaboration not only between WLCG sites, but also between grid and campus teams. This type of broad collaboration is essential given the nature of threats faced by the WLCG, which can often be a result of compromised campus resources.

        Speaker: David Crooks (Science and Technology Facilities Council STFC (GB))
      • 10:15
        Computer Security Update 25m

        This presentation provides an update on the global security landscape since the last HEPiX meeting. It describes the main vectors of risk and compromise in the academic community, including lessons learnt, and presents interesting recent attacks while providing recommendations on how best to protect ourselves. It also covers security risk management in general, as well as the security aspects of the current hot topics in computing and around computer security.

        This talk is based on contributions and input from the CERN Computer Security Team.

        Speaker: Liviu Valsan (CERN)
    • 10:40 11:10
      Coffee break 30m Turingzaal

    • 11:10 12:25
      Networking and Security Turingzaal

      Conveners: Shawn Mc Kee (University of Michigan (US)), David Kelsey (Science and Technology Facilities Council STFC (GB))
      • 11:10
        Network Functions Virtualisation Working Group Update 25m

        High Energy Physics (HEP) experiments have greatly benefited from a strong relationship with Research and Education (R&E) network providers and, thanks to projects such as LHCOPN/LHCONE and REN contributions, have enjoyed significant capacities and high performance networks for some time. RENs have been able to continually expand their capacities to over-provision the networks relative to the experiments' needs and were thus able to cope with the recent rapid growth of the traffic between sites, both in terms of achievable peak transfer rates and in the total amount of data transferred. For some HEP experiments this has led to designs that favour remote data access, where the network is considered an appliance with almost infinite capacity. There are reasons to believe that the network situation will change, for both technological and non-technological reasons, starting already in the next few years. Non-technological factors in play include, for example, the anticipated growth of non-HEP network usage as other large-data-volume sciences come online, and the introduction of cloud and commercial networking with their respective impact on usage policies and security. Technological factors include the limitations of optical interfaces and switching equipment.

        As the scale and complexity of the current HEP network grows rapidly, new technologies and platforms are being introduced that greatly extend the capabilities of today’s networks. With many of these technologies becoming available, it’s important to understand how we can design, test and develop systems that could enter existing production workflows while at the same time changing something as fundamental as the network that all sites and experiments rely upon. In this talk we’ll give an update on the working group's recent activities, updates from sites and R&E network providers as well as plans for the near-term future.

        Speaker: Marian Babik (CERN)
      • 11:35
        Upgrade of KEK Campus network 25m

        In August 2018, we upgraded our campus network. We replaced core switches, border routers and distribution switches to provide 1G/10G connectivity with authentication to end nodes. We have introduced new firewall sets to segment inner subnets into several groups and moved all access control lists from the core switches to the inner firewall.
        We report on our migration and our operational experience over the past year.

        Speaker: Soh Suzuki
      • 12:00
        IPv6-only networking – update from the HEPiX IPv6 Working Group 25m

        The transition of WLCG central and storage services to dual-stack IPv4/IPv6 is progressing well, thus enabling the use of IPv6-only CPU resources as agreed by the WLCG Management Board. More and more WLCG data transfers now take place over IPv6. During this year, the HEPiX IPv6 working group has not only been chasing and supporting the transition to dual-stack services, but has also been encouraging network monitoring providers to allow for filtering of plots by the IP protocol used. The dual-stack deployment does however result in a networking environment which is much more complex than when using just IPv6. Some services, e.g. the EOS storage system at CERN, are using IPv6-only for internal communication, where possible. The group is investigating the removal of the IPv4 protocol in more places. We will present our recent work and future plans.

        Speaker: Martin Bly (STFC-RAL)
    • 12:25 14:00
      Lunch break 1h 35m Turingzaal

    • 14:00 15:40
      Networking and Security Turingzaal

      Conveners: Shawn Mc Kee (University of Michigan (US)), David Kelsey (Science and Technology Facilities Council STFC (GB))
      • 14:00
        NOTED: Network-Optimized Transfer of Experimental Data 25m

        We describe the software tool-set being implemented in the context of the NOTED [1] project to better exploit WAN bandwidth for Rucio and FTS data transfers, how it has been developed and the results obtained.
        The first component is a generic data-transfer broker that interfaces with Rucio and FTS. It identifies data transfers for which network reconfiguration is both possible and beneficial, translates the Rucio and FTS information into parameters that can be used by network controllers and makes these available via a public interface.
        The second component is a network controller that, based on the parameters provided by the transfer broker, decides which actions to apply to improve the path for a given transfer.
        Unlike the transfer broker, the network controller described here is tailored to the CERN network as it has to choose the appropriate action given the network configuration and protocols used at CERN. However, this network controller can easily be used as a model for site-specific implementations elsewhere.
        The paper describes the design and the implementation of the two tools, the tests performed and the results obtained. It also analyses how the tool-set could be used for WLCG in the context of the DOMA [2] activity.
        [1] Network Optimisation for Transport of Experimental Data - CERN project
        [2] Data Organisation, Management and Access - WLCG activity

        Speaker: Coralie Busse-Grawitz (ETH Zurich (CH))
      • 14:25
        The SAND Project at the Halfway Point 25m

        The NSF funded SAND project was created to leverage the rich network-related dataset being collected by OSG and WLCG, including perfSONAR metrics, LHCONE statistics, HTCondor and FTS transfer metrics and additional SNMP data from some ESnet equipment. The goal is to create visualizations, analytics and user-facing alerting and alarming related to the research and education networks used by HEP, WLCG and OSG communities.

        We will report on the project status half-way through its initial 2-year funding period and cover what has been achieved as well as highlighting some new collaborations, tools and visualizations.

        Speaker: Shawn Mc Kee (University of Michigan (US))
      • 14:50
        CERN Computer Center Network evolution 25m

        Due to the amount of data expected from the experiments during RUN3, the CERN Computer Center network has to be upgraded. This presentation will explain all the ongoing work around the Computer Center network: a change of router models to provide higher 100G port density, upgrades of the links between the experiments and the Computer Center (CDR links), the expected closure of the Wigner Computer Center and the move of the CPU servers to dedicated containers, the creation of a WDM (up to 1 Tbps) connection between the ALICE containers and the main Computer Center, the introduction of router redundancy and Layer 2 flexibility with VXLAN, etc.

        Speaker: Vincent Ducret (CERN)
      • 15:15
        LHCb containers - Network overview 25m

        A network overview of the new LHCb containers located at LHC Point 8.
        A total of 184 switches have been installed, connected to 4 different routers.
        A new DWDM line system will be used to connect the IT datacentre extension in the LHCb containers.

        Speaker: Daniele Pomponi (CERN)
    • 15:40 16:10
      Coffee break 30m Turingzaal

    • 16:10 17:00
      IT Facilities and Business Continuity Turingzaal

      Conveners: Wayne Salter (CERN), Peter Gronbech (University of Oxford (GB))
      • 16:10
        How to provide required resources for Run3 and Run4 25m

        This presentation will cover how CERN is proposing to provide the computing capacity needed for the LHC experiments for RUN3 and for RUN4. It will start with some history on the failed attempt to have a second Data Centre ready for RUN3, then describe the solution adopted for RUN3 instead and finally the current plans for RUN4.

        Speaker: Wayne Salter (CERN)
      • 16:35
        Open Compute Project 2019 Global Summit Report 25m

        The Open Compute Project (OCP) is an organization that shares designs for data centre products among companies.
        Its mission is to design and enable the delivery of the most efficient server, storage and data centre hardware designs for scalable computing.
        The project was started in 2011 and today includes about 200 members.
        This talk will give a report from the 2019 OCP Global Summit, highlighting the most interesting talks and keynotes.
        Part of the presentation will also focus on Open19, a specification that defines a cross-industry common server form factor whose goal is to create flexible and economic data centres for operators of all sizes.
        Finally, we will discuss how OCP and Open19 could be relevant for CERN's computing infrastructure and for the HEPiX community.

        Speaker: Luca Atzori (CERN)
    • 18:30 21:30
      Workshop dinner 3h
    • 08:30 09:00
      Registration 30m Turingzaal

    • 09:00 10:40
      Basic IT Services Turingzaal

      Conveners: Erik Mattias Wadenstein (University of Umeå (SE)), Jingyan Shi (IHEP)
      • 09:00
        Building a 21st century monitoring infrastructure 25m

        The monitoring infrastructure used at the computing centre at DESY, Zeuthen
        aged over the years and showed more and more deficits in many areas.

        In order to cope with current challenges, we decided to build up a new
        monitoring infrastructure designed from scratch using different open source
        products like Prometheus, ElasticSearch, Grafana, etc.

        The talk will give an overview of our future monitoring landscape
        as well as a status report on where we are now, which challenges we hit and an
        outlook to further developments.

        Speaker: Andreas Haupt (Deutsches Elektronen-Synchrotron (DESY))
      • 09:25
        Update on Federated Identity Management and AAI 25m

        A number of co-located meetings were held at Fermilab in early September in the area of Federated Identities and AAI (Authentication and Authorisation Infrastructures) for Physics, including a F2F meeting of the WLCG Authorization Working Group and a mini-FIM4R meeting. This talk gives a high-level overview of these meetings and related recent progress in this area.

        Speaker: David Crooks (Science and Technology Facilities Council STFC (GB))
      • 09:50
        SciTokens and Credential Management 25m

        Presentation on SciTokens, a distributed authorization framework, and work to integrate distributed authorization technologies such as SciTokens and OAuth 2.0 into HTCondor.

        Speaker: Todd Tannenbaum (University of Wisconsin Madison (US))
      • 10:15
        Federated ID/SSO at BNL's SDCC 25m

        The BNL SDCC (Scientific Data and Computing Center) recently enabled an SSO authentication strategy using Keycloak, which supports several SSO protocols (SAML, OIDC, OAuth) and offers multiple authentication options under one umbrella, including Kerberos, AD (Active Directory) and federated identity authentication via CILogon with InCommon and social provider logins. This solution has been integrated into recent tool and service deployments in the facility for access to protected resources, and has improved efficiency in the areas of AuthN/AuthZ. This talk will focus on technical overviews and on strategies to tackle the challenges and obstacles for this solution.

        Speaker: Mizuki Karasawa (BNL)
    • 10:40 11:10
      Coffee break 30m Turingzaal
    • 11:10 12:00
      Basic IT Services Turingzaal
      Conveners: Jingyan Shi (IHEP), Erik Mattias Wadenstein (University of Umeå (SE))
      • 11:10
        Monitoring of an IB network 25m

        I will show collection and presentation tools for monitoring an IB network, and discuss the ideas behind some of the collection decisions.

        Speaker: Cary Whitney (LBNL)
      • 11:35
        Cost and system performance modelling in WLCG and HSF: an update 25m

        The increase in the scale of LHC computing during Run 3 and Run 4 (HL-LHC) will certainly require radical changes to the computing models and the data processing of the LHC experiments. The working group established by WLCG and the HEP Software Foundation to investigate all aspects of the cost of computing and how to optimise them has continued producing results and improving our understanding of this process. In this contribution we expose our recent developments and results and outline the directions of future work.

        Speaker: Jose Flix Molina (Centro de Investigaciones Energéticas Medioambientales y Tecno)
    • 12:00 12:25
      Storage and Filesystems Turingzaal

      Please note that the Platinum sponsor of the workshop, Fujifilm Recording Media, has contributed material that will not be presented, but is available for consultation. See this contribution: https://indico.cern.ch/event/810635/contributions/3596108/

      Conveners: Ofer Rind, Peter van der Reest (DESY)
      • 12:00
        CERN Database on Demand Update 25m

        An update on the CERN Database on Demand service, which hosts more than 800 databases for the CERN user community, supporting different open source systems such as MySQL, PostgreSQL and InfluxDB.

        We will present the current status of the platform and the future plans for the service.

        Speaker: Ignacio Coterillo Coz (CERN)
    • 12:25 14:00
      Lunch break 1h 35m Turingzaal
    • 14:00 15:40
      Storage and Filesystems Turingzaal

      Conveners: Peter van der Reest (DESY), Ofer Rind
      • 14:25
        Evolution of the STFC’s Tape archival service 25m

        The STFC CASTOR tape service is responsible for the management of over 80PB of data including 45PB generated by the LHC experiments for the RAL Tier-1. In the last few years there have been several disruptive changes that have necessitated, or are necessitating, significant changes to the service. At the end of 2016, Oracle, which provided the tape libraries, drives and media, announced that it was leaving the tape market. In 2017, the Echo (Tier-1 disk) storage service entered production and disk-only storage migrated away from CASTOR. Also in 2017, CERN, which provides support for CASTOR, started to test its replacement for CASTOR, called CTA.
        Since October 2018, a new shared CASTOR instance has been in production. This instance is a major simplification from the previous four. In this presentation I describe the setup and performance of this instance which includes two sets of failure-tolerant management nodes that ensure improved reliability and a single unified tape cache that has displayed increased access rates to tape data compared to previous separate tape cache pools.
        In March 2019, a new Spectra Logic Tape robot was delivered to RAL. This uses both LTO and IBM media. I will present the tests that were carried out on this system, which includes multiple sets of dense and sparse tape reads to assess the throughput performance of the library for various use cases.
        Finally, I will describe the ongoing work exploring possible new, non-SRM tape management systems that will eventually replace CASTOR.

        Speaker: George Patargias (STFC)
      • 14:50
        Current status of tape storage at CERN 25m

        The IT storage group at CERN provides tape storage to its users in the form of three services, namely TSM, CASTOR and CTA. Both TSM and CASTOR have been running for several decades whereas CTA is currently being deployed for the very first time. This deployment is for the LHC experiments starting with ATLAS this year. This contribution describes the current status of tape storage at CERN and expands on the strategy and architecture of the current deployment of CTA.

        Speaker: Dr Steven Murray (CERN)
      • 15:15
        CERN Storage Evolution 25m

        In this contribution the evolution of the CERN storage services and their applications will be presented.

        The CERN IT Storage group's main mandate is to provide storage for Physics data: to this end an update will be given about CASTOR and EOS, with a particular focus on the ongoing migration from CASTOR to CTA, its successor.

        More recently, the Storage group has focused on providing higher-level tools to access, share and interact with the data. CERNBox is at the center of this strategy, as it has evolved to become the CERN apps hub. We will show how the recent (well known) changes in software licensing have affected the CERN apps portfolio offered to users.

        Finally, a new EU-funded project will be briefly presented, which perfectly integrates with the above strategy to expand the CERNBox collaboration to other institutions and enterprises.

        Speaker: Giuseppe Lo Presti (CERN)
    • 15:40 16:10
      Coffee break 30m Turingzaal
    • 16:10 17:25
      Storage and Filesystems Turingzaal

      Conveners: Ofer Rind, Peter van der Reest (DESY)
      • 16:10
        CephFS in an HTC cluster and VMs on Ceph RBD with TRIM and differential backups 25m

        CephFS has been used as the shared file system of the HTC cluster
        for physicists of various fields at Bonn University since the beginning
        of 2018. The cluster uses IP over InfiniBand. High performance
        for sequential reads is achieved even though erasure coding and
        on-the-fly compression are employed.

        CephFS is complemented by a CernVM-FS for software packages and
        containers which come with many small files.

        Operational experience with CephFS and exporting it via NFS
        Ganesha to users’ desktop machines, upgrade experiences, and
        design decisions, e.g. concerning the quota setup, will be
        presented.

        Additionally, Ceph RBD is used as backend for a libvirt/KVM based
        virtualisation infrastructure operated by two institutes
        replicated across multiple buildings.

        Backups are performed via regular snapshots, which allow for
        differential backups using open-source tools to an external
        backup storage. Via file system trimming through VirtIO-SCSI and
        compression of the backups, significant storage is saved.
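
        As an illustration of the differential-backup idea described above, the following is a minimal conceptual sketch in Python. It is not the tooling used at Bonn and does not call any Ceph API: the two snapshots are plain byte strings, and the names `diff_snapshots`, `apply_diff` and `BLOCK_SIZE` are hypothetical. Only the blocks that changed between two snapshots are kept, which is what makes the backup differential.

```python
from __future__ import annotations

BLOCK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def split_blocks(image: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut an image into fixed-size blocks (the last one may be shorter)."""
    return [image[i:i + block_size] for i in range(0, len(image), block_size)]


def diff_snapshots(old: bytes, new: bytes,
                   block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Return {block_index: new_block} for every block that differs."""
    old_blocks = split_blocks(old, block_size)
    new_blocks = split_blocks(new, block_size)
    changed = {}
    for index, block in enumerate(new_blocks):
        if index >= len(old_blocks) or old_blocks[index] != block:
            changed[index] = block
    return changed


def apply_diff(base: bytes, diff: dict[int, bytes], new_length: int,
               block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the newer snapshot from the older one plus the diff."""
    blocks = split_blocks(base, block_size)
    for index in sorted(diff):
        while len(blocks) <= index:  # the image grew since the old snapshot
            blocks.append(b"")
        blocks[index] = diff[index]
    return b"".join(blocks)[:new_length]


snapshot_1 = b"AAAABBBBCCCCDDDD"
snapshot_2 = b"AAAAXXXXCCCCDDDDEE"
delta = diff_snapshots(snapshot_1, snapshot_2)
restored = apply_diff(snapshot_1, delta, len(snapshot_2))
```

        In this toy case only two of the five blocks need to be stored for the second backup; compressing the stored blocks would reduce the size further.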

        Writeback caching provides sufficient performance. The
        system has been tested for resilience in various possible failure
        scenarios.

        Speaker: Oliver Freyermuth (University of Bonn (DE))
      • 16:35
        Storage usage statistics through Hadoop big data technologies 25m

        As one of the main data centres in France, the IN2P3 Computing Center (CC-IN2P3, https://cc.in2p3.fr) provides several High Energy Physics and Astroparticle Physics experiments with different storage systems that cover the different needs expressed by these experiments.

        The quantity of data stored at CC-IN2P3 is growing exponentially. In 2019, about two billion files are stored. By 2030, this number of files is expected to increase by a factor of eight.

        To monitor and supervise these storage systems, several applications leverage file metadata. Information such as size, number of blocks, last access time, or last update time for each storage system is used to export customized views to users, experiments, local experts, and support and management teams. However, these applications are usually monolithic and thus do not scale well. With the load expected by 2030, with a total amount of 4 TB of metadata, this could become problematic.

        To improve the scalability of these applications, CC-IN2P3 initiated a research and development project a couple of months ago. The idea is to build on data analytics frameworks such as Hadoop/Spark/... to process the expected massive amount of storage metadata in a scalable way.
        Our objective is to set up a software architecture that will act as scalable back end for all the existing and future monitoring and supervision applications. This should improve the production time of day-to-day statistics across all the storage services that are made available to the users, experiments, the support team and management of the CC-IN2P3.
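
        The scalable processing described above can be pictured as a map/reduce aggregation: each partition of metadata records is summarised independently, and the partial summaries are then merged. The sketch below uses only the Python standard library rather than Hadoop or Spark, and the record fields, storage names and function names are hypothetical, not those of the CC-IN2P3 project.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """One file's metadata (hypothetical fields for illustration)."""
    storage: str
    experiment: str
    size_bytes: int


def map_partition(records):
    """Summarise one partition: (storage, experiment) -> [file count, bytes]."""
    partial = defaultdict(lambda: [0, 0])
    for record in records:
        entry = partial[(record.storage, record.experiment)]
        entry[0] += 1
        entry[1] += record.size_bytes
    return dict(partial)


def reduce_partials(partials):
    """Merge per-partition summaries into one global summary."""
    total = defaultdict(lambda: [0, 0])
    for partial in partials:
        for key, (count, size) in partial.items():
            total[key][0] += count
            total[key][1] += size
    return {key: tuple(value) for key, value in total.items()}


# Two partitions that could be processed on different workers.
partition_a = [
    FileRecord("disk", "exp_a", 100),
    FileRecord("tape", "exp_b", 50),
]
partition_b = [
    FileRecord("disk", "exp_a", 200),
]
summary = reduce_partials([map_partition(partition_a),
                           map_partition(partition_b)])
```

        Because each partition is summarised without reference to the others, the map step can run in parallel on as many workers as there are partitions, which is the property a monolithic application lacks.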

        A long-term objective of this project is to be able to supervise the whole life cycle of data stored on the resources of the CC-IN2P3 and thus to ensure that the Data Management Plans provided are respected by the experiments.

        In this talk we will present the current status of this ongoing project, discuss the technical choices we made, present some preliminary results, and also expose the different issues we encountered along the road to success.

        Speaker: Antoine Dubois (CNRS)
      • 17:00
        Integration & Optimization of BNL Storage management 25m

        The Scientific Data & Computing Center (SDCC) at BNL is responsible for accommodating the diverse requirements for storing and processing petabyte-scale data generated by ATLAS, Belle II, PHENIX, STAR, Simons etc. This talk presents the current operational status of the main storage services supported at the SDCC and summarizes our experience in operating largely distributed systems, optimizing for the ATLAS Data Carousel, participating in Third Party Copy smoke testing of the DOMA working group (DOMA-TPC) and moving toward a central storage infrastructure for STAR/PHENIX. The presentation will also highlight our work on Ceph pools, XROOTD cache and BNL Box.

        Speaker: Yingzi Wu (Brookhaven National Laboratory (US))
    • 08:30 09:00
      Registration 30m Turingzaal
    • 09:00 10:40
      Grids, Clouds and Virtualisation Turingzaal
      Conveners: Ian Collier (Science and Technology Facilities Council STFC (GB)), Tomoaki Nakamura (High Energy Accelerator Research Organization (JP))
      • 09:00
        ComputeOps: container for High Performance Computing 25m

        The High Performance Computing (HPC) domain aims to optimize code in order to use the latest multicore and parallel technologies, including specific processor instructions. In this computing framework, portability and reproducibility are key concepts. One way to meet these requirements is to use Linux containers. These "light virtual machines" encapsulate an application together with its environment in Linux processes. Containers have recently been rediscovered because they provide both a multi-infrastructure environment for developers and system administrators and reproducibility through image build files. Two container solutions are emerging: Docker for micro-services and Singularity for computing applications. We present here the status of the ComputeOps project, whose goal is to study the benefits of containers for HPC applications.

        Speakers: Cecile Cavet (APC), Dr Aurélien Bailly-Reyre
      • 09:25
        CERN Cloud Infrastructure update 25m

        CERN runs a private OpenStack Cloud with ~300K cores, ~3K users and several OpenStack services.
        CERN users can build services from a pool of compute and storage resources using OpenStack APIs such as Ironic, Nova, Magnum, Cinder and Manila.
        For that reason, CERN cloud operators face some operational challenges at scale in order to offer these services in a stable manner.
        In this talk, you will learn about the status of the CERN cloud, new services and plans for expansion.

        Speaker: Daniel Abad (CERN)
      • 09:50
        Status and Operation of LHAASO Computing Platform 25m

        The Large High Altitude Air Shower Observatory (LHAASO) experiment of IHEP is located in Daocheng, Sichuan province, at an altitude of 4410 m. It generates a huge amount of data and requires massive storage and large computing power.
        This talk will introduce the current status of the LHAASO computing platform at Daocheng, with a focus on virtualization technologies such as Docker and Kubernetes (k8s) and on distributed monitoring technologies, which reduce operation and maintenance costs and help ensure system availability and stability.

        Speaker: Wei Zheng (IHEP)
      • 10:15
        Data Lake. Configuration and testing of distributed data storage systems. 25m

        The need for effective distributed data storage has been important since the start of the LHC, and the topic has become particularly vital in light of the preparation for the HL-LHC run and the emergence of data-intensive projects in other domains such as nuclear and astroparticle physics.
        The LHC experiments have started R&D within the DOMA project, and we report recent results related to the configuration and testing of federated data storage systems. We will emphasize different system configurations and various approaches to testing storage federations. We are considering the EOS and dCache storage systems as backbone software for data federation and xCache for data caching. We will also report on synthetic tests and experiment-specific tests developed by ATLAS and ALICE for a federated storage prototype in Russia. Recently, execution of the tests has been automated and is now conducted using the HammerCloud toolkit. The Data Lake project launched in the Russian Federation in 2019 and its prospects will be covered separately.

        Speaker: Andrey Zarochentsev (St Petersburg State University (RU))
    • 10:40 11:10
      Coffee break 30m Turingzaal
    • 11:10 12:00
      Grids, Clouds and Virtualisation Turingzaal
      Conveners: Tomoaki Nakamura (High Energy Accelerator Research Organization (JP)), Ian Collier (Science and Technology Facilities Council STFC (GB))
      • 11:10
        The SLATE Project Update 25m

        We will provide an update on the SLATE project (https://slateci.io), an NSF funded effort to securely enable service orchestration in Science DMZ (edge) networks across institutions. The Kubernetes-based SLATE service provides a step towards a federated operations model, allowing innovation of distributed platforms, while reducing operational effort at resource providing sites.

        The presentation will focus on updates since the spring HEPiX meeting and cover our expanding collaboration, containerized service application catalog, and updates from an engagement with TrustedCI.org and the WLCG Security teams to collect issues of concern for a new trust model.

        Speaker: Shawn Mc Kee (University of Michigan (US))
      • 11:35
        Distributed Computing at the JGI: A Grid-like approach for the life sciences 25m

        The Joint Genome Institute (JGI) is part of the US Department of Energy and serves the scientific community with access to high-throughput, high-quality sequencing, DNA synthesis, metabolomics and analysis capabilities. With the ever-increasing complexity of analysis workflows and the demand for burstable compute, it became necessary to be able to shift those workloads between sites. In this talk we will present JAWS, the JGI Analysis and Workflow system, which enables users to model their workflows using the Workflow Definition Language (WDL) and execute them across a number of geographically distributed sites. We will discuss the architecture of JAWS, from underlying technologies to data transfer and integration with HPC schedulers (e.g. Slurm). We will go into challenges encountered when running at multiple sites, among them integration and identity management, and will present the current state of our efforts.

        Speaker: Georg Rath (Lawrence Berkeley National Laboratory)
    • 12:00 12:30
      Miscellaneous Turingzaal
      Conveners: Helge Meinhard (CERN), Tony Wong (Brookhaven National Laboratory)
      • 12:00
        Workshop wrap-up 30m
        Speaker: Helge Meinhard (CERN)