HEPiX Spring 2023 Workshop

Name: HEPiX Spring 2023 Workshop
Start: 2023-03-27T09:00:00+08:00
End: 2023-03-31T18:00:00+08:00
Location: Research Center for Environmental Changes (RCEC), Academia Sinica

27 Mar 2023, 09:00 → 31 Mar 2023, 18:00 Asia/Taipei

1F Conference Room (Research Center for Environmental Changes (RCEC), Academia Sinica )

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E

Peter van der Reest, Tomoaki Nakamura

Description

HEPiX Spring 2023 at ASGC, Taipei, Taiwan

The HEPiX forum brings together worldwide Information Technology staff, including system administrators, system engineers, and managers from High Energy Physics and Nuclear Physics laboratories and institutes, to foster a learning and sharing experience between sites facing scientific computing and data challenges.

Participating sites include BNL, CERN, DESY, FNAL, IHEP, IN2P3, INFN, IRFU, JLAB, KEK, LBNL, NDGF, NIKHEF, PIC, RAL, SLAC, TRIUMF, many other research labs and numerous universities from all over the world.

The workshop is hosted by Academia Sinica Grid Computing (ASGC) Center, Taipei Taiwan.

Co-located events

The International Symposium on Grids and Clouds 2023, beginning on Sunday, March 19th, is also organized by ASGC.

Organisers

hepix-conference-support@hepix.org

+886-2-27822120

Monday 27 March
- 09:00
  
  Registration 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
- Miscellaneous 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: Peter van der Reest, Tomoaki Nakamura
  - 1
    
    Welcome
    
    Speakers: Peter van der Reest, Tomoaki Nakamura
    
    2023-03-27_HEPiX_Opening.pdf
  - 2
    
    Logistics
    
    Speakers: Eric Yen (ASGC), Han-Wei Yen
    
    HEPiX-Logistics.pdf
- Site Reports 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: Andreas Petzold (KIT - Karlsruhe Institute of Technology (DE)), Dr Sebastien Gadrat (CCIN2P3 - Centre de Calcul (FR))
  - 3
    
    ASGC site report
    
    ASGC site report
    
    Speakers: Felix.hung-te Lee (Academia Sinica (TW)), Ms Jingya You (ASGC)
    
    ASGCRpt-HEPiX2023.pdf
  - 4
    
    CERN site report
    
    News from CERN since the last HEPiX workshop. This talk gives a general update from services in the CERN IT department.
    
    Speaker: Jarek Polok (CERN)
    
    CERN site report - HEPiX 2023 Spring.pdf
  - 5
    
    PIC report
    
    This is the PIC report for HEPiX Spring 2023 Workshop
    
    Speaker: Jose Flix Molina (CIEMAT - Centro de Investigaciones Energéticas Medioambientales y Tec. (ES))
    
    HEPiX_Spring_2023_PIC_Report_JFlix.pdf
  - 6
    
    KEK Site Report
    
    The KEK Central Computer System (KEKCC) is a computer service and facility that provides large-scale computer resources, including Grid and Cloud computing systems and essential IT services, such as e-mail and web services.
    
    Following the procurement policy for the large-scale computer system requested by the Japanese government, we replace the entire system once every four or sometimes five years. The current system has replaced the previous system and has been in production since September 2020, and decommissioning will be planned to begin in Q3 of 2024.
    
    During about 30 months of operation in the current system, we have decommissioned some legacy Grid services, like LFC, and migrated some Grid services to the newer operating system, CentOS7. In this talk, we would like to share our experiences and challenges regarding Grid services introduced in the KEKCC. Also, we will review the ongoing activity to enable Grid services in the token-only environment.
    
    Speaker: Go Iwai (KEK)
    
    iwai.pdf
- 11:30
  
  Coffee break 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
- Site Reports 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: Andreas Petzold (KIT - Karlsruhe Institute of Technology (DE)), Dr Sebastien Gadrat (CCIN2P3 - Centre de Calcul (FR))
  - 7
    
    IHEP Site Report
    
    Site news report from IHEP, CAS, including status of computing platform construction, grid, network, storage and so on, since last workshop report.
    
    Speaker: Xuantong Zhang (Chinese Academy of Sciences (CN))
    
    HEPiX2023Spring_report.pdf
  - 8
    
    DESY site report
    
    an overview of developments at DESY
    
    Speaker: Peter van der Reest
    
    HEPiX_Spring2023_DESY_site_report.pdf
  - 9
    
    FZU (Prague) Site Report
    
    The usual site report.
    
    Speaker: Jiri Chudoba (Czech Academy of Sciences (CZ))
    
    FZU site report
  - 10
    
    AGLT2 Site Report Spring 2023
    
    We will present an update on our site since the Fall 2021 report, covering our changes in software, tools and operations.
    
    Some of the details to cover include the details of our recent hardware purchases, our network upgrades and our preparations to select and implement our next operating system and associated provisioning systems. We will also discuss our work with Elasticsearch and our efforts to implement the WLCG Security Operations Center components. We conclude with a summary of what has worked and what problems we encountered and indicate directions for future work.
    
    Speaker: Shawn Mc Kee (University of Michigan (US))
    
    AGLT2SiteReport-HEPiXSpring2023.pdf
    
    AGLT2 Site Report Spring HEPiX 2023
- 13:00
  
  Lunch break 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
- Basic IT Services & End User Services 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: Dennis van Dok (Nikhef), Erik Mattias Wadenstein (University of Umeå (SE)), 石京燕 shijy
  - 11
    
    Dynamic Deployment of Data Collection and Analysis Stacks at CSCS
    
    Complexity and scale of systems increase rapidly and the amount of related
    monitoring and accounting data grows accordingly.
    
    Managing this vast amount of data is a challenge that CSCS solved by
    introducing a Kubernetes cluster dedicated to dynamically deploying
    data collection and analysis stacks comprising Elastic Stack, Kafka and
    Grafana both for internal usage and for external customers' use cases.
    
    This service proved to be crucial at CSCS to provide correlation of
    events and meaningful insights from event-related data: bridging the
    gap between the computation workload and resources status enables
    failure diagnosis, telemetry and effective collection of accounting
    data.
    
    Currently at CSCS the main production Elastic Stack is handling more than
    200B online documents. The integrated environment from data collection to
    visualization let internal and external users produce their own powerful
    dashboards and monitoring displays that are fundamental for their data
    analysis needs.
    
    Speaker: Mr Dino Conciatore (CSCS (Swiss National Supercomputing Centre))
    
    CSCS - Dynamic Deployment of Data Collection and Analysis Stacks - HEPiX 2023.pptx
  - 12
    
    Monitoring Windows Server Infrastructure using Open Source products
    
    More than 500 servers are actively managed by the Windows Infrastructure team on the CERN site. These servers run critical services for the laboratory such as controlling some of the accelerator most critical systems through Terminal Servers, managing all CERN users and computers registered in Active Directory, hosting accelerator designs in DFS storage or enabling Engineering software licensing. Having full time visibility on their state is critical for a smooth operation of the laboratory. In 2021, in the context of replacing Microsoft System Center Configuration manager as an in-depth Windows host monitoring system, a project was launched to implement an open source lightweight icinga2 ecosystem. This presentation will describe the implementation of such system and the technical choices and configurations made to transparently deploy and manage the icinga2 infrastructure across the Windows Infrastructure at CERN.
    
    Speaker: Mr Pablo Martin Zamora (CERN)
    
    Monitoring Windows Server Infrastructure using Open-Source products.pdf
  - 13
    
    The unified identity authentication platform for IHEP
    
    The Institute of High Energy Physics of the Chinese Academy of Sciences is a comprehensive research base in China engaged in high -energy physical research, advanced accelerator physics and technology research and development and utilization, and advanced ray technology and application.
    The Sing sign on(SSO) system of the High Energy Institute has more than 22,000 users, the calculation cluster (AFS) users 3.2K, Web applications of more than 150, and more than 10 client applications. With the development of the high energy institute, international cooperation has become more and more frequent, so the SSO system of the high energy institute has been generated.
    The SSO system of the High Energy Institute integrates all personnel systems and AFS user accounts. And realize the Chinese certification federation CARSI and international federation EduGain access, not only realize the unified account management of the within the place, but also gradually realize domestic universities and international organization certification.
    
    Speaker: qi luo (中科院高能物理所计算中心)
    
    The design of the unified identity authentication system for IHEP.pdf
    
    The design of the unified identity authentication system for IHEP.pptx
- 15:45
  
  Coffee break 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
- Basic IT Services & End User Services 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: Dennis van Dok (Nikhef), Erik Mattias Wadenstein (University of Umeå (SE)), 石京燕 shijy
  - 14
    
    Status of CERN Authentication and Authorisation
    
    Authentication and Authorisation is the core service to secure access for computing resources at any large-scale organisation. At CERN we handle around 25,000 logins per day of 35,000 individual users, granting them access to more than 9,000 applications and websites that use the organisation's Single Sign-On (SSO). To achieve this, we have built an Identity and Access Management platform based on open source and commercial software. CERN has also many different needs and use cases, which needed to be adapted or implemented by leveraging existing solutions and protocols. These needs included a general need for machine-to-machine automated authentication, CLI access and two-factor authentication (2FA). We will describe our authentication landscape and focus on key challenges that we hope will be relevant for other communities.
    
    Speaker: Asier Aguado Corman (CERN)
    
    Status of CERN Authentication and Authorisation - CodiMD
    
    Status of CERN Authentication and Authorisation.pdf
  - 15
    
    Federated ID and Token Transition Status
    
    Present the current status and roadmap for federated ID via Comanage Registry at CILOGON as well as progress on implementing token based services at Brookhaven National Laboratory Scientific Data and Computing Center.
    
    Speaker: Robert Hancock
    
    HepixSpring2023FederatedAccess.pdf
    
    HepixSpring2023FederatedAccess.pptx
- 18:30
  
  Welcome Reception
Tuesday 28 March
- 09:00
  
  Registration 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
- Grid, Cloud & Virtualisation and Operating Systems 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: Andreas Haupt (DESY), Ian Collier (Science and Technology Facilities Council STFC (GB)), Tomoaki Nakamura
  - 16
    
    Cloud Infrastructure Update: Operations, Campaigns and Evolution
    
    CERN has been running an OpenStack based private cloud infrastructure in production since 2013. This presentation will give a status update of the deployment, and then dive into specific topics, such as the 12-month work to replace the network control plane for 4400 virtual machines or the live-migration machinery used for interventions or reboot campaigns.
    
    Speaker: Maryna Savchenko (CERN)
    
    Cloud Infrastructure update from CERN.pdf
  - 17
    
    Providing ARM and GPU resources in the CERN Private Cloud Infrastructure
    
    The CERN Cloud Infrastructure service has recently commissioned a set of ARM and GPU nodes as hypervisors. This presentation will cover all the steps required to prepare the provisioning of ARM based VMs: the creation of multi-arch docker images for our GitLab pipelines, the preparation of ARM user images, or adaptions to the PXE and Ironic setup to manage this additional architecture. There would be also presented an overview of Cloud GPU resources and explained the difference between PCI passthrough, vGPU and multi-instance GPU.
    
    Speaker: Maryna Savchenko (CERN)
    
    ARM Support in the CERN Private Cloud Infrastructure.pdf
    
    Providing ARM and GPU resources in the CERN Private Cloud Infrastructure
  - 18
    
    Kubernetes cluster for Helmholtz users
    
    The Helmholtz Association's federated IT platform HIFIS enables the individual Helmholtz centres to share IT services and resources for the benefit of all users in the association. For that purpose a central service catalog - the Helmholtz cloud portal - lists all these services so that scientists, technicians and administrators can make use of them. In this context, DESY offers access to a Kubernetes platform managed by Rancher for all users with a clear purpose to test and try their application deployments on Kubernetes without having to pay for resources at a commercial provider.
    
    Using Kubernetes as a development and deployment platform for web-based applications has become a de-facto standard in industry as well as in the cloud-based open source community. We observe that many useful services and tools can easily be deployed on Kubernetes if one has a cluster at hand, which is not as commonplace at the moment as it could be. With offering Kubernetes to the Helmholtz Association's members, DESY hopes to contribute to a more widespread adoption of modern cloud-based workflows in science and its surroundings. The abstraction layer Kubernetes offers makes for more reusable software in the long run, which would be benefitial to the whole scientific community.
    
    In our presentation at the HEPiX workshop, we will show how we deploy our Kubernetes clusters using Rancher and other tools and which applications we deem necessary to achieve basic usability of the clusters. A major part of the presentation will be on the integration of the clusters with the Helmholtz AAI, resource management and integration with development workflows. Finally we will highlight use cases from different Helmholtz centres that already make use of our clusters and how they gained access and first introductions to the platform itself.
    
    Speaker: Tim Wetzel
    
    wetzel-hepix2023-spring-k8s-for-helmholtz.pdf
- 11:15
  
  Coffee break 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
- Grid, Cloud & Virtualisation and Operating Systems 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: Andreas Haupt (DESY), Ian Collier (Science and Technology Facilities Council STFC (GB)), Tomoaki Nakamura
  - 19
    
    Fully automated: Updates on the Continuous Integration for supported Linux distributions at CERN
    
    Historically the release processes for supported CERN Linux distributions involved tedious manual procedures that were often prone to human error. In addition, and as a knock-on effect from the turmoil created in 2020 with the CentOS Linux 8 end-of-life announcement; the CERN Linux team have now been required to support an increasing number of Linux distributions.
    To cope with this additional workload (currently 8 Linux distributions: CC7, CS8, CS9, RHEL7, RHEL8, RHEL9, ALMA8, ALMA9), our team have been forced to adopt full scale automation, testing and continuous integration; all whilst significantly reducing the need for human interventions.
    Automation can now be found in every part of our process: cloud and Docker image building, base-line testing, CERN specific testing and full-stack functional testing. For this we use a combination of GitLab CI capabilities, Koji, OpenStack Nova, OpenStack Ironic central services, nomad and a healthy dose of Python and Bash. Test suites now cover unmanaged, managed (puppet), virtual and physical machines; which allows us to certify that our next image release continues to meet the needs of the organization.
    
    Speaker: Ben Morrice (CERN)
    
    20230328-hepix-fully-automated.pdf
    
    20230328-hepix-fully-automated.pptx
  - 20
    
    Update on the Linux Strategy for CERN (and WLCG)
    
    This presentation will be a follow-up to the presentation and “Linux Strategy” BoF in Umea and summarise the latest evolution on the strategy for Linux at CERN (and WLCG): a recap of the situation (e.g. the issues with Stream or the changes for the RHEL license), a presentation of the agreed strategy as well as insights into the decision making process (in particular for the choice of AlmaLinux as the EL rebuild).
    
    Speaker: Ben Morrice (CERN)
    
    20230328-hepix-linux-update.pdf
    
    20230328-hepix-linux-update.pptx
- Site Reports 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: Andreas Petzold (KIT - Karlsruhe Institute of Technology (DE)), Dr Sebastien Gadrat (CCIN2P3 - Centre de Calcul (FR))
  - 21
    
    BNL Site Report
    
    News and updates from the Scientific Data & Computing Center (SDCC) at BNL
    
    Speaker: Robert Hancock
    
    HEPiX Spring 2023 BNL SDCC Site Report.pdf
    
    HEPiX Spring 2023 BNL SDCC Site Report.pptx
- Photo session 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
- 13:15
  
  Lunch break 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
- Grid, Cloud & Virtualisation and Operating Systems 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: Andreas Haupt (DESY), Ian Collier (Science and Technology Facilities Council STFC (GB)), Tomoaki Nakamura
  - 22
    
    The WLCG Journey at CSCS: from Piz Daint to Alps
    
    The Swiss National Supercomputing Centre (CSCS), in close collaboration with the Swiss Institute for Particle Physics (CHiPP), provides the Worldwide LHC Computing Grid (WLCG) project with cutting-edge HPC and HTC resources. These are reachable through a number of Computing Elements (CEs) that, along with a Storage Element (SE), characterise CSCS as a Tier-2 Grid site. The current flagship system, an HPE Cray XC named Piz Daint, has been the platform where all the computing requirements for the Tier-2 have been met for the last 6 years. With the commissioning of the future flagship infrastructure, an HPE Cray EX referred to as Alps, CSCS is gradually moving the computational resources to the new environment. The Centre has been investing heavily in the concept of Infrastructure as Code (IaC) and it is embracing the multi-tenancy paradigm for its infrastructure. As a result, the project leverages modern approaches and technologies borrowed from the cloud to perform a complete re-design of the service. During this process, Kubernetes, Harvester, Rancher, and ArgoCD technologies have been playing a leading role, providing CSCS with enhanced flexibility in terms of the orchestration of clusters and applications. This contribution aims to describe the journey, design choices, and challenges encountered along the way to implement the new WLCG platform, which is also profited from by other projects such as the Cherenkov Array Telescope (CTA) and the Square Kilometre Array (SKA).
    
    Speaker: Dr Riccardo Di Maria (CERN)
    
    DiMaria_Riccardo_The_WLCG_Journey_at_CSCS_HEPiX_Spring_2023.pdf
  - 23
    
    An Update from the SLATE Project
    
    We will provide an update on the SLATE project (https://slateci.io), an NSF funded effort to securely enable service orchestration in Science DMZ (edge) networks across institutions. The Kubernetes-based SLATE service provides a step towards a federated operations model, allowing innovation of distributed platforms, while reducing operational effort at resource providing sites. The SLATE project is in its last year and is working to wrap up while also preparing for what comes next.
    
    The presentation will cover our recent efforts including transitioning to k8s (Kubernetes) 1.24 and adding OpenTelemety, revising our build and update system and augmenting our catalog of applications in preparation for future possibilities.
    
    Speaker: Shawn Mc Kee (University of Michigan (US))
    
    An Update from the SLATE Project
    
    HEPiX-SLATE-Spring2023.pdf
- Board Meeting (closed session) 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
Wednesday 29 March
- 09:00
  
  Registration 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
- Networking & Security 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: David Kelsey (Science and Technology Facilities Council STFC (GB)), Shawn Mc Kee (University of Michigan (US))
  - 24
    
    CERN Computer Centre(s) Network evolution: achievements during the last 3 years and expected next steps
    
    During HEPIX Fall 2019, a CERN presentation explained our plans to prepare the Computer Centre Network for LHC Run3. This year’s presentation will explain what has been achieved and the forthcoming steps. Points to be covered include:
    - the current CERN Datacentre Network architecture,
    - how we handled a full datacentre network migration during COVID19 lockdown period,
    - how the connections between the main datacentre and other CERN sites
    (including a 2.4Tbps link for ALICE O2 setup, and a total of 2.1Tbps
    connection to containers located on LHCb site) evolved,
    - the new tools and features we introduced (Zero Touch Provisioning for Juniper switches, Vlan support up to the ToR),
    - the issues we faced and how we handled them (bug affecting DHCPv6, hardware delivery delays),
    - datacentre network plans for 2023, and
    - the network setup and features to be deployed in the new Prévessin Computer Centre.
    
    Speaker: Vincent Ducret (CERN)
    
    Hepix Spring 2023 - CERN Computer Center Network evolution - Part I.pdf
    
    Hepix Spring 2023 - CERN Computer Center Network evolution - Part I.pptx
  - 25
    
    CERN Computer Centre(s) Network evolution: achievements during the last 3 years and expected next steps
    
    Speaker: Vincent Ducret (CERN)
    
    Hepix Spring 2023 - CERN Computer Center Network evolution - Part II.pdf
    
    Hepix Spring 2023 - CERN Computer Center Network evolution - Part II.pptx
- Basic IT Services & End User Services 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: Dennis van Dok (Nikhef), Erik Mattias Wadenstein (University of Umeå (SE)), 石京燕 shijy
  - 26
    
    Token based solutions for SSH with OIDC
    
    OIDC (OpenID Connect) is widely used for transforming our digital
    infrastructures (e-Infrastructures, HPC, Storage, Cloud, ...) into the token
    based world.
    
    OIDC is an authentication protocol that allows users to be authenticated
    with an external, trusted identity provider. Although typically meant for
    web- based applications, there is an increasing need for integrating
    shell- based services.
    
    This contribution delivers an overview of several tools, each of which
    provides a solution to a specific aspect of using tokens on the
    commandline in production services:
    
    oidc-agent is the tool for obtaining oidc-access tokens on the
    commandline. It focuses on security and manages to provide ease of use
    at the same time. The agent operates on a users workstation or laptop
    and is well integrated with graphical user interfaces of several
    operating systems, such as Linux, MacOS, and Windows. Advanced features
    include agent-forwarding which allows users to securely obtain access
    tokens from remote machines to which they are logged in.
    
    mytoken is both, a server software and a new token type. Mytokens allow
    obtaining access tokens for long time spans, of up to multiple years. It
    introduces the concept of "capabilities" and "restrictions" to limit the
    power of long living tokens. It is designed to solve difficult use-cases
    such as computing jobs that are queued for hours before they run for
    days. Running (and storing the output of) such a job is straightforward,
    reasonably secure, and fully automisable using mytoken.
    
    pam-ssh-oidc is a pam module that allows accepting access tokens in the
    Unix pluggable authentication system. This allows using access tokens
    for example in ssh sessions or other unix applications such as su. Our
    pam module allows verification of the access token via OIDC or via 3rd
    party REST interfaces.
    
    motley-cue is a REST based service that works together with pam-ssh-oidc
    to validate access tokens. Along the validation of access tokens,
    motley-cue may - depending on the enabled features - perform additional
    useful steps in the "SSH via OIDC" use-case. These include
    Authorisation (based on VO membership)
    Authorisation (based on identity assurance)
    Dynamic user creation
    One-time-password generation (in case the access token is too long for
    the SSH-client used)
    Account provisioning via plugin based system (interfaces with local
    Unix accounts, LDAP accounts, and external REST interfaces)
    
    Account blocking (by authorised administrators in case of a security
    incident)
    
    mccli is a client side tool that enables clients to use OIDC
    access-tokens that normally do not support them. Currently, ssh, sftp
    and scp are supported protocols.
    
    oidc-plugin for putty makes use of the new putty plugin interface to use
    access tokens for authentication, whenever an ssh-server supports it.
    The plugin interfaces with oidc-agent for windows to obtain tokens.
    
    The combination of the tools presented allows creative new ways of using
    the new token-based AAIs with old and new tools. Given enough time, this
    contribution will include live-demos for all of the presented tools.
    
    Speaker: Marcus Hardt (Kalrsruhe Institute of Technology)
    
    2303-hepix-token-tools.pdf
    
    Token based solutions for SSH with OIDC
- 11:15
  
  Coffee break 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
- Networking & Security 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: David Kelsey (Science and Technology Facilities Council STFC (GB)), Shawn Mc Kee (University of Michigan (US))
  - 27
    
    HEPiX IPv6 Working Group - Update
    
    The transition of WLCG storage services to dual-stack IPv6/IPv4 is nearing completion. Monitoring of data transfers shows that many are happening today over IPv6 but it is still true that many are not! The agreed endpoint of the WLCG transition to IPv6 remains the deployment of IPv6-only services, thereby removing the complexity and security concerns of operating dual stacks. The HEPiX IPv6 working group is investigating the obstacles to the use of IPv6 in WLCG. This talk will present our recent activities including investigations for the reasons behind the ongoing use of IPv4.
    
    Speaker: Bruno Heinrich Hoeft (KIT - Karlsruhe Institute of Technology (DE))
    
    HEPiX-IPv6-2023-29mar23.pdf
    
    HEPiX-IPv6-2023-29mar23.pptx
  - 28
    
    Status and Plans for the Research Networking Technical WG
    
    The high-energy physics community, along with the WLCG sites and Research and Education (R&E) networks have been collaborating on network technology development, prototyping and implementation via the Research Networking Technical working group (RNTWG) since early 2020.
    
    As the scale and complexity of the current HEP network grows rapidly, new technologies and platforms are being introduced that greatly extend the capabilities of today’s networks. With many of these technologies becoming available, it’s important to understand how we can design, test and develop systems that could enter existing production workflows while at the same time changing something as fundamental as the network that all sites and experiments rely upon.
    
    In this talk we’ll give an update on the Research Networking Technical working group activities, challenges and recent updates. In particular we’ll focus on the flow labeling and packet marking technologies (scitags), the new effort on packet pacing and related tools and approaches that have been identified as important first steps for the work of the group.
    
    Speaker: Shawn Mc Kee (University of Michigan (US))
    
    HEPiX Spring 2023 - RNTWG Update.pdf
    
    Status and Plans for the Research Networking Technical WG
  - 29
    
    perfSONAR Global Monitoring and Analytics Framework Update
    
    WLCG relies on the network as a critical part of its infrastructure and therefore needs to guarantee effective network usage and prompt detection and resolution of any network issues, including connection failures, congestion and traffic routing. The IRIS-HEP/OSG-LHC Networking Area is a partner of the WLCG effort and is focused on being the primary source of networking information for its partners and constituents. We will report on the changes and updates that have occurred since the last HEPiX meeting.
    
    We will cover the status of, and plans for, the evolution of the WLCG/OSG perfSONAR infrastructure, as well as the new, associated applications that analyze and alert upon the metrics that are being gathered.
    
    Speaker: Shawn Mc Kee (University of Michigan (US))
    
    perfSONAR Global Monitoring and Analysis Framework Update
    
    perfSONAR Global Monitoring and Analytics Framework Update HEPiX Spring 2023.pdf
- 13:00
  
  Lunch break 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
- Show Us Your ToolBox 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Convener: Peter van der Reest
- Networking & Security 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  - 30
    
    Computer Security Landscape Update
    
    This presentation provides an update on the global security landscape since the last HEPiX meeting. It describes the main vectors of risks and compromises in the academic community including lessons learnt, presents interesting recent attacks while providing recommendations on how to best protect ourselves.
    
    Speaker: Liviu Valsan (CERN)
    
    2023-03__Computer_Security_Update__CERN.pdf
- 18:30
  
  Gala dinner
Thursday 30 March
- 09:00
  
  Registration 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
- Storage and Filesystems 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: Ofer Rind (Brookhaven National Laboratory), Peter van der Reest
  - 31
    
    Migrating DPM to the federated NDGF-T1 dCache at the UNIBE-LHEP ATLAS Tier-2
    
    Following the DPM EOL announcement, we have considered options for transitioning to a supported Grid storage solution. Our choice has been to integrate our site storage in Bern with the NDGF-T1 distributed dCache environment. In this presentation we outline the option scenario, motivate our choice, and give a summary of the technical implementation as experienced from the remote site point of view.
    
    Speaker: Francesco Giovanni Sciacca (Universitaet Bern (CH))
    
    Bern-NDGF-dCache-HEPiXspring2023.pdf
  - 32
    
    Operating a federated dCache system
    
    Over the years, we have built a federated dCache over multiple sites. As of late, we have integrated a couple of new sites, as well as improved our automation and monitoring. This talk will focus on the current state of dCache deployment, administration, and monitoring at NDGF-T1.
    
    Speaker: Erik Mattias Wadenstein (University of Umeå (SE))
    
    20230330-DistributedOps.pdf
  - 33
    
    Status of CERN Tape Archive operations during Run3
    
    The CERN Tape Archive (CTA) is CERN’s physics data archival storage solution for Run-3. Since 2020, CTA has been progressively ramping up to serve all the LHC and non-LHC tape workflows which were previously handled by CASTOR. 2022 marked a very successful initial Run-3 data taking campaign on CTA, reaching the nominal throughput of 10 GB/s per experiment and setting new monthly records of archived data volume.
    
    In this presentation, we review key production service lessons learnt during the beginning of Run-3. The transition of Tier-0 tape from a Hierarchical Storage Model (HSM) to a pure tape endpoint has enabled multiple optimisations, initially targetting low-latency archival write performance of DAQ data. We will discuss how the redesign of tape workflows has led to gains in terms of performance and automation, and present several ongoing activities which improve production feedback to the experiments. Finally, we will present upcoming challenges for CTA, as tape storage is becoming “warmer” in all workflows.
    
    Speaker: Julien Leduc (CERN)
    
    230330_HEPIX_Status_of_CERN_Tape_Archive_operations_during_Run3.pdf
- 11:15
  
  Coffee break 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
- Storage and Filesystems 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: Ofer Rind (Brookhaven National Laboratory), Peter van der Reest
  - 34
    
    CERN Storage Services Status and Outlook for 2023
    
    CERN Storage Services are key enablers for CERN IT workflows: from scientific collaboration and analysis on CERNBox, to LHC data taking with CTA and EOS, to fundamental storage for cloud-native and virtualized applications.
    
    On the physics storage, 2022 was marked by the absence of a Heavy Ion run. The energy crisis resulted in a shorter LHC run than anticipated and the impact will continue to be felt during 2023. Consequently, the Proton run will target a higher integrated luminosity for most experiments and there is a likelihood of an extended Heavy Ion run. We will review the impact of the physics planning changes on the storage infrastructure and how CERN storage is preparing for the resulting increased data rates in 2023. Notably, ALICE storage workflows through EOS ALICE O2 and EOSCTA ALICE.
    
    We will review progress in HTTP protocol activities for T0 transfers as well as synergies on transverse functionalities, such as monitoring.
    
    Physics storage successfully migrated to EOS5 during 2022, 2023 is the migration year for all the other services running on EOS software: notably EOSCTA and CERNBox infrastructures. We summarise plans and evolution for CERNBox, building on the new refreshed platform reported at the last meeting.
    
    CERN maintains a large Ceph installation and we report on recent evolutions, including investments in delivering a high(er) available service and in hardening upstream features for backups and recovery.
    
    Last but not least, we will report on the storage group migration strategy out of CentOS7 which will be implemented in 2023.
    
    Speaker: Julien Leduc (CERN)
    
    230330_CERN_storage_service_outlook_for_2023.pdf
  - 35
    
    Archive Storage for Project sPhenix
    
    How we make use of tape systems to help big data processing for Project sPhenix.
    Project sPhenix has a projected data volume of 650PB through 2026. Using tape system smartly will provide us a fast and reliable storage and lower the storage cost.
    
    Speaker: Mr tchou@bnl.gov Chou (Brookhaven National Lab)
    
    Hepix2023sPhenix.pdf
- 12:35
  
  Lunch break 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
- Computing and Batch Services 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: Michel Jouvin (Université Paris-Saclay (FR)), Dr Michele Michelotto (Universita e INFN, Padova (IT))
  - 36
    
    HEPscore: a new benchmark for WLCG compute resources
    
    The HEPiX CPU Benchmark Working Group has developed a new CPU benchmark, called HEPScore, based on HEP applications. HEPScore will replace the current HEPSpec06 benchmark that is currently used by the WLCG for accounting and resource pledges.The new benchmark will be based on contributions by many WLCG experiments and will be able to run on x86 and ARM processor systems. We present the results that led to the current candidate for the HEPScore benchmark, which is expected to be released for production use in April 2023. We will briefly describe the transition plan for migrating from the current HEPSpec06 benchmark to HEPScore in 2023 and 2024. In addition, the current interest in reducing electricity consumption and minimizing the carbon footprint of HEP computing, focused the community on producing workloads that can run on ARM processors. We highlight some of the early results of the studies and the effort by the Working Group to include power utilization information into the summary information.
    
    Speaker: Randall Sobie (University of Victoria (CA))
    
    Sobie-Taipei-Bmk.pdf
  - 37
    
    Job Accounting for HTCondor with Heterogeneous Systems
    
    While ArcCE and other systems allow for a single HEPSpec06 value in their configuration that is used when reporting the accounting information to APEL/EGI, sites usually have more than one kind of systems. In such heterogeneous systems an average needs to be used which can't reflect the real CPU usage of jobs especially when running jobs for different VOs and with different run times. In such case it would be better to use a HEPSPEC06 value per job reflecting the real system where a job run.
    In this presentation we will show an easy solution that stores a HEPSPEC06 value together with the job information in HTCondor's job history and uses the condor job history to report to Apel. While there may be other solutions out there and developed over the last years, we think this may still be useful for other sites too. This solution may also be interesting once the new HEPScore benchmark will be used for accounting where one can run parts of it within a short time for example at boot time of a VM.
    
    Speaker: Dr Marcus Ebert (University of Victoria)
    
    HEPiX-032023-Condor.pdf
  - 38
    
    Cloudscheduler V.2 from an HTCondor admin's point of view
    
    We developed a new version of Cloudscheduler on which we reported before from a technical point of view. Cloudscheduler is a system that manages VM resources on demand on local and remote compute clouds depending on job requirements and makes those VMs available to HTCondor pools. Via cloud-init and yaml files, VMs can be provisioned depending on the needs of a VO.
    In this presentation, we will focus on our experience with the new Cloudscheduler running HTCondor jobs for HEP Grid jobs (Atlas and Belle-II), Astronomy (Dune) and HEP non-Grid jobs (BaBar) in a distributed HTCondor environment from a user and administrator point of view. We will show how it integrates with an existing HTCondor system and how it can be used to extend an existing pool with cloud resources when needed, for example in times of high demand or during down times of bare metal worker nodes, and how the system usage is monitored.
    
    Speaker: Dr Marcus Ebert (University of Victoria)
    
    HEPiX-032023-CSV2.pdf
- 15:20
  
  Coffee break 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
- Computing and Batch Services 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  
  Conveners: Michel Jouvin (Université Paris-Saclay (FR)), Dr Michele Michelotto (Universita e INFN, Padova (IT))
  - 39
    
    Towards energy efficient compute clusters at DESY
    
    Environmental and political constraints have made energy usage a top priority. As scientific computing draws significant power, sites have to adapt to the changing conditions and need to optimize their clusters utilization and energy consumption.
    We present our current status in our endeavour to make DESY's compute clusters more energy efficient. With a broad mix of different compute users with various use cases as well as overall power consumers on site, DESY faces a number of different power usage profiles but also the opportunity to integrate these in a more power efficient holistic approach. For example, age dependent load shedding of worker nodes combined with opportunistic utilization of unallocated compute resources will allow for an breathing adaptation to the dynamic green energy from off-shore wind energy farms.
    
    Speakers: Andreas Haupt (DESY), Christoph Beyer, Kai Leffhalm (Deutsches Elektronen-Synchrotron (DE)), Krunoslav Sever (Deutsches Elektronen-Synchrotron DESY), Thomas Hartmann (Deutsches Elektronen-Synchrotron (DE)), Yves Kemp
    
    HEPIX202303_DESYEnergy.pdf
    
    Towards more energy efficient compute clusters
  - 40
    
    Expand local cluster to the worker node of the remote site
    
    The Large High Altitude Air Shower Observatory (LHAASO) is a large-scale astrophysics experiment led by China. The whole experiment data is stored at the Institute of High Energy Physics(IHEP) local EOS file system and processed by the IHEP local HTCondor cluster. Since the experiment data has been increased rapidly, the CPU cores of the local cluster are not enough to support the data processing.
    As the LHAASO experimental cooperation groups’ resources are located geographically and most of them have the characteristics of limited scale, low stability, and lack of human support, it is difficult to integrate them via Grid. We designed and developed a system to expand LHAASO local cluster to the remote site. The system keeps the IHEP cluster as the main cluster and extends the cluster to the worker nodes of the remote site based on ” HTCondor startd automatic cluster joining”. LHAASO jobs are submitted to the IHEP cluster and are dispatched to the remote worker node in the system. We classified LHAASO job into several types and wrapped by the dedicated script which make the job have no direct access to IHEP local file system. User’s token is wrapped and transferred with the job to the remote worker nodes.About 125 worker nodes with 4k CPU cores at the remote site have been joined into IHEP LHAASO cluster till now and have produced 700TB simulation data in 6 months.
    
    Speaker: 石京燕 shijy
    
    expand_cluster.pdf
- Miscellaneous 1F Conference Room
  
  1F Conference Room
  
  Research Center for Environmental Changes (RCEC), Academia Sinica
  
  128 Academia Road, Section 2 Nankang, Taipei 11529 Taiwan 25°2′45″N 121°36′37″E
  - 41
    
    Closing Remarks
    
    Speakers: Peter van der Reest, Tomoaki Nakamura
    
    2023-03-30_HEPiX_WrapUp.pdf

Choose timezone

HEPiX Spring 2023 Workshop

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

HEPiX Spring 2023 at ASGC, Taipei, Taiwan

Co-located events

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica

1F Conference Room

Research Center for Environmental Changes (RCEC), Academia Sinica