Site news report from IHEP, including the status of computing platform construction, grid, network, storage and so on, since the last workshop report.
This is the PIC report for the HEPiX Autumn 2021 Workshop.
A brief update on what's going on at INFN-T1
News from the lab
Diamond Light Source is a Synchrotron Light Source based at the RAL site. This is a summary of what Diamond has been up to in cloud, storage and compute, as well as a few extras.
CentOS Stream is a great place to develop for what's next in RHEL. But what if I want to have a hybrid infrastructure? How should I think about compatibility between CentOS Stream and released versions of RHEL?
As part of the changing Linux landscape, we now have to support more Linux distributions with higher release cadence than ever before. In order to adapt to these changes, we must automate the entire process to remove the human bottlenecks from the equation.
In this presentation, we will discuss how CERN automates the release of new packages for CentOS Linux 8, CentOS Stream 8 and soon...
Back in 2019, CERN Linux Support had to run tedious manual procedures to maintain CERN’s distro releases: SLC6, CERN CentOS 7, Red Hat 6 and 7. Since then, we have added CentOS 8, CentOS Stream 8, Red Hat 8, and we may be adding other Red Hat rebuilds soon. Given the growing number of supported distros, our team has been increasingly adopting automation and continuous integration in order to...
In December 2020, Red Hat and CentOS announced the early end-of-life of CentOS 8 (scheduled for December 2021), and its replacement with CentOS Stream 8. Unlike CentOS, which is a clone of RHEL, CentOS Stream is a forward-distribution of RHEL, containing package updates before they are released in RHEL, including some that may never be included in it. In this talk, we discuss the history and current status of...
In conjunction with our proposal to CERN and the broader HEPIX community, I wanted to share the basis for the proposal and provide an opportunity for questions to be asked.
In this talk, we report on recent updates to and the status of the KEK central computing system, Grid services, and the international network situation in Japan since the previous HEPiX workshop.
We present an update on the changes at our site since the last report. Advancements, developments, roadblocks and achievements in various areas, including WLCG, Unix, Windows and infrastructure, will be presented.
The Helmholtz-based platform [HIFIS][1] builds and sustains an IT infrastructure connecting all Helmholtz research fields and centres.
The [services][2] provided by HIFIS include a secure and easy-to-use collaborative environment with efficiently accessible IT services from anywhere. HIFIS further supports Research Software Engineering (RSE) with a high level of quality, visibility and...
How can we turn a "chore" into a community of makers, in a scientific Organization that doesn't enforce strict standards? Drupal has a long history at CERN. Site builders of >1k unique websites take advantage of a highly automated, but bespoke infrastructure, that automates website operations: provision, backup, clone, update, delete. Relying on direct support through tickets worked, until...
A summary of the annual "European" HTCondor workshop recently held on-line
We will present an update on our site since the Fall 2019 report, covering our changes in software, tools and operations. In addition, we will cover significant changes that are underway at both the University of Michigan and Michigan State University sites.
We conclude with a summary of what has worked and what problems we encountered and indicate directions for future work.
...
HIFIS: VO Federation for EFSS
Following the first rough ideas on a Virtual Organisation (VO; a Community AAI [1] based group of any size) based Enterprise File Sync&Share (EFSS) Federation [2], which HIFIS [3] presented at the CS3 Conference 2021, we have moved further along and are working on a first implementation. During summer 2021, we clarified...
We present a technical report on the new ATLAS Analysis Facility hosted at the University of Chicago. Designed to support both traditional batch computing and novel analysis frameworks, this facility represents a shift in how future clusters will be deployed, federated, monitored and operated. We will describe the "cloud native" underpinnings of the facility (Kubernetes, GitOps, and Rook), how...
The Benchmarking WG will present its semiannual report on its activity, which for the past few years has focused on delivering a new benchmark for HEP, HEPscore, together with a set of software tools that enable the WLCG community to seamlessly run benchmarks and collect the results (HEP Benchmark Suite) and to maintain the HEP reference applications needed for benchmarking purposes (HEP...
Status and plans of the task force
CNAF Tier-1, composed of almost 1000 worker nodes and nearly 40000 cores, completed its migration to HTCondor more than one year ago. After existing monitoring tools (built with Sensu, Influx and Grafana) were adapted to work with the new batch system, an effort started to collect a richer and more “condor oriented” set of metrics that are used to provide better insights into the pool...
In this contribution we are going to report on the latest results of the R&D activity aiming at preparing the EOS ALICE O2 storage cluster for the extremely demanding requirements of LHC Run 3.
Taking into consideration the latest upgrades of the LHC and of the ALICE detectors, the data throughput from the ALICE Data Acquisition system is expected to increase significantly, reaching 100GB/s...
Given the anticipated increase in the amount of scientific data, it is widely accepted that primarily disk based storage will become prohibitively expensive. Tape based storage, on the other hand, provides a viable and affordable solution for the ever increasing demand for storage space. Coupled with a disk caching layer that temporarily holds a small fraction of the total data volume to allow...
EOS is the open source distributed storage technology developed in the CERN IT
Department and used at the Large Hadron Collider (LHC). EOS has been operated in
production for more than 10 years and it now manages over half an exabyte of
disk storage for both LHC and non-LHC experiments. Since its first deployment in
2010, the software has evolved a lot, catering to the large amounts of data...
While WLCG may be considered at the vanguard of data-intense
scientific research, many other scientific communities are finding
their data storage requirements growing beyond their current
capabilities. Simultaneously, with science increasingly involving
broad collaborations, the ability to support and manage scientists
from different institutes is becoming essential.
At DESY, we have...
This presentation provides an update on the global security landscape since the last HEPiX meeting. It describes the main vectors of risks to and compromises in the academic community including lessons learnt, presents interesting recent attacks while providing recommendations on how to best protect ourselves.
The COVID-19 pandemic has introduced a novel challenge for security teams...
Prompted by a question raised when the subject came up in the site reports: what measures do sites have planned to increase users' security awareness?
To counteract the spread of the COVID-19 virus as much as possible, a device has been developed at CERN, the Proximeter, which sends contact tracing information via an IoT network. Some hurdles had to be overcome in terms of data protection, and compromises had to be made concerning the network and the actual protocol.
This presentation reports on the ongoing migration of archival library and tape technology at The Scientific Data and Computing Center (SDCC). With the planned transition from Oracle libraries to IBM libraries, we deployed our first IBM TS4500 Library with ~20k tape slots in early 2021. This talk discusses our experience with the IBM TS4500, our continued transition to these libraries, as...
CASTOR was used as CERN's primary archival storage system for the last two decades, including Run-1 and Run-2 of the LHC. For Run-3, CASTOR has been replaced by the CERN Tape Archive (CTA). At the end of Run-2, there were 340 Petabytes of data stored in CASTOR, which had to be migrated to CTA during Long Shutdown 2. Over 90% of this data is an active archive — the custodial copy of physics...
The RX protocol inherited from IBM AFS is incapable of filling a network pipe with a single RPC when the pipe's bandwidth delay product exceeds 44 1/4 KB. On a 1 Gbit/sec pipe with a 1ms RTT, the maximum theoretical throughput is 360 Mbit/sec with a maximum window size of 44 KB. The RX ACK packet format provides for a theoretical maximum window of 65535 packets but the Selective...
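The 360 Mbit/sec figure above can be reproduced with a short calculation. The following is a minimal sketch, assuming the usual window-limited model (at most one window of data in flight per round trip) and 1 KB = 1024 bytes; the function and variable names are illustrative and are not part of RX or any AFS implementation.

```python
def window_limited_throughput_bps(window_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on throughput when one window is sent per round trip."""
    return window_bytes * 8 / rtt_seconds


def bandwidth_delay_product_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep the pipe full."""
    return link_bps * rtt_seconds / 8


KB = 1024                 # assumption: binary kilobytes
window = 44 * KB          # maximum window size quoted in the abstract
rtt = 0.001               # 1 ms round-trip time
link = 1e9                # 1 Gbit/sec pipe

# Throughput is capped by whichever is smaller: the window limit or the link.
throughput = min(window_limited_throughput_bps(window, rtt), link)
bdp = bandwidth_delay_product_bytes(link, rtt)

print(f"window-limited throughput: {throughput / 1e6:.1f} Mbit/sec")
print(f"bandwidth-delay product:   {bdp / KB:.1f} KB")
```

Under these assumptions the window limit works out to roughly 360 Mbit/sec, while filling the 1 Gbit/sec pipe would need about 122 KB in flight, well above the 44 KB window.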
The Belle II experiment is a detector coupled to the SuperKEKB electron-positron collider, designed to collect 50 times the data produced by the previous generation of B-factories. Processing and analyzing these high volumes of data in a timely fashion requires an efficient interconnection of software and computing resources. In this talk, the analysis model of the Belle II experiment is...
A retrospective view of the transition from Vidyo to Zoom platform.
Input and Views from the experiments
As Atlassian is changing its product offering, DESY has started to evaluate alternatives to the products in the firm's portfolio.
For repositories and CI/CD in software development, a suitable replacement for Jira Software/Bitbucket/Bamboo was found in GitLab.
This talk will outline the pilot project, the current production system and the offering we make to users to...
In late 2020, CC-IN2P3 decided to look for a service desk ticketing system as an alternative to OTRS.
Last winter we carried out a search for a new candidate and chose Zammad.
We transitioned to Zammad in summer and are now decommissioning OTRS.
This talk will present the features offered by Zammad and the lessons learned from that transition.
Last year we deployed a new automated backup and recovery setup for the Database on Demand Service at CERN. This setup comprises two services:
1. Backup service - Stores encrypted and zipped backups of all DBOD production instances to EOS storage.
2. Restore service - Continuously restores and tests connectivity of snapshots of all production...
The Weblecture service is in charge of capturing, processing and delivering CERN productions: e-learning, Computing Seminars, Conferences, Outreach events, CERN-related communication (e.g. DG, HR, Staff Association), etc. It is tightly linked to the Webcast and Videoconference services, from which most AV content is provided. The service delivers AV content in different formats so our community can watch...
As the scale and complexity of the current HEP network grows rapidly, new technologies and platforms are being introduced that greatly extend the capabilities of today’s networks. With many of these technologies becoming available, it’s important to understand how we can design, test and develop systems that could enter existing production workflows while at the same time changing something as...
WLCG relies on the network as a critical part of its infrastructure and therefore needs to guarantee effective network usage and prompt detection and resolution of any network issues, including connection failures, congestion and traffic routing.
The OSG Networking Area is a partner of the WLCG effort and is focused on being the primary source of networking information for its partners and...
Increasing use of cloud resources, and other developments in new workflows, have posed an important question on which certificate providers are most appropriate for different use cases. Certificate authorities under discussion include Let’s Encrypt but also the CAs of commercial cloud providers. We discuss the posing of this question along with the key stakeholders, including sites,...
During this year the HEPiX IPv6 working group has continued to encourage the deployment of dual-stack IPv4/IPv6 services. We also recommend dual-stack clients (worker nodes etc). Many data transfers are happening today over IPv6. This talk will present our recent work including the ongoing planning for moving to an IPv6-only core WLCG.
The threat faced by the research and education sector from determined and well-resourced attackers has been growing in recent years and is now acute. We must act together as a community to defend against these attacks. A vital means of achieving this is to share threat intelligence - key indicators of compromise of an ongoing incident including network locations and file hashes - with trusted...
DESY's current user registry is about to retire. During its
use, boundary conditions have shifted, the user groups have
become more heterogeneous and, in addition to collaborations
with stable memberships, the photon science community has brought
a higher rate of fluctuation of IT users. The successor
incorporates those changes and delivers functionality based
on Oracle's database (database,...