Conveners
Facilities and Networks: Tue PM
- David Bouvet (IN2P3/CNRS (FR))
- Shawn McKee (University of Michigan (US))
Facilities and Networks: Wed AM
- Daniela Bauer (Imperial College (GB))
- David Bouvet (IN2P3/CNRS (FR))
Facilities and Networks: Wed PM
- David Crooks (UKRI STFC)
- Alessandra Forti (University of Manchester (GB))
Facilities and Networks: Thu PM
- Edoardo Martelli (CERN)
- David Crooks (UKRI STFC)
This paper evaluates the real-time distribution of data over Ethernet for the upgraded LHCb data acquisition cluster at CERN. The total estimated throughput of the system is 32 Terabits per second. After the events are assembled, they must be distributed to the filtering farm of the online trigger for further data selection. High-throughput and very low-overhead transmissions will be an...
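As a rough illustration of the scale such a system implies, one can estimate the number of Ethernet links needed for the quoted aggregate rate; in the sketch below only the 32 Tb/s figure comes from the abstract, while the 100 Gb/s link speed and 80% usable-efficiency value are illustrative assumptions.

    # Back-of-the-envelope link count for the quoted aggregate throughput.
    aggregate_tbps = 32.0    # total event-builder output from the abstract, Tb/s
    link_gbps = 100.0        # assumed per-link Ethernet speed, Gb/s
    efficiency = 0.8         # assumed fraction of line rate usable for payload

    links_needed = aggregate_tbps * 1000 / (link_gbps * efficiency)
    print(f"~{links_needed:.0f} x {link_gbps:.0f} GbE links needed")   # ~400 links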
The High Luminosity Large Hadron Collider presents a data challenge. The amount of data recorded by the experiments and transported to hundreds of sites will see a thirty-fold increase in annual data volume. A systematic approach to comparing the performance of different Third Party Copy (TPC) transfer protocols is therefore needed. Two contenders, XRootD-HTTPS and GridFTP, are evaluated in their...
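For context, a single HTTPS third-party copy can be driven from Python with the gfal2 bindings roughly as sketched below; the endpoints and paths are placeholders, and this is just one possible client, not necessarily the setup evaluated in the paper. A real transfer also needs valid credentials (an X.509 proxy or bearer token) in the environment.

    # Minimal third-party-copy sketch using the gfal2 Python bindings.
    import gfal2

    ctx = gfal2.creat_context()
    params = ctx.transfer_parameters()
    params.overwrite = True          # replace any existing destination file
    params.timeout = 3600            # seconds

    src = "davs://source-se.example.org:443/store/file.root"   # placeholder
    dst = "davs://dest-se.example.org:443/store/file.root"     # placeholder
    ctx.filecopy(params, src, dst)   # server-to-server copy; data never touches the client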
Network traffic optimisation is difficult, as the load is by nature dynamic and random. However, the increased usage of file transfer services may help detect future loads and predict their expected duration. The NOTED project seeks to do exactly this, dynamically adapting the network topology to deliver improved bandwidth to users of such services. This article...
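As a toy illustration of the kind of load estimate such a tool could act on (not the NOTED algorithm itself; every number below is invented):

    # Estimate how long a burst of queued transfers will keep a link busy.
    queued_transfers = [              # (bytes remaining, per-transfer rate in bytes/s)
        (2_000_000_000_000, 1.0e9),
        (500_000_000_000, 0.5e9),
    ]
    link_capacity_bps = 100e9 / 8     # assumed 100 Gb/s link, in bytes/s

    total_bytes = sum(b for b, _ in queued_transfers)
    aggregate_rate = min(link_capacity_bps, sum(r for _, r in queued_transfers))
    expected_seconds = total_bytes / aggregate_rate
    print(f"expected busy period: ~{expected_seconds / 60:.0f} min")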
Infrastructures supporting distributed scientific collaborations must address competing goals: providing high-performance access to resources while simultaneously securing the infrastructure against security threats. The NetBASILISK project is attempting to improve the security of such infrastructures while not adversely impacting their performance. This paper will present our work to...
The SARS-CoV-2 virus, the cause of the better-known COVID-19 disease, has greatly altered our personal and professional lives. Many people are now expected to work from home, but this is not always possible and, in such cases, it is the responsibility of the employer to implement protective measures. One simple such measure is to require that people maintain a distance of 2 metres, but this...
To optimise the performance of distributed computing, smaller lightweight storage caches are needed that integrate with existing grid computing workflows. A good solution for providing lightweight storage caches is to use an XRootD proxy cache. To support distributed lightweight XRootD proxy services across GridPP, we have developed a centralised monitoring framework.
With the v5 release of...
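A minimal availability probe over a set of proxy endpoints gives a flavour of the checks a centralised monitoring framework might run; the hostnames below are placeholders, and real monitoring would also exercise actual cache reads rather than just TCP reachability.

    # Check that each XRootD proxy endpoint accepts TCP connections on its xrootd port.
    import socket

    ENDPOINTS = [("xcache01.example.ac.uk", 1094), ("xcache02.example.ac.uk", 1094)]

    def is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    for host, port in ENDPOINTS:
        status = "up" if is_reachable(host, port) else "DOWN"
        print(f"{host}:{port} {status}")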
One of the biggest challenges in the High-Luminosity LHC (HL-LHC) era will be the significantly increased data size to be recorded and analyzed from the collisions at the ATLAS and CMS experiments. ServiceX is a software R&D project in the area of Data Organization, Management and Access within IRIS-HEP to investigate new computational models for the HL-LHC era. ServiceX is an...
Anomaly detection in the CERN OpenStack cloud is a challenging task due to the large scale of the computing infrastructure and, consequently, the large volume of monitoring data to analyse. The current solution to spot anomalous servers in the cloud infrastructure relies on a threshold-based alarming system carefully set by the system managers on the performance metrics of each...
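A per-metric outlier flag of the kind such a threshold-based approach could be compared against might look as follows; the metric values and the z-score threshold below are invented for illustration and are not the production settings.

    # Flag servers whose CPU load deviates strongly from the fleet average.
    import statistics

    cpu_load = {                       # invented per-node samples
        "node01": 0.41, "node02": 0.44, "node03": 0.40, "node04": 0.43,
        "node05": 0.42, "node06": 0.45, "node07": 0.39, "node08": 0.95,
    }

    mean = statistics.mean(cpu_load.values())
    stdev = statistics.stdev(cpu_load.values())

    for node, value in cpu_load.items():
        z = (value - mean) / stdev
        if abs(z) > 2.0:               # illustrative threshold only
            print(f"{node}: load {value} looks anomalous (z = {z:.1f})")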
The CMS experiment at CERN employs a distributed computing infrastructure to satisfy its data processing and simulation needs. The CMS Submission Infrastructure team manages a dynamic HTCondor pool, aggregating mainly Grid clusters worldwide, but also HPC, Cloud and opportunistic resources. This CMS Global Pool, which currently involves over 70 computing sites worldwide and peaks at 300k CPU...
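Pool occupancy of this kind can be queried through the HTCondor Python bindings roughly as sketched below; the collector hostname is a placeholder, and the real accounting performed by the Submission Infrastructure team is considerably more detailed.

    # Count total and unclaimed CPU cores advertised to an HTCondor collector.
    import htcondor

    collector = htcondor.Collector("collector.example.cern.ch")   # placeholder host
    slots = collector.query(htcondor.AdTypes.Startd)

    total_cores = sum(int(ad.get("Cpus", 0)) for ad in slots)
    idle_cores = sum(int(ad.get("Cpus", 0)) for ad in slots
                     if ad.get("State") == "Unclaimed")
    print(f"pool size: {total_cores} cores, {idle_cores} currently unclaimed")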
As more and more large-scale scientific facilities are built, HPC requirements at IHEP continue to grow. RDMA is a technology that allows servers in a network to exchange data in main memory without involving the processor, cache or operating system of either server, which provides high bandwidth and low latency. There are two RDMA technologies: InfiniBand and a relative...
The distributed computing of the ATLAS experiment at the LHC has been using computing resources of the Czech national HPC center IT4Innovations for several years. The submission system is based on ARC-CEs installed at the Czech LHC Tier-2 site (praguelcg2). Recent improvements of this system will be discussed here. First, the ARC-CE was migrated from version 5 to 6, which improves the...
HPC resources will help meet the future challenges of the HL-LHC in terms of CPU requirements. The Spanish HPC centers have been used recently by implementing all the necessary edge services to integrate these resources into the LHC experiments' workflow management systems. Since it is not always possible to install the edge services on HPC premises, we opted to set up a dedicated ARC-CE and interact...
The Large Hadron Collider (LHC) will enter a new phase beginning in 2027 with the upgrade to the High Luminosity LHC (HL-LHC). The increase in the number of simultaneous collisions coupled with a more complex structure of a single event will result in each LHC experiment collecting, storing, and processing exabytes of data per year. The amount of generated and/or collected data greatly...
Computational science, data management and analysis have been key factors in the success of Brookhaven National Laboratory's scientific programs at the Relativistic Heavy Ion Collider (RHIC), the National Synchrotron Light Source (NSLS-II), the Center for Functional Nanomaterials (CFN), and in biological, atmospheric, and energy systems science, Lattice Quantum Chromodynamics (LQCD) and...
The Rutherford Appleton Laboratory (RAL) runs the UK Tier-1, which supports all four LHC experiments as well as a growing number of others in HEP, Astronomy and Space Science. In September 2020, RAL was provided with funds to upgrade its network. The Tier-1 not only wants to meet the demands of LHC Run 3 but also to ensure that it can take an active role in data lake development and...
The processing needs for the High Luminosity (HL) upgrade for the LHC require the CMS collaboration to harness the computational power available on non-CMS resources, such as High-Performance Computing centers (HPCs). These sites often limit the external network connectivity of their computational nodes. In this paper we describe a strategy in which all network connections of CMS jobs inside a...
CMS is tackling the exploitation of CPU resources at HPC centers where compute nodes do not have network connectivity to the Internet. Pilot agents and payload jobs need to interact with external services from the compute nodes: access to the application software (CVMFS) and conditions data (Frontier), management of input and output data files (data management services), and job management...
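Both of the strategies described above ultimately funnel the jobs' outbound requests through a gateway reachable from the compute nodes; a job wrapper might express that idea roughly as below, where the proxy host, port and payload script are placeholders rather than the experiments' actual configuration.

    # Point HTTP-based clients of a payload job at a gateway on a connected edge node.
    import os
    import subprocess

    GATEWAY = "http://edge-node.example.org:3128"   # assumed forward proxy, placeholder

    env = os.environ.copy()
    env["http_proxy"] = GATEWAY      # honoured by most HTTP-based clients
    env["https_proxy"] = GATEWAY
    # CVMFS and experiment-specific services would be pointed at the same gateway
    # through their own configuration files (not shown here).

    subprocess.run(["./run_payload.sh"], env=env, check=True)   # placeholder payload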
Since 2017, the Worldwide LHC Computing Grid (WLCG) has been working towards enabling token-based authentication and authorisation throughout its entire middleware stack. Following the publication of the WLCG v1.0 Token Schema in 2019, middleware developers have been able to enhance their services to consume and validate OAuth2.0 tokens and process the authorization information they convey....
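For illustration, the kind of claims the WLCG profile defines can be exercised with PyJWT as sketched below; the issuer, subject and HS256 demo secret are placeholders, and production tokens are signed by the issuer and must have their signatures verified against the issuer's published keys.

    # Encode and decode a toy token carrying WLCG-profile-style claims.
    import jwt  # PyJWT

    claims_in = {
        "iss": "https://iam.example.org/",        # placeholder issuer
        "sub": "example-user",                    # placeholder subject identifier
        "wlcg.ver": "1.0",                        # WLCG profile version claim
        "scope": "storage.read:/ compute.read",   # capability-style scopes
    }
    token = jwt.encode(claims_in, "demo-secret", algorithm="HS256")

    claims = jwt.decode(token, "demo-secret", algorithms=["HS256"])
    print(claims["wlcg.ver"], claims["scope"].split())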
The WLCG is modernizing its security infrastructure, replacing X.509 client authentication with the newer industry standard of JSON Web Tokens (JWTs) obtained through the OpenID Connect (OIDC) protocol. There is a wide variety of software available using these standards, but most of it is for web browser-based applications and doesn't adapt well to the command line-based software used heavily...
With more and more applications and services deployed in the BNL SDCC relying on authentication services, the adoption of Multi-factor Authentication (MFA) became inevitable. While web applications can be protected by Keycloak (an open source single sign-on solution led by Red Hat) with its MFA feature, other service components within the facility rely on FreeIPA (an open source identity management...