The KEK Central Computer System (KEKCC) is a computer service and facility that provides large-scale computer resources, including Grid and Cloud computing systems and common IT services, such as e-mail and web services.
Following the Japanese government's procurement policy for large-scale computer systems, we replace the entire KEKCC every four or sometimes five years...
News from CERN since the previous HEPiX workshop.
An update on developments and some plans at the RAL Tier-1.
This is the PIC report for the HEPiX Spring 2021 Workshop.
Daisy (Data Analysis Integrated Software System) has been designed for the analysis and visualization of X-ray experiments. To address the Chinese radiation facilities community's extensive range of requirements, from purely algorithmic problems to scientific computing infrastructure, Daisy sets up a cloud-native platform to support on-site data analysis services with fast feedback and...
Collaboration features are nowadays a key aspect of efficient teamwork with productivity tools. During 2020, CERN deployed the OnlyOffice and Collabora Online solutions and monitored their usage in CERNBox.
This presentation will focus on technical aspects of deploying and maintaining OnlyOffice and Collabora Online within CERN and their integration with CERNBox. It will also give an...
Over the last decades we focused our MS Windows management policy mainly on hardening machines: we wanted to control and manage how and when security updates were deployed, and how software could be installed, licensed and monitored on a machine. But times have changed, IT has evolved, and users can now be empowered and regain their freedom. Let's see together which solutions we put in place to...
A short presentation on what's going on at the INFN-T1 site.
An update on BNL activities since the Fall 2020 workshop
Diamond Light Source is a Synchrotron Light Source based at the RAL site. This is a summary of what Diamond has been up to in cloud, storage and compute, as well as a few extras.
News and updates from the Canadian ATLAS Tier-1 center over the past years. The presentation will cover the site configuration and tools used, how we operate a 'federated' Tier-1 center, and how we improve CPU utilization.
CERN has historically used Red Hat-derived Linux distributions, favored for their relative stability and long life cycle. In December 2020, the CentOS board announced that the end of life for CentOS Linux 8 would be changed from a 10-year life cycle to 2 years.
This talk focuses on what CERN will be doing in the short-term to adapt to this announcement, and what the Linux future could look like...
A "just the facts" look at the products and programs Red Hat offers, followed by a Question and Answer session.
In the past six months, Red Hat has made some dramatic announcements. We are aware that these announcements affect how the High Energy Physics community does computing. We want you, HEPiX, to make the best-informed decisions as you decide your next steps forward. This...
This talk is an update on CERN's project to build a new e-mail service at CERN, focused on Free and Open Source Software, and to migrate all of its users. As presented at HEPiX Autumn 2019, CERN has been working on migrating off Microsoft Exchange since spring 2018. However, in early spring 2020, in the middle of the migration...
The presentation discusses the change of the DHCP software used for the CERN central DHCP service, namely the migration from ISC DHCP to Kea. It outlines the motivation behind the replacement of ISC DHCP and describes the main steps of the transition process. It covers the translation of the current CERN ISC DHCP configuration, testing the new Kea configuration, and the implementation of the...
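To make the translation step concrete, here is a minimal, hypothetical sketch (not CERN's actual tooling) of converting one ISC DHCP host reservation into the JSON reservation format that Kea expects; the host name and addresses are invented:

    # Hypothetical sketch: translate an ISC DHCP 'host' block into a Kea
    # DHCPv4 reservation. In a real Kea config this dict would sit under
    # Dhcp4 -> subnet4 -> reservations.
    import json
    import re

    ISC_HOST_RE = re.compile(
        r"host\s+(?P<name>\S+)\s*{\s*"
        r"hardware\s+ethernet\s+(?P<mac>[0-9a-fA-F:]+);\s*"
        r"fixed-address\s+(?P<ip>[0-9.]+);\s*}")

    def isc_host_to_kea(isc_block: str) -> dict:
        """Convert one ISC 'host' block into a Kea v4 reservation dict."""
        m = ISC_HOST_RE.search(isc_block)
        if m is None:
            raise ValueError("unrecognised ISC host block")
        return {"hostname": m.group("name"),
                "hw-address": m.group("mac").lower(),
                "ip-address": m.group("ip")}

    isc = 'host pcfoo01 { hardware ethernet AA:BB:CC:DD:EE:FF; fixed-address 10.0.0.42; }'
    print(json.dumps(isc_host_to_kea(isc), indent=2))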
This presentation provides an update on the global security landscape since the last HEPiX meeting. It describes the main vectors of risks to and compromises in the academic community including lessons learnt, presents interesting recent attacks while providing recommendations on how to best protect ourselves. It also covers security risks management in general, as well as the security aspects...
The transition of WLCG storage and central services to dual-stack IPv4/IPv6 has gone well, thus enabling the use of IPv6-only CPU resources as mandated by the WLCG Management Board. Many WLCG data transfers now take place over IPv6. The dual-stack deployment does however result in a networking environment which is much more complex than when using just IPv4 or just IPv6. During recent months...
JupyterLab has become an increasingly popular platform for rapid prototyping, teaching algorithms or sharing small analyses in a self-documenting manner.
However, it is commonly operated using dedicated cloud-like infrastructures (e.g. Kubernetes), which often need to be maintained in addition to existing HTC systems. Furthermore, federation of resources or opportunistic usage are not...
Exploitation of heterogeneous opportunistic resources is an important ingredient to fulfil the computing requirements of large HEP experiments in the future. Potential candidates for integration are Tier 3 centres, idling cores in HPC centres, cloud resources, etc. To make this work, it is essential to choose a technology which offers an easy integration of those resources into the computing...
In March 2020, INFN-T1 started the process of moving all the Worker Nodes managed by LSF to the HTCondor batch system, which had been set up and tested in the previous months and was considered ready to handle the workload of the whole computing cluster. On March 20, while in the middle of the migration process, a sudden request came to provide 50% of our computing power for a period of one month...
Since 2017, the Worldwide LHC Computing Grid (WLCG) has been working towards enabling Token based authentication and authorisation throughout its entire middleware stack. Following the publication of the WLCGv1.0 Token Schema in 2019, middleware developers have been able to enhance their services to consume and validate OAuth2.0 tokens and process the authorization information they...
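As a rough illustration of what consuming such tokens involves, here is a minimal Python sketch using PyJWT with an RSA-signed token; the audience string follows the published WLCG token schema, but the key, scope and function names are illustrative assumptions:

    # Illustrative sketch of WLCG-style OAuth2.0 token validation with PyJWT.
    import jwt  # pip install PyJWT

    def authorise(token: str, issuer_public_key: str, required_scope: str) -> bool:
        """Verify signature, expiry and audience, then check the scope claim."""
        claims = jwt.decode(
            token,
            issuer_public_key,
            algorithms=["RS256"],
            audience="https://wlcg.cern.ch/jwt/v1/any",  # per the WLCG token schema
        )
        # Scopes are a space-separated list, e.g. "storage.read:/ storage.modify:/data"
        return required_scope in claims.get("scope", "").split()

    # e.g. authorise(bearer_token, issuer_key_pem, "storage.read:/")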
WLCG relies on the network as a critical part of its infrastructure and therefore needs to guarantee effective network usage and prompt detection and resolution of any network issues, including connection failures, congestion and traffic routing. The OSG Networking Area is a partner of the WLCG effort and is focused on being the primary source of networking information for its partners and...
As the scale and complexity of the current HEP network grows rapidly, new technologies and platforms are being introduced that greatly extend the capabilities of today’s networks. With many of these technologies becoming available, it’s important to understand how we can design, test and develop systems that could enter existing production workflows while at the same time changing something as...
The Trusted CI Framework provides a structure for organizations to establish, improve, and evaluate their cybersecurity programs. The framework empowers organizations to confront their cybersecurity challenges from a mission-oriented, programmatic, and full organizational lifecycle perspective.
The Trusted CI Framework is structured around 4 Pillars that support a cybersecurity program:...
Following up on the work of the HEPiX Benchmarking Working Group, WLCG launched a task force primarily tasked with concretely proposing a successor to HEP-SPEC06 as the standard benchmark for CPU resources in WLCG. We will present an overview of the mandate and composition of the task force and will report on status and plans.
For the past two years, the HEPiX Benchmarking Working Group has been developing HEPscore, a benchmark based on actual software workloads of the High Energy Physics community. This approach, based on container technologies, is designed to provide a benchmark that is better correlated with the actual throughput of the experiments' production workloads. In addition, the procedures to run and collect...
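The exact scoring recipe is defined by the working group; purely as an illustration, a composite benchmark of this kind can combine per-workload scores with a geometric mean, as in this sketch (workload names and numbers invented):

    # Illustrative only: one plausible way to fold per-workload container
    # scores into a single figure (HEPscore's exact recipe may differ).
    from math import prod

    def composite_score(workload_scores: dict) -> float:
        """Geometric mean of the individual container workload scores."""
        scores = list(workload_scores.values())
        return prod(scores) ** (1.0 / len(scores))

    print(composite_score({"atlas-sim": 12.1, "cms-reco": 9.8, "lhcb-gen": 14.3}))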
BNL's first institutional cluster is reaching its end of life, and BNL has started the process of replacing its capabilities with new resources. This presentation reviews the historical usage of existing resources and describes the replacement process, including timelines, composition, and plans for expanding the user community that will use the new resources.
According to the estimated data rates, we predict that 24 PB of raw experimental data will be produced per month from 14 beamlines at the first stage of the High Energy Photon Source (HEPS), and the volume of experimental data will be even greater with the completion of over 90 beamlines at the second stage in the future. To make sure that the huge amount of data collected at HEPS is accurate, available and...
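A back-of-the-envelope check of what the quoted volume implies as a sustained rate (the 30-day month and decimal petabytes are our assumptions, not HEPS figures):

    # Rough sanity check of the quoted 24 PB/month from 14 beamlines.
    PB = 1e15                                  # bytes, decimal petabyte
    monthly_volume = 24 * PB
    seconds_per_month = 30 * 24 * 3600         # assume a 30-day month
    aggregate_rate = monthly_volume / seconds_per_month   # ~9.3 GB/s in total
    per_beamline = aggregate_rate / 14                    # ~0.66 GB/s each
    print(f"{aggregate_rate/1e9:.1f} GB/s total, {per_beamline/1e9:.2f} GB/s per beamline")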
The CERN Tape Archive (CTA) is the tape back-end to EOS and the replacement for CASTOR as the Run 3 physics archival system.
The EOSCTA service entered production at CERN during summer 2020, and since then the four biggest LHC experiments have been migrated.
This talk will outline the challenges and the experience we accumulated during the CTA service production ramp-up, as well as an updated overview of the...
In recent years, containers became the de-facto standard to package and distribute modern applications and their dependencies. A crucial role in the container ecosystem is played by container registries (specialized repositories meant to store and distribute container images) which have seen an ever-increasing need for additional storage and network capacity to withstand the demand from users....
The Rutherford Appleton Laboratory runs three production Ceph clusters providing: Object Storage to the LHC experiments and many others; RBD storage underpinning the STFC OpenStack Cloud; and CephFS for local users of the ISIS neutron source. The requirements and hardware for these clusters are very different, yet they are underpinned by the same storage technology. This talk will cover the status...
Procuring new IT equipment for the CERN data centre requires optimizing the computing power and storage capacity while minimizing the costs. In order to achieve this, understanding how the existing hardware resources are used in production is key. To that end, leveraging traditional monitoring data seems to be the way to go.
This presentation will explain how we extract...
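As a purely hypothetical sketch of the kind of aggregation such an extraction involves, raw monitoring records can be reduced to a per-hardware-model utilisation summary; the field names here are invented:

    # Hypothetical sketch: summarise monitoring records into mean CPU
    # utilisation per hardware model (record layout invented for illustration).
    from collections import defaultdict

    def mean_utilisation(records: list) -> dict:
        """records: [{'model': 'XYZ', 'cpu_util': 0.73}, ...] -> model -> mean."""
        sums, counts = defaultdict(float), defaultdict(int)
        for rec in records:
            sums[rec["model"]] += rec["cpu_util"]
            counts[rec["model"]] += 1
        return {m: sums[m] / counts[m] for m in sums}

    print(mean_utilisation([{"model": "A", "cpu_util": 0.7},
                            {"model": "A", "cpu_util": 0.9}]))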
STFC's Scientific Computing Department, based at RAL, runs an ever-increasing number of services to support the High Energy Physics, Astronomy and Space Science communities. RAL's monitoring and operations services were already struggling to scale to meet these demands, and the global pandemic highlighted the importance of these systems as home working was enforced. This talk will cover the...
CERN IT-ST-TAB section will outline the tape infrastructure hardware plans for the upcoming LHC Run 3 period. This presentation will discuss the expected configuration of the tape libraries, tape drives and the necessary quantity of tape media.
Since 2015, a so-called Small File Service has been deployed at DESY to pack small files into containers before writing to tape. As existing detectors have been upgraded to run at higher trigger rates and new beamlines have become operational, the number of arriving files has increased drastically, bringing the packing service to its limits. To cope with the increased file arrival rate, the Small File...
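Conceptually, the packing step amounts to bundling many small files into one larger container file before tape migration, as in this simplified sketch (not DESY's actual implementation; the size cap is an invented parameter):

    # Conceptual sketch: pack small files into one container file for tape.
    import tarfile
    from pathlib import Path

    def pack_small_files(directory: str, container: str, max_size: int = 10**9) -> list:
        """Bundle files from `directory` into `container` up to ~max_size bytes."""
        packed, total = [], 0
        with tarfile.open(container, "w") as tar:
            for path in sorted(Path(directory).iterdir()):
                if not path.is_file():
                    continue
                size = path.stat().st_size
                if total + size > max_size:
                    break  # a real service would start the next container here
                tar.add(path, arcname=path.name)
                packed.append(path.name)
                total += size
        return packed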
With the latest major release (5.0.0), the XRootD framework introduced not only a multitude of architectural improvements and functional enhancements, but also a TLS-based, secure version of the xroot/root data access protocol (a prerequisite for supporting access tokens). In this contribution we discuss all the ins and outs of the xroots/roots protocol, including the importance of...
Storage technology has changed over the decade, as has the role of storage in experimental research. Traditionally, magnetic tape has been the technology of choice for archival and narrowly targeted nearline storage. In recent years there has been a push to have tape play a larger role in nearline storage. In this presentation, the economics of tape are examined in light of...
One of the recommendations to come out of the HSF/WLCG Workshop in November 2020 was to create an Erasure Coding Working Group. Its purpose is to help solve some of the data challenges that will be encountered during the HL-LHC by enabling sites to store data more efficiently and robustly using erasure coding techniques (a toy illustration follows the list below). The working group aims to:
- provide a forum to allow sites to...
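The toy sketch below illustrates the principle behind erasure coding: with a single XOR parity block, any one lost data block can be rebuilt; production erasure codes such as Reed-Solomon generalize this to tolerate several simultaneous losses:

    # Toy erasure-coding illustration: 3 data blocks + 1 XOR parity block
    # survive the loss of any single block.
    def xor_blocks(blocks: list) -> bytes:
        out = bytearray(len(blocks[0]))
        for block in blocks:
            for i, b in enumerate(block):
                out[i] ^= b
        return bytes(out)

    data = [b"AAAA", b"BBBB", b"CCCC"]                  # equal-sized data blocks
    parity = xor_blocks(data)                           # one parity block
    recovered = xor_blocks([data[0], data[2], parity])  # rebuild lost block 1
    assert recovered == data[1]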
The BNL Computing Facility Revitalization (CFR) project is aimed at repurposing the former National Synchrotron Light Source (NSLS-I) building (B725), located on the BNL site, as a new data center for the Scientific Data and Computing Center (SDCC). The CFR project finished the design phase in the first half of 2019 and entered the construction phase in the second half of 2019, which is...
In this presentation we give an overview of SDCC's new support for the National Synchrotron Light Source 2 (NSLS-II) at Brookhaven National Lab. This includes the operational changes needed in order to adapt to the needs of BNL's photon science community.
A site report on computing platform updates and support systems development at IHEP during the past half year.
The SARS-CoV-2 virus, the cause of the better-known COVID-19 disease, has greatly altered our personal and professional lives. Many people are now expected to work from home, but this is not always possible and, in such cases, it is the responsibility of the employer to implement protective measures. One simple such measure is to require that people maintain a distance of 2 metres, but this...
The Linux Foundation's FOSS project EVE (Edge Virtualization Engine, www.lfedge.org/projects/eve/) provides a flexible foundation for IoT edge deployments with a choice of any hardware, application and cloud. The project's mission is to develop an open-source, lightweight virtualization engine for IoT edge gateways and edge servers with built-in security. EVE acts as...
CERN's private OpenStack cloud offers more than 300,000 cores to over 3,400 users, who can programmatically access resources like compute, multiple storage types, bare metal, container clusters, and more.
The CERN Cloud Team constantly works on improving these services while maintaining the stability and availability that are critical for many services in IT and the experiment workflows.
This talk...
Databases have to fulfil a variety of requirements in an operational system. They should be highly available, redundant, suffer minimal downtime during maintenance/upgrade work, and be easily recoverable in case of critical system failure.
All of these requirements can be realized with a Pgpool-II cluster that uses PostgreSQL instances as backends. The high availability of the backends is provided...
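From the client's point of view, applications talk only to the Pgpool-II endpoint, which hides backend failover behind a single address; a minimal sketch of that pattern (hostname, database and credentials invented; 9999 is Pgpool-II's default listening port):

    # Sketch of the client-side view of a Pgpool-II fronted cluster: connect
    # to the pool endpoint and retry while a standby is being promoted.
    import time
    import psycopg2  # pip install psycopg2-binary

    def connect_with_retry(attempts: int = 5):
        for i in range(attempts):
            try:
                return psycopg2.connect(
                    host="pgpool.example.org", port=9999,  # Pgpool-II endpoint
                    dbname="appdb", user="app", password="secret")
            except psycopg2.OperationalError:
                time.sleep(2 ** i)  # back off during failover/promotion
        raise RuntimeError("database unavailable")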
Since the last HEPiX, CERNphone has evolved from an internal pilot to a widely growing service with hundreds of users across the Organization. In this presentation, we will cover the current deployment of the mobile clients and the status of the upcoming desktop application. We will also describe advanced use cases such as team calls for handling piquet services and replacing shared office...
In this talk we shall introduce the Solid project, launched by Sir Tim Berners-Lee in 2016 as a set of open standards aiming to re-decentralize the Web and empower users' control over their own data. Solid includes standards, missing from the original Web specifications, that give users back ownership of their data (private, shared, and public) and choice of the storage where these data...
Anomaly Detection in the CERN Openstack Cloud is a challenging task due to the large scale of the computing infrastructure and the large volume of data to monitor.
The current solution to spot anomalous server machines in the cloud infrastructure relies on a threshold-based alarming system carefully set by the system managers on the performance metrics of each infrastructure component. The...
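A minimal sketch of such a threshold-based check (the metric names and limits are illustrative; in production they are tuned per component by the system managers):

    # Illustrative threshold-based alarm: flag any metric above its limit.
    THRESHOLDS = {"cpu_load": 0.95, "mem_used": 0.90, "io_wait": 0.30}

    def anomalous(metrics: dict) -> list:
        """Return the names of the metrics that exceed their thresholds."""
        return [name for name, limit in THRESHOLDS.items()
                if metrics.get(name, 0.0) > limit]

    print(anomalous({"cpu_load": 0.97, "mem_used": 0.42, "io_wait": 0.10}))
    # -> ['cpu_load']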
The File Transfer Service (FTS3) is a data movement service developed at CERN that is used to distribute the majority of the Large Hadron Collider's data across the Worldwide LHC Computing Grid (WLCG) infrastructure. At Fermilab, we have deployed a couple of FTS3 instances for Intensity Frontier experiments (e.g. DUNE) to transfer data in America and Europe, using a container-based strategy...
Shoal is a squid cache publishing and advertising tool designed to work in fast-changing environments. It consists of three components: the shoal-server, the shoal-agent, and the shoal-client. The purpose of shoal is to maintain a continually updated list of squid caches. Each squid runs a shoal-agent, which uses AMQP messages to publish its existence and the load of the squid to the...
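For flavour, here is a rough sketch of what a shoal-agent-style heartbeat could look like using the pika AMQP library; the exchange name, routing key and message format are our assumptions, not shoal's actual wire format:

    # Rough sketch of an AMQP status heartbeat in the spirit of shoal-agent.
    import json
    import socket
    import pika  # pip install pika

    def publish_squid_status(amqp_host: str, load: float) -> None:
        connection = pika.BlockingConnection(pika.ConnectionParameters(host=amqp_host))
        channel = connection.channel()
        channel.exchange_declare(exchange="shoal", exchange_type="topic")
        body = json.dumps({"hostname": socket.getfqdn(), "load": load})
        channel.basic_publish(exchange="shoal", routing_key="squid.status", body=body)
        connection.close()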
CERN is redesigning its authentication and authorization infrastructure around open source software, such as Keycloak for the Single Sign-On service and FreeIPA for the LDAP backend.
The project, which is part of the larger CERN MALT initiative, was first introduced at the HEPiX Autumn/Fall 2018 Workshop.
This talk will provide an overview of the new services, which are now in a production...
With more applications and services deployed at the BNL SDCC that rely on authentication services, the adoption of Multi-Factor Authentication (MFA) became inevitable. While web applications can be protected by Keycloak (an open-source single sign-on solution directed by Red Hat) with its MFA feature, other service components within the facility rely on FreeIPA (an open-source identity management...