HTCondor Workshop Autumn 2024 in Amsterdam

Name: HTCondor Workshop Autumn 2024 in Amsterdam
Start: 2024-09-23T09:00:00+02:00
End: 2024-09-27T14:00:00+02:00
Location: Nikhef

23 Sept 2024, 09:00 → 27 Sept 2024, 14:00 Europe/Amsterdam

Colloquium room (Nikhef)

Colloquium room

Nikhef

Nikhef Science Park 105 1098 XG Amsterdam

Helge Meinhard (CERN), Todd Tannenbaum (University of Wisconsin Madison (US)), Chris Brew (Science and Technology Facilities Council STFC (GB)), Christoph Beyer, Mary Hester

Description

We are very pleased to announce that the 2024 European HTCondor Workshop will be held from Tuesday 24th September to Friday 28th September, at Nikhef in Amsterdam, The Netherlands.

The meeting will start on Tuesday morning and run until lunchtime on the Friday.

The workshop will be an excellent occasion for learning from the experts (the developers!) about HTCondor, exchanging with your colleagues about experiences and plans and providing your feedback to the experts.

The HTCondor Compute Entrypoint (CE) will be covered as well as with token authentication (currently a hot topic), along with general use and administration of HTCondor.

Participation is open to all organisations (including companies) and persons interested in HTCondor (and by no means restricted to particle physics and/or academia!) If you know potentially interested persons, don't hesitate to make them aware of this opportunity.

The workshop will cover both using and administering HTCondor; topics will be chosen to best match participants' interests.

We would very much like to know about your use of HTCondor, in you project, your experience and your plans. Hence you are warmly encouraged to propose a short presentation.

In addition, we would like to thank our Platinum and Gold sponsors for their support with this event!

Platinum Sponsor

Gold Sponsors

If you have any questions, please contact hepix-2024condorworkshop-support@hepix.org.

We are looking forward to a rich, productive workshop.

Chris Brew (STFC - RAL) and Christoph Beyer (DESY), Co-Chairs of organising committee.

Mary Hester (Nikhef), Chair of the Local Organising Committee

Todd Tannenbaum, HTCondor Technical Lead, U Wisconsin, Madison, USA

Support

hepix-2024condorworkshop-support@hepix.org

Participants

68 View full list

Monday 23 September
- INTERNAL BOARD MEETING: Program comittee meeting for final preparation Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
Tuesday 24 September
- Registration Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
- Workshop Session: Introductions and Welcomes Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
  - 1
    
    Welcome, Introduction and Housekeeping
    
    Speakers: Christoph Beyer, Mary Hester
    
    welcome_slides.odp
    
    welcome_slides.pdf
  - 2
    
    Nikhef Welcome
  - 3
    
    Philosophy and Architecture: What the Manual Won't tell You
    
    Philosophy and Architecture: What the Manual Won't tell You
    
    Speaker: MIRON LIVNY
  - 4
    
    Round the room introductions
    
    Who are you, where are you from and what do you hope to get out of the workshop?
- 10:30
  
  Coffee Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
- Workshop Session Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
  - 5
    
    Troubleshooting: What to do when things go wrong
    
    Troubleshooting: What to do when things go wrong
    
    Speaker: Andrew Owen
    
    Slides
  - 6
    
    Practical considerations for GPU Jobs
    
    Practical considerations for GPU Jobs
    
    Speaker: Andrew Owen
    
    Slides
- 12:30
  
  Lunch Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
- Workshop Session Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
  - 7
    
    Abstracting Accelerators Away
    
    Currently more and more frameworks appear to perform offloaded compute to accelerators, or accelerating ML/AI workloads using CPU accelerators or GPUs. However right now the user it self still needs to figure out or decide how and what is the best execution library or acceleration system to execute there workloads.
    
    How can we model this abstraction the best for htcondor so for our users the overhead to use the acceleration?
    
    Speaker: Emily Kooistra
  - 8
    
    An ATLAS researcher's experience with HTCondor.
    
    A new users experience of switching to HTCondor
    
    Speaker: Zef Wolffs (Nikhef National institute for subatomic physics (NL))
    
    HTCondor presentation.pdf
  - 9
    
    Monte Carlo simulations of extensive air showers at NIKHEF
    
    This presentation will show how the Comic Rays group at Nikhef is using HTCondor in their analysis workflows on the local pool.
    
    Speaker: Kevin Cheminant (Radboud University / NIKHEF)
    
    Monte Carlo simulations and extensive air showers
  - 10
    
    HTCondor + Nikhef - A History of Productive Collaboration
    
    Speaker: MIRON LIVNY
- 15:30
  
  Coffee Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
- Town Hall Discussion Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
- Social Event (Reception): Welcome Reception at Poesiat & Kater
  
  Brouwerij Poesiat & Kater
  Polderweg 648
  1093 KP Amsterdam
  https://poesiatenkater.nl/
  
  https://osm.org/go/0E6VHHtKg?node=4815845616
  
  openstreetmap link to location (Brouwerij Poesiat & Kater Polderweg 648 1093 KP Amsterdam https://poesiatenkater.nl/ )
Wednesday 25 September
- Workshop Session: Your Data and Condor Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
  - 11
    
    Dealing with sources of Data: Choices and the Pros/Cons
    
    Dealing with sources of Data: Choices and the Pros/Cons
    
    Speaker: Brian Paul Bockelman (University of Wisconsin Madison (US))
    
    DealingData - HTC-EU-24.pdf
  - 12
    
    Managing Storage at the EP
    
    Managing Storage at the EP
    
    Speaker: Cole Bollig
    
    Managing-EP-disk-HTC-EU-24.pdf
    
    Managing-EP-disk-HTC-EU-24.pptx
  - 13
    
    NetApp DataOps Toolkit for data management
    
    The NetApp DataOps Toolkit is a python library that makes it easy for developers, data scientists and data engineers to perform various data management tasks. These tasks include provisioning new data volumes or developing workspace almost instantaneously. It improves flexibility in development’s environment management. In this presentation, we will go over some examples and showcase how these libraries can be leveraged for different data management use cases.
    
    Speaker: Didier Gava (NetApp)
    
    DataOps-HTCondor.pdf
- 10:30
  
  Coffee Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
- Workshop Session: Your Data and Condor cont. Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
  - 14
    
    Storage Solutions with AI workloads
    
    Various AI workloads, such as Deep Learning, Machine Learning, Generative AI or Retrieval Augmented Generation, require capacity, compute power or data transfer performance. This presentation will show how simple a hardware / Software stack solution deployment, can leverage and/or become part of an AI infrastructure based on Ansible scripts. In addition, I will discuss two use cases, one on video surveillance and the second on real-time language processing, powered by an AI infrastructure setup.
    
    Speaker: Didier Gava
    
    SolutionAI-HTCondor.pdf
  - 15
    
    CHTC Vision: Compute and Data Together
    
    CHTC Vision: Compute and Data Together
    
    Speaker: MIRON LIVNY
  - 16
    
    Pelican Intro
    
    Pelican Intro
    
    Speaker: Brian Paul Bockelman (University of Wisconsin Madison (US))
    
    PelicanIntro - HTC-EU-24.pdf
  - 17
    
    PANEL and Discussion - Pelican and Condor: Flying Together, Birds of a Feather, Don't drop your data!
    
    PANEL and Discussion - Pelican and Condor: Flying Together, Birds of a Feather, Don't drop your data!
    
    Speakers: Brian Paul Bockelman (University of Wisconsin Madison (US)), MIRON LIVNY, Todd Tannenbaum
- 12:30
  
  Lunch Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
- Workshop Session Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
  - 18
    
    Dynamic resource integration with COBalD/TARDIS
    
    With the continuing growth of data volumes and computational demands, compute-intensive sciences rely on large-scale, diverse computing resources for running data processing, analysis tasks, and simulation workflows.
    These computing resources are often made available to research groups by different resource providers resulting in a heterogeneous infrastructure.
    To make efficient use of those resources, we are developing COBald/TARDIS, a resource management system for dynamic and transparent integration.
    
    COBalD/TARDIS provides an abstraction layer of resource pools and sites and takes care of scheduling and requesting those resources, independent of their sites local resource management systems.
    Through the use of adapters, COBalD/TARDIS is able to interface with a range of resource providers, including OpenStack, Kubernetes, and others, as well as support different overlay batch systems, with current implementations for HTCondor and SLURM.
    In this contribution we present the general concepts of COBalD/TARDIS, several setups, with a focus on those using HTCondor, in different university groups, as well as WLCG sites.
    
    Speaker: Florian Von Cube (KIT - Karlsruhe Institute of Technology (DE))
    
    COBalDTARDIS_HTCondor.pdf
  - 19
    
    Adapting Hough Analisys workflow to run on IGWN resources
    
    The computing workflow of the Virgo Rome Group for the CW search based on Hough Analisys has been performed for several years using storage and computing resources mainly provisioned by INFN-CNAF and strictly tied with its specific infrastructure. Starting with O4a, the workflow has been adapted to be more general and to integrate with computing centers in the IGWN community. We discuss our work toward this integration, the encountered problems, our solutions and the further steps ahead.
    
    Speaker: Stefano Dal Pra (Universita e INFN, Bologna (IT))
    
    HTC_ws_2024.pdf
  - 20
    
    Kubenettes ↔ HTC
    
    Operating HTCondor with kubenettes
    
    Speaker: Brian Paul Bockelman (University of Wisconsin Madison (US))
    
    Kubernetes and HTCSS - HTC-EU-24.pdf
  - 21
    Fun with Condor Print Formats
    
    During the 20 years history of the Torque batch system at Nikhef, we constructed several command line tools providing various overviews of what was going on in the system. An example: a tool that could tell us "what are the 20 most recently started jobs?"
    
    mrstarts | tail -20
    
    With HTCondor we wanted the same kind of overviews. Much of this can be accomplished using the HTCondor "print formats" associated with the condor_q, condor_history, and condor_status commands. In this talk I'll present and discuss some examples, advantages and disadvantages of the approach, and along the way present some HTCondor mysteries we haven't solved.
    
    Speaker: Jeff Templon (Nikhef National institute for subatomic physics (NL))
    
    HTCondor-printformats-2024.pdf
- 15:30
  
  Coffee Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
- Workshop Session Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
  - 22
    
    HTC from the user perspective
    
    HTC from the user perspective - to be chosen from former material
    
    Speaker: Cole Bollig
    
    HTC-User-Perspective-HTC-EU-24.pdf
    
    HTC-User-Perspective-HTC-EU-24.pptx
  - 23
    
    Exploring Job Histories with ElasticSearch and HTCondor AdStash
    
    Exploring Job Histories with ElasticSearch and HTCondor AdStash
    
    Speaker: Todd Tannenbaum
    
    Exploring Job Histories with ElasticSearch and HTCondor AdStash.pdf
    
    Exploring Job Histories with ElasticSearch and HTCondor AdStash.pptx
  - 24
    
    HTCondor System Administration Introduction
    
    Quick overview of HTCondor for system administrators
    
    Speaker: Todd Tannenbaum
    
    HTCondor System Administration Introduction.pdf
    
    HTCondor System Administration Introduction.pptx
- Office Hours Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
  
  Arrange to discuss your questions with members of the Condor Team
Thursday 26 September
- Workshop Session Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
  - 25
    
    DAGman: I didn't know it could do that!
    
    DAGman: I didn't know it could do that!
    
    Speaker: Cole Bollig
    
    I-didnt-know-dagman-could-do-that-HTC-EU-24.pdf
    
    I-didnt-know-dagman-could-do-that-HTC-EU-24.pptx
  - 26
    
    Final project update
    
    This year has been eventful for our research lab, New hardware that brought along a host of challenges, we will share network, architecture and recent challenges that we are facing.
    It's all about scale.
    
    Speaker: David Handelman
  - 27
    
    Integrating an IDE with HTCondor
    
    Graphical code editors such as Visual Studio Code (VS Code) have gained a lot of momentum in the last years among young researchers. To ease their workflows, we have developed a VS Code entry point to harness the resources of an HTC cluster within their IDE.
    
    This entry point allows users to have a "desktop-like" experience within VS Code when editing and testing their code while working in batch job environments. Furthermore, VS Code extensions such as Jupyter notebooks and Julia packages can directly leverage cluster resources.
    
    In this talk we will explain the use case of this entry point, how we implemented it and show some of the struggles we encountered along the way. The developed solution can also scale out to federated HTCondor pools.
    
    Speaker: Michael Hubner (University of Bonn (DE))
    
    HTCondor_Bonn_IDE.pdf
- 10:30
  
  Coffee Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
- Workshop Session Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
  - 28
    
    The new HTCSS Python API: Python Bindings Version 2
    
    The new HTCSS Python API: Python Bindings Version 2
    
    Speaker: Cole Bollig
    
    PythonBindingsVersion2.pdf
    
    PythonBindingsVersion2.pptx
  - 29
    
    HTCondor: Whats New / Whats coming up
    
    HTCondor: Whats New / Whats coming up
    
    Speaker: Todd Tannenbaum
    
    WhatsNewAmsterdam2024.pdf
    
    WhatsNewAmsterdam2024.pptx
  - 30
    
    Moving from Torque to HTCondor on the local cluster
    
    Nikhef operates a local compute facility of around 6k cores. For the last two decades, Torque has been the batch system of choice on this cluster.
    This year the system has been replaced with HTCondor; in this talk we share some of the concerns, design choices and experiences of the transition from the operator's perspective.
    
    Speaker: Mr Dennis van Dok (Nikhef)
    
    New Horizons for Stoomboot: from Torque to HTCondor
    
    New horizons for Stoomboot_ from Torque to HTCondor.pdf
- 12:30
  
  Lunch Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
- Workshop Session Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
  - 31
    
    Opportunities and Challenges Courtesy Linux Cgroups Version 2
    
    Opportunities and Challenges Courtesy Linux Cgroups Version 2
    
    Speaker: Brian Paul Bockelman (University of Wisconsin Madison (US))
    
    CgroupsV2 - HTC-EU-24.pdf
  - 32
    
    AMD INSTINCT GPU CAPABILITY AND CAPACITY AT SCALE
    
    The adoption of AMD Instinct™ GPU accelerators in several of the major high-performance computing sites is a reality today and we’d like to share the pathway that lead us here. We’ll focus on characteristics of the hardware and ROCm software ecosystem, and how they were tuned to match the required compute density and programmability to make this adoption successful, from the discrete GPU to the supercomputer that tightly integrate massive amounts of these devices.
    
    Speaker: Samuel Antao (AMD)
    
    HTCondor-AMD-Talk.pdf
  - 33
    
    GPUs in the Grid
    
    In this presentation we will go over GPU deployment at the NL SARA-MATRIX Grid site. An overview of the setup is shown, followed by some rudimentary performance numbers. Finally, the user adoption and how the GPU is used is discussed.
    
    Speaker: Dr Lodewijk Nauta (SURF)
    
    HTCondor-workshop-GPUs_20240926.pdf
  - 34
    
    Lenovo’s Cooler approach to HTC Computing
    
    Breakthroughs in computing systems have made it possible to tackle immense obstacles in simulation environments. As a result, our understanding of the world and universe is advancing at an exponential rate. Supercomputers are now used everywhere—from car and airplane design, oil field exploration, and financial risk assessment, to genome mapping and weather forecasting.
    
    Lenovo’s High-Performance Computing (HPC) technology offers substantial benefits for High Transaction Computing (HTC) by providing the necessary computational power and efficiency to handle large volumes of transactions. Lenovo’s HPC solutions, built on advanced hardware such as the ThinkSystem and ThinkAgile series, deliver exceptional processing speeds and reliability. These systems are designed to optimize data throughput and minimize latency, which are critical factors in transaction-heavy environments like financial services, e-commerce, and telecommunications. The integration of Lenovo’s HPC technology into HTC environments enhances the ability to process transactions in real-time, ensuring rapid and accurate data handling. This capability is crucial for maintaining competitive advantage and operational efficiency in industries where transaction speed and accuracy are paramount. Additionally, Lenovo’s focus on energy-efficient computing ensures that these high-performance systems are also sustainable, aligning with broader environmental goals.
    
    By leveraging Lenovo’s HPC technology, organizations can achieve significant improvements in transaction processing capabilities, leading to better performance, scalability, and overall system resilience. According to TOP500.org, Lenovo is the world's #1 supercomputer provider, including some of the most sophisticated supercomputers ever built. With over a decade of liquid-cooling expertise and more than 40 patents, Lenovo leverages experience in large-scale supercomputing and AI to help organizations deploy high-performance AI at any scale.
    
    Speaker: Mr Rick Koopman
    
    Rick Koopman.jpeg
- 15:30
  
  Coffee Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
- Lightning Talks/Show your toolbox Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
- Social Event (Dinner): House of Bird Diemerbos
  
  https://osm.org/go/0E6U8W6og
  
  openstreetmap link (to House of Bird, Diemerbospad 1a, 1111 PZ Diemen, Niederlande)
Friday 27 September
- Workshop Session Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
  - 35
    
    WLCG Token Transition Update (incl the illustrious return of x509)
    
    WLCG Token Transition Update (incl the illustrious return of x509)
    
    Speaker: Brian Paul Bockelman (University of Wisconsin Madison (US))
    
    TokenTransition - HTC-EU-24.pdf
  - 36
    
    Practical experience with an interactive-first approach to leverage HTC resources
    
    Development and execution of scientific code requires increasingly complex software stacks and specialized resources such as machines with huge system memory or GPUs. Such resources are present in HTC/HPC clusters and used for batch processing since decades,but users struggle with adapting their software stacks and their development workflows to those dedicated resources. Hence, it is crucial to enable interactive use with a low-threshold user experience, i.e. offering an SSH-like experience to enter development environments or start JupyterLab sessions from a web browser.
    
    Turning some knobs, HTCondor unlocks these interactive use cases of HTC and HPC resources, leveraging the resource control functionality of a workload manager, wrapping execution within unprivileged containers and even enabling the use of federated resources crossing network boundaries without loss of security.
    
    This talk presents the positive experience with an interactive-first approach, hiding the complexities of containers and different operating systems from the users, enabling them to use HTC resources in an SSH-like fashion and with their JupyterLab environments. It also provides a short outlook on scaling this approach to a federated infrastructure.
    
    Speaker: Oliver Freyermuth (University of Bonn (DE))
    
    HTCondor_Bonn_Interactive.pdf
  - 37
    
    HTCondor setup @ ORNL, an ALICE T2 site
    
    ALICE experiment at CERN runs a distributed computing model and it is part of the Worldwide LHC Computing Grid (WLCG). WLCG uses a tiered distributed grid model. As part of the ALICE experiment’s computing grid we run two Tier2 (T2) sites in the US, at Oak Ridge National Laboratory and Lawrence Berkeley National Laboratory. Computing resource usage and delivery are being accounted through OSG via GRATIA probes. This information is then forwarded to the WLCG. With the OSG software update and deprecation of some GRATIA probes we had to update the setup for the OSG accounting. To do so we have recently started to move our existing setup to HTCondor based workflow and new GRATIA accounting. I will present the setup for our T2 sites and HTCondor configuration escapade.
    
    Speaker: Irakli Chakaberia (Lawrence Berkeley National Lab. (US))
    
    20240927 - HTC@ORNL.pdf
    
    20240927 - HTC@ORNL.pdf
    
    20240927 - HTC@ORNL.pptx
  - 38
    
    Implementing OSDF Cache in SURF - MS4 Service
    
    In this presentation there will be a brief mention of the environment that hosts the OSDF Cache, the setup and suitable software for MS4 service. The presentation will lay out in a bit more depth the process of installing the OSDF cache and the challenges that arose during the installation.
    
    Speaker: Jasmin Colo
    
    OSG-CACHE implementation at SURF.pdf
    
    OSG-CACHE implementation at SURF.pptx
- 10:30
  
  Coffee Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
- Workshop Session Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam
  - 39
    
    HPC use case through PIC
    
    In this contribution, I will present an HPC use case facilitated through gateways deployed at PIC. The selected HPC resource is the Barcelona Supercomputing Center, where we encountered some challenges, particularly in the CMS case, which required meticulous and complex work. We had to implement new developments in HTCondor, specifically enabling communication through a shared file system. This contribution will detail the setup process and the scale we were able to achieve so far.
    
    Speaker: Jose Flix Molina (CIEMAT - Centro de Investigaciones Energéticas Medioambientales y Tec. (ES))
    
    20240927_HTCondorWS_BSC_PIC_JFlix.pdf
  - 40
    
    HTCondor in Einstein Telescope
    
    The Einstein Telescope (ET) is currently in the early development phase
    for its computing infrastructure. At present, the only officially
    provided service is the distribution of data for Mock Data Challenges
    (using the Open Science Data Federation + CVMFS-for-data), with GitLab
    used for code management. While the data distribution infrastructure is
    expected to be managed by a Data Lake using Rucio, the specifics of the
    data processing infrastructure and tools remain undefined. This
    exploratory phase allows for a detailed evaluation of different solutions.
    Drawing from the experiences of 2nd-generation gravitational wave
    experiments LIGO and Virgo, which began with modest computational needs
    and expanded into distributed computing models using HTCondor, ET aims
    to build upon these foundations. LIGO and Virgo adopted, for their
    offline data analyses, the LHC grid computing model through a common
    computing infrastructure called IGWN (International Gravitational-Wave
    Observatory Network), incorporating systems like glideinWMS, which works
    on top of HTCondor, to handle high-throughput computing (HTC) tasks.
    Despite this, challenges such as the reliance on shared file systems
    have limited the migration to grid-based workflows, with only 20% of
    jobs currently running on the IGWN grid.
    For ET, the plan is to adapt and evolve from the IGWN grid computing
    model, making sure workflows are grid-compatible. This includes
    exploring Snakemake, a framework for reproducible data analysis, to
    complement HTCondor. Snakemake offers the ability to run jobs on diverse
    computing resources, including grid, Slurm clusters, and cloud-based
    infrastructures. This approach aims to ensure flexibility, scalability,
    and reproducibility in ET’s data processing workflows, while overcoming
    past limitations.
    
    Speaker: Luca Tabasso
    
    HTCondor in Einstein Telescope.pdf
  - 41
    
    Transitioning the CMS pools to ALMA9
    
    The Submission Infrastructure team of the CMS experiment at the LHC operates several HTCondor pools, comprising more than 500k CPU cores on average, for the experiment's different user groups. The jobs running in those pools include crucial experiment data reconstruction, physics simulation and user analysis. The computing centres providing the resources are distributed around the world and dynamically added to the pools on demand.
    
    Uninterrupted operation of those pools is critical to avoid losing valuable physics data and ensure the completion of computing tasks for physics analyses. With the announcement of the end-of-life of CentOS 7, the CMS collaboration decided to transition their infrastructure, running essential services for the successful operation of the experiment, to ALMA 9.
    
    In this contribution, we outline CMS's federated HTCondor pools and share our experiences of transitioning the infrastructure from CentOS 7 to ALMA 9, while keeping the system operational.
    
    Speaker: Florian Von Cube (KIT - Karlsruhe Institute of Technology (DE))
    
    HTC Week 2024 – Transitioning the CMS pools to ALMA9.pdf
  - 42
    
    Heterogeneous Tier2 Cluster and Power Efficiency Studies at ScotGrid Glasgow
    
    With the latest addition of 4k ARM cores, the ScotGrid Glasgow facility is a pioneering example of a heterogeneous WLCG Tier2 site. The new hardware has enabled large-scale testing by experiments and detailed investigations into ARM performance in a production environment.
    
    I will present an overview of our computing cluster, which uses HTCondor as the batch system combined with ARC-CE as the front-end for job submission, authentication, and user mapping, with particular emphasis on the dual queue management. I will also touch on our monitoring and central logging system, built on Prometheus, Loki, and Grafana, and describe the custom scripts we use to extract job information from HTCondor and pass it to the node_exporter collector.
    
    Moreover, I will highlight our research on power efficiency in HEP computing, showing the benchmarks and tools we use to measure and analyze power data. In particular, I will present a new figure-of-merit designed to characterize power usage during the execution of the HEP-Score benchmark, along with an updated performance-per-watt comparison extended to the latest x86 and ARM CPUs (Ampere Altra Q80 and M80, NVidia Grace, and recent AMD EPYC chips). Within this context, we introduce a Frequency Scan methodology to better characterize performance/watt trade-offs.
    
    Speaker: Emanuele Simili
    
    ScotGrid_HTCws.pdf
    
    ScotGrid_HTCws.pptx
  - 43
    
    Workshop Wrap-Up and Goodbye
    
    Speaker: Chris Brew (Science and Technology Facilities Council STFC (GB))
    
    HTCondor_Workshop_WrapUp.pdf
- 12:30
  
  Lunch Colloquium room
  
  Colloquium room
  
  Nikhef
  
  Nikhef Science Park 105 1098 XG Amsterdam

Choose timezone

HTCondor Workshop Autumn 2024 in Amsterdam

Colloquium room

Nikhef

Platinum Sponsor

Gold Sponsors

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef

Colloquium room

Nikhef