HTCondor Workshop Autumn 2023

Europe/Paris
Auditorium Joliot Curie (IJCLab)

Building 100 15 rue Georges Clémenceau 91400 Orsay France
Helge Meinhard (CERN), Todd Tannenbaum (University of Wisconsin Madison (US)), Chris Brew (Science and Technology Facilities Council STFC (GB)), Christoph Beyer, Michel Jouvin (Université Paris-Saclay (FR))
Description

Dear HTCondor Community,

See the latest update on accommodation and local transport here:

https://indico.cern.ch/event/1274213/page/29408-accommodation-local-transport

We are very pleased to announce that the 2023 European HTCondor Workshop will be held from Tuesday 19th September to Friday 22nd September, at IJCLab in Orsay, France.

To allow ease of travel to and from the venue, the meeting will start on Tuesday morning and run until lunchtime on the Friday.

Registration is now open.

There will be a small workshop fee of €160 to cover lunches, teas and coffees, and a social dinner on the Thursday evening.

The workshop will be an excellent occasion for learning about HTCondor from the experts (the developers!), sharing experiences and plans with your colleagues, and giving your feedback to the experts.

The HTCondor Compute Entrypoint (CE) will be covered, as will token authentication (currently a hot topic), along with general use and administration of HTCondor.

Participation is open to all organisations (including companies) and persons interested in HTCondor (and by no means restricted to particle physics and/or academia!). If you know potentially interested persons, don't hesitate to make them aware of this opportunity.

The workshop will cover both using and administering HTCondor; topics will be chosen to best match participants' interests.

We would very much like to know about your use of HTCondor in your project, your experience and your plans. Hence you are warmly encouraged to propose a short presentation.

If you have any questions, please contact hepix-2023condorworkshop-support@hepix.org.

We are looking forward to a rich, productive workshop.

Chris Brew (STFC - RAL) and Christoph Beyer (DESY), Co-Chairs of organising committee.

Michel Jouvin (IJCLab), Chair of the Local Organising Committee 

Todd Tannenbaum, HTCondor Technical Lead, U Wisconsin, Madison, USA

 

Participants
  • Andres Jorge Tanasijczuk
  • Brian Bockelman
  • Chris Brew
  • Christoph Beyer
  • David Rebatto
  • Dennis van Dok
  • Emmanouil Vamvakopoulos
  • Francesco Murdaca
  • Francesco Prelz
  • Gregory Thain
  • Helge Meinhard
  • James Frey
  • Marco Mascheroni
  • Marco Sadocco
  • Matthew West
  • Michel Jouvin
  • MIRON LIVNY
  • Mischa Sallé
  • R. Florian von Cube
  • Stefano Dal Pra
  • Thomas Hartmann
  • Todd Tannenbaum
  • and 18 more
    • board meeting 100-A015

      final board meeting for workshop preparation

      Convener: Christoph Beyer
    • 1
      Welcome Auditorium Joliot Curie
      Speaker: Christoph Beyer
    • 2
      Welcome and housekeeping Auditorium Joliot Curie
      Speaker: Michel Jouvin (Université Paris-Saclay (FR))
    • 3
      IJCLab in a nutshell Auditorium Joliot Curie
      Speaker: Valerie Chambert (Universite de Paris-Sud 11 (FR))
    • Workshop Session Auditorium Joliot Curie
    • 10:30
      Coffee Auditorium Joliot Curie
    • Data and Security Auditorium Joliot Curie
    • 12:30
      Lunch Auditorium Joliot Curie
    • Data and Security Auditorium Joliot Curie
    • 15:30
      Coffee Auditorium Joliot Curie
    • Lightning Talks Auditorium Joliot Curie
    • 12
      Welcome drink La Part des Anges

      57 rue Charles de Gaulle 91440 Bures sur Yvette

      La Part des Anges is an excellent and friendly Wine Bar.

      Tapas/appetizers offered; drinks at your own expense.
      Dinner afterwards is possible on an individual basis.

      https://osm.org/go/0BOQg93_M?node=6574870872

    • Jobs and Access Points Auditorium Joliot Curie
    • 15
      Visit of ThomX and SUPRATECH facilities Auditorium Joliot Curie

      2 groups of 15

    • 10:30
      Coffee Auditorium Joliot Curie
    • Jobs and Access Points Auditorium Joliot Curie
    • 12:30
      Lunch Auditorium Joliot Curie
    • Jobs and Access Points Auditorium Joliot Curie
    • HTCondor User Presentations Auditorium Joliot Curie
      • 21
        News from the DESY Clusters

        With the ever-changing IT and political landscapes, the HTC clusters at DESY are evolving as well. We will give an overview and status report covering what is new over the past year and where we plan to go with our compute clusters. Efficient utilization of resources has become even more pressing following recent geopolitical turmoil and with the growing urgency of climate change, so we are optimizing the energy profile of our user Condor cluster. A Cobald/Tardis Condor meta cluster is also under development, with the idea of using it to backfill otherwise idle resources.

        Speaker: Thomas Hartmann (Deutsches Elektronen-Synchrotron (DE))
      • 22
        Deployment of HTCondor at GRIF

        In this contribution, we present the multiple HTCondor instances that have been deployed at GRIF for several years. GRIF is a distributed Tier-2 WLCG/EGI site made up of four subsites (IJCLab, IRFU, LLR, LPNHE) at different locations in the Paris region. The worst network latency between the subsites is 2-4 ms, and three of them are connected by a 100 Gbit/s link. In particular, a distributed HTCondor pool, with HTCondor-CE gateways, gives unified access to the IJCLab and LLR resources. IRFU and LPNHE run independent HTCondor pools based on ARC-CE and HTCondor-CE gateways, which provide access to the computing resources of those sites. Future plans for incorporating and using cloud computing resources, with or without Kubernetes infrastructure, will be discussed.

        Speaker: Dr Emmanouil Vamvakopoulos (Université Paris-Saclay (FR))
    • 15:40
      Coffee Auditorium Joliot Curie
    • HTCondor User Presentations Auditorium Joliot Curie
    • 24
      Office Hours Auditorium Joliot Curie
    • Workshop Session Auditorium Joliot Curie
    • 27
      Visit of ThomX and SUPRATECH facilities Auditorium Joliot Curie

      By groups of 15

    • 10:35
      Coffee Auditorium Joliot Curie
    • HTCondor User Presentations Auditorium Joliot Curie
      • 28
        Reaching new scales in the CMS Global pool

        The computing resource needs of LHC experiments, such as CMS, are expected to continue growing significantly over the next decade, during the Run 3 and especially the HL-LHC era. The SI team manages a set of federated HTCondor pools, currently aggregating around 400k CPU cores distributed worldwide, supporting the simultaneous execution of over 200k CMS computing tasks. In order to detect and overcome performance degradation driven by scalability barriers, the SI team regularly runs tests to explore the scalability reach of our infrastructure. In this contribution, we will report on the test results for potential scalability limitations of our infrastructure.

        Speaker: Marco Mascheroni (Univ. of California San Diego (US))
      • 29
        The Virgo Data Quality Reports: an HTCondor automated framework to vet gravitational-wave candidates

        Transient gravitational-wave (GW) signals have been detected since 2015 by the LVK global network of giant, ground-based, interferometric detectors. It currently includes four instruments: the LIGO Hanford and LIGO Livingston detectors located in the USA, the Virgo detector in Italy, hosted by the European Gravitational Observatory (EGO), and the KAGRA detector in Japan. A key component of the search for GWs is the broadcast of a low-latency public alert to the astronomical community each time a significant GW candidate is identified by the pipelines that jointly analyze, in real time, the data from all the running detectors.

        These alerts require a quick, yet accurate, vetting of the quality of the corresponding data. The main input for such a decision – to confirm the alert or to retract it if the candidate is found not to be of astronomical origin – comes from the Data Quality Report (DQR), a set of predefined checks which is triggered automatically when a new candidate has been identified, and which runs on an HTCondor farm. In this contribution, we focus on the Virgo DQR framework which has been developed jointly by the Virgo Collaboration and the EGO IT department, following standards defined at the LIGO-Virgo Collaborations level in 2018-2019. It allows vetting the data acquired by the Virgo detector. After a short description of the EGO HTCondor farm and of the various software which run on it during a joint data-taking period of the LVK network, we will describe the Virgo DQR; the way a significant GW candidate leads to the generation of a global HTCondor DAG (running about 40 different checks in parallel, for a total of roughly 120 jobs); its main inputs and outputs; its performance; and finally the live monitoring system which has been developed to parse every minute the dag.dagman.out DAG log file. Similar but independent DQR frameworks are running at various LIGO computing centers to vet the LIGO Hanford and LIGO Livingston data.

        Speaker: Dr Nicolas Arnaud (IJCLab (Université Paris-Saclay and CNRS/IN2P3))
      • 30
        Energy savings by power modulation of a HTC pool

        Energy prices are reaching an all-time high in Europe, and at the same time we are anticipating a more elaborate billing model that calculates a real-time electricity price depending on the availability of electricity on the market. As the price is predictable roughly 30 hours in advance, it seems desirable not only to know more about the energy consumption of a Condor pool but also to steer the energy consumption of the pool along a timeline given by those prices ...

        Speaker: Christoph Beyer
    • 12:30
      Lunch Auditorium Joliot Curie
    • HTCondor User Presentations Auditorium Joliot Curie
      • 31
        Multi-tenancy HTCondor for the European Weather Cloud

        The European Weather Cloud (EWC) is the cloud-based collaboration platform for meteorological application development and operations in Europe and enables the digital transformation of the European Meteorological Infrastructure. Among the services to be provided to the meteorological community, batch processing has been requested for several types of analysis using satellite data. HTCondor has been selected as the batch system for data processing and it will be provided to the users through the EWC. In this talk you will discover what the EWC is and how HTCondor was customized to work in the EWC in order to facilitate the use and sharing of resources across multiple tenancies and to run jobs through HTCondor. In particular, you will see how we allow users to easily join the common pool of resources that can be used by the community, how nodes are automatically deployed, how we preserve the ability to access internal networks of different tenancies via VPN, and how users can rely on multi-tenancy resources, running jobs securely and close to the data.

        Speaker: Francesco Murdaca (EUMETSAT)
      • 32
        Just-in-time matching of workflows for the DUNE experiment

        The DUNE experiment is a large international particle physics
        project which is currently under construction at Fermilab in
        Illinois and SURF in South Dakota, with prototypes at CERN. The
        experiment relies on Fermilab’s investment in HTCondor and
        GlideInWMS, and on the LArSoft ecosystem of applications
        software. Initially data management was done with Fermilab’s SAM
        system but this is gradually being replaced by other components.
        MetaCat and Rucio are now in use as DUNE’s file metadata and
        replica catalogues, and DUNE has developed a just-in-time
        workflow management system, justIN, to replace the SAM workflow
        functionality and provide higher level management of processing
        requests which are carried out in GlideInWMS/HTCondor jobs. The
        new system’s philosophy of matching tasks to resources as they
        become available will be described. justIN provides a workflow
        submission interface and then submits suitable jobs to the DUNE
        HTCondor pool. Jobs call back to justIN when they eventually
        start at sites, and a decision is made at that point about what
        workflows to carry out on that machine and which files to
        process. These decisions are based on the available memory,
        processors, maximum local job duration, and the availability of
        nearby files which are still to be processed as part of the
        current workflows. This just-in-time approach is able to take
        unplanned downtimes at sites and storages into account
        immediately, as well as higher level changes such as
        fluctuations in the demand from other user communities. This
        system was validated during the DUNE Data Challenge 4 in late
        2022 and has been used in the simulation campaigns of 2023.
        justIN uses token information obtained from CILogon with users
        authenticating with the Fermilab Identity Provider service. This
        in turn allows users to authenticate to the justIN web dashboard
        or to use the justIN command line tool to launch and manage
        workflows. To enforce DUNE policies on the use of Rucio-managed
        storage, justIN jobs carry out data write operations on behalf
        of user supplied scripts and code, which are isolated from
        higher level credentials by justIN’s use of
        Singularity/Apptainer containers. Further work to increase the
        integration of justIN and the new dedicated DUNE HTCondor pool
        will be described.

        Speaker: Andrew McNab (University of Manchester)
      • 33
        The journey to modern storage.

        About a year ago we had a project that needed 20 PB of low-performance storage with 20 GB/s of throughput. We ended up buying 40 PB with 2.7 TB/s of throughput.
        I would like to share with you the journey from open-source storage to commercial storage.

        Choosing the technology, building a network to support this throughput, benchmarking, understanding the needs, and finally the real world versus the benchmark.

        I hope 20 minutes will be enough

        Speaker: David Handelman
    • Equality, Diversity, Inclusion & Accessibility Auditorium Joliot Curie
      • 34
        Thoughts on EDIA in dHTC Cyber-Infrastructure

        Making space for new groups within our existing dHTC infrastructure community is essential for "democratizing computation." This talk calls on participants to reflect on why so many research computing conferences have such homogeneous attendees. Some are long-term structural issues beyond our scope, but this should not absolve us from taking proactive measures to rectify this problem where possible.

        Speaker: Matthew West
      • 35
        Discussion on EDIA
    • 15:30
      Coffee Auditorium Joliot Curie
    • Show your Toolbox Auditorium Joliot Curie
    • 36
      Social Dinner Bouillon Racine

      3 rue Racine 75006 Paris
    • Pools and Execution Points Auditorium Joliot Curie
    • 10:30
      Coffee Auditorium Joliot Curie
    • Pools and Execution Points Auditorium Joliot Curie
      • 41
        Advanced debugging with eBPF and Linux perf tools
        Speaker: Gregory Thain
      • 42
        Pools and Execution Points Wrap Up
        Speaker: MIRON LIVNY
    • 43
      Wrap Up and Goodbye Auditorium Joliot Curie
      Speaker: Chris Brew (Science and Technology Facilities Council STFC (GB))
    • 12:30
      Lunch Auditorium Joliot Curie