HTCondor Workshop Autumn 2020

Europe/Paris
(teleconference only)

Helge Meinhard (CERN), Todd Tannenbaum (Univ of Wisconsin-Madison, Wisconsin, USA)
Description

The HTCondor Workshop Autumn 2020 was held as a virtual event via videoconferencing, due to the COVID-19 pandemic and the related travel restrictions.

The workshop was the sixth edition of the series usually hosted in Europe, following the successful events at CERN in December 2014, ALBA in February 2016, DESY in June 2017, RAL in September 2018 and JRC in September 2019.

The workshops are opportunities for novice and experienced users of HTCondor to learn, get help and exchange experiences with each other and with the HTCondor developers and experts. They are open to everyone world-wide; they consist of presentations, tutorials and "office hours" for consultancy, with prominent coverage of the HTCondor-CE (Compute Element) as well. They also feature presentations by users on their projects and experiences.

The workshops address participants from academia and research as well as from commercial entities.

 
 
Participants
  • Adam Wong
  • Adrian Coveney
  • Adrien Ramparison
  • Ajit Kumar Mohapatra
  • Al Marsella
  • Alberto Sanchez Hernandez
  • Ales Prchal
  • Alessandra Doria
  • Alessandro Italiano
  • Alex Butterworth
  • Alexandr Mikula
  • Alexandre Rademaker
  • Alexey Smirnov
  • Alon Glazer
  • Alyssa Bramley
  • Andre McNeill
  • Andrea Chierici
  • Andrea Sartirana
  • Andreas Haupt
  • Andreas Nowack
  • Andreas Petzold
  • Andres Tanasijczuk
  • Andrew Lahiff
  • Andrew Stein
  • Andria Arisal
  • Animesh Narayan Dangwal
  • Ankur Singh
  • Annajiat Alim Rasel
  • Annie Ma-Weaver
  • Anthony Richard Tiradani
  • Antonio Perez Fernandez
  • Antonio Perez-Calero Yzquierdo
  • Anuradha Samajdar
  • Asaph Zemach
  • Baptiste MARY
  • Ben Jones
  • Berenice Cervantes
  • Bert DeKnuydt
  • Bertrand Rigaud
  • Boris Sadkhin
  • Brian Bockelman
  • Brian Hua Lin
  • Bruno Coimbra
  • Bryce Cousins
  • Burt Holzman
  • Carles Acosta
  • Carsten Aulbert
  • Catalin Condurache
  • Cheryl Zhang
  • Chloe Huang
  • Chris Brew
  • Chris Reynolds
  • Chris Theis
  • Christian Neissner
  • Christina Koch
  • christoph beyer
  • Christophe MIGEON
  • Clark Gaylord
  • Cleber Paiva de Souza
  • Clemens Lange
  • Colin Smith
  • Craig Parker
  • Csaba Hajdu
  • Dan Moraru
  • Darek Kedra
  • Dave Dykstra
  • David Berghaus
  • David Cohen
  • David Rebatto
  • David Schultz
  • Davide Michelino
  • Dennis Yip
  • Doina Cristina Duma
  • Doug Benjamin
  • Edita Kizinevic
  • Elisabetta Vilucchi
  • Emanuele Leonardi
  • Emanuele Simili
  • Emmanouil Vamvakopoulos
  • Eric Chassande-Mottin
  • Eric Fede
  • Eric Gross
  • Eric Winter
  • Erich Birngruber
  • Evgeniy Kuznetsov
  • Fabio Hernandez
  • Farrukh Khan
  • Federica Fanzago
  • Feyza Eryol
  • Francesco Prelz
  • Frank Sauerburger
  • Frederic Gillardo
  • Frederique Chollet
  • Gabor Biro
  • Gabriele Garzoglio
  • Gang Chen
  • Garhan Attebury
  • Gavin McCance
  • Geonmo Ryu
  • Gian Luigi D'Alessandro
  • Gianmauro Cuccuru
  • Giuseppe Di Biase
  • Giuseppe Platania
  • Giusy Sergi
  • Glen MacLachlan
  • Grant Goodyear
  • Greg Daues
  • Gregorio Carullo
  • Gregory Mendell
  • Gregory Thain
  • Guilherme Sousa
  • Guillaume Cochard
  • Guy Tel-Zur
  • Götz Waschk
  • Harald van Pee
  • Haykuhi Musheghyan
  • Hector Camilo Zambrano Hernandez
  • Heinz-Hermann Adam
  • Helge Meinhard
  • Hemanta Kumar G
  • Ian Loader
  • Ian Smith
  • Ido Shamay
  • Ievgen Sliusar
  • issouf kindo
  • Iuri La Rosa
  • Jaideep Joshi
  • Jaime Frey
  • James Letts
  • James Thorne
  • James Walder
  • Jamie Rajewski
  • Janusz Oleniacz
  • Jaroslava Schovancova
  • Jason Patton
  • Jaydeep Mody
  • Jean-Claude Chevaleyre
  • Jeff Templon
  • Jian Yang
  • Jim Basney
  • Jiri Chudoba
  • John Kewley
  • John Knoeller
  • John Sanabria
  • Jos Daleman
  • Jose Caballero Bejar
  • Jose Flix Molina
  • JOSEPH ABENA ABENA
  • Joseph Areeda
  • Josh Drake
  • Josh Karpel
  • Juan Luis Font
  • Julio Ibarra
  • Junheng Wang
  • karan bhatia
  • Karen Fernsler
  • Kevin Fitzptrick
  • Kevin Heinold
  • Kevin Kissell
  • Kevin Retzke
  • Kosta Polyzos
  • Krishnaiah Marichetty
  • Kruno Sever
  • Laurren Michael
  • Lepeke Phukungoane
  • Ludovic DUFLOT
  • Luis Fernandez Alvarez
  • Luke Thorne
  • Maciej Pawlik
  • Manuel Giffels
  • Marco Mambelli
  • Marco Mascheroni
  • Marcus Ebert
  • Maria Acosta Flechas
  • Marian Zvada
  • Mark Baker
  • Mark Coatsworth
  • Martin Gasthuber
  • Martin Sajdl
  • Mary Hester
  • Massimo Biasotto
  • Matthew West
  • Matyas Selmeci
  • Max Fischer
  • Merina Albert
  • Michael Leech
  • Michael McClenahan
  • Michael Pelletier
  • Michel Jouvin
  • Miguel Viana
  • Mihai Constantin Duta
  • Mike Stanfield
  • Mirica Yancey
  • Miron Livny
  • Nicholas Von Wolff
  • Nikola Hardi
  • NILOTPAL MRINAL
  • Nils Høimyr
  • Oliver Freyermuth
  • Orest Dorosh
  • Ornella Juliana Piccinni
  • Pablo Llopis Sanmillan
  • Paige Kulzer
  • Paris Gianneios
  • Patryk Lason
  • Pau Cutrina Vilalta
  • Peet Whittaker
  • Peter Wienemann
  • Philippe Grassia
  • Philippe SERAPHIN
  • Prajesh Sharma
  • Prasun Singh Roy
  • Punnatat Thaprasop
  • RASHMI RANJAN ROUTARAY
  • Raymond Yeung
  • Robert Frank
  • Sabry Razick
  • Sam Newman
  • Sang Un Ahn
  • Sanjit Sahu
  • Saqib Haleem
  • Sean Murray
  • Sean Sweeney
  • Selcuk Bilmis
  • Sergio Fantinel
  • Shalini Epari
  • Shaurya Chanana
  • Shiyan Wei
  • Shkelzen Rugovac
  • Shreyas Bhat
  • Sophie Ferry
  • Srinivasa R
  • Stefano Dal Pra
  • Steffen Grunewald
  • Stephane GERARD
  • Stephen Boyack
  • Steve Elliott
  • Stuart Anderson
  • Sudeep Narayan Banerjee
  • Swathi Ramesh
  • Thomas Hartmann
  • Thomas Kiesel
  • Tim Theisen
  • Todd Miller
  • Todd Tannenbaum
  • Tomas Lindén
  • Tullio Macorini
  • Vanessa Hamar
  • Victor Mendoza
  • Vikas Singhal
  • Vikram Gazula
  • Vishal Mahendra
  • Werner Koppelstätter
  • Xavier Ouvrard
  • Yann COSTES
  • Zacarias Benta
  • Zach Miller
  • Zalak Shah
    • 14:00 14:50
      Hallway time 50m https://cern.zoom.us/j/94530716058
    • 14:50 16:40
      Workshop session https://cern.zoom.us/j/97987309455

      Conveners: Michel Jouvin (Université Paris-Saclay (FR)), Catalin Condurache (EGI Foundation), Gregory Thain (University of Wisconsin-Madison)
    • 16:40 16:55
      Break 15m
    • 16:55 18:05
      Workshop session https://cern.zoom.us/j/97987309455

      Conveners: Chris Brew (Science and Technology Facilities Council STFC (GB)), Christoph Beyer, Helge Meinhard (CERN)
      • 16:55
        HTCondor deployment at CC-IN2P3 20m

        In recent months, HTCondor has been the main workload management system for the Grid environment at CC-IN2P3. The computing cluster consists of ~640 worker nodes of various types, which deliver a total of ~27K execution slots (including hyperthreading). The system supports the LHC experiments (ALICE, ATLAS, CMS and LHCb) as a Tier-1 site under the umbrella of the Worldwide LHC Computing Grid (WLCG), as well as various other experiments and research groups under the umbrella of the European Grid Infrastructure (EGI). This presentation will provide a brief description of the installation and configuration aspects of the HTCondor cluster. In addition, we will present the use of the HTCondor-CE grid gateway at CC-IN2P3.

        Speaker: Dr Emmanouil Vamvakopoulos (CCIN2P3/CNRS)
      • 17:15
        Replacing LSF with HTCondor: the INFN-T1 experience. 20m

        CNAF started working with HTCondor during spring 2018,
        planning to move its Tier-1 Grid Site based on CREAM-CE and LSF
        Batch System to HTCondor-CE and HTCondor. The phase out of CREAM and
        LSF was completed by spring 2020. This talk describes our experience
        with the new system, with particular focus on HTCondor.

        Speaker: Stefano Dal Pra (Universita e INFN, Bologna (IT))
      • 17:35
        HTCondor Philosophy and Architecture Overview 30m
        Speaker: Todd Tannenbaum (Univ of Wisconsin-Madison, Wisconsin, USA)
    • 18:05 18:30
      Hallway time 25m https://cern.zoom.us/j/94530716058

    • 14:00 14:50
      Hallway time 50m https://cern.zoom.us/j/94530716058

    • 14:50 16:25
      Workshop session https://cern.zoom.us/j/97987309455

      Conveners: Helge Meinhard (CERN), Jose Flix Molina (Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas), Michel Jouvin (Université Paris-Saclay (FR))
      • 14:50
        What is new in HTCondor? What is upcoming? 20m
        Speaker: Todd Tannenbaum (Univ of Wisconsin-Madison, Wisconsin, USA)
      • 15:10
        Installing HTCondor 15m
        Speaker: Mark Coatsworth (UW Madison)
      • 15:25
        Pslots, draining, backfill: Multicore jobs and what to do with them 20m
        Speaker: Gregory Thain (University of Wisconsin-Madison)
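
        The sketch below is a minimal illustration related to this topic, assuming only the htcondor Python bindings and a reachable pool collector: it lists the resources still unallocated in each partitionable slot, one ingredient in draining and backfill decisions.

        import htcondor

        # On a partitionable slot ad, Cpus and Memory report the resources that
        # have not yet been carved out into dynamic slots.
        coll = htcondor.Collector()
        pslots = coll.query(htcondor.AdTypes.Startd,
                            constraint="PartitionableSlot =?= True",
                            projection=["Machine", "Cpus", "Memory"])

        for ad in pslots:
            print(f'{ad["Machine"]}: {ad["Cpus"]} idle cores, {ad["Memory"]} MB free')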
      • 15:45
        HTC at DESY 20m

        In 2016, the local (BIRD) and Grid batch facilities at DESY were migrated to HTCondor. This talk will cover some of the experiences and developments we have seen since then, as well as the plans for the future of HTC at DESY.

        Speaker: Christoph Beyer
      • 16:05
        HTCondor at GRIF 20m

        GRIF is a distributed Tier-2 WLCG site grouping four laboratories in the Paris region (IJCLab, IRFU, LLR, LPNHE). Multiple HTCondor instances have been deployed at GRIF for several years. In particular, an ARC-CE + HTCondor system provides access to the computing resources of IRFU, and a distributed HTCondor pool, with CREAM-CE and HTCondor-CE gateways, gives unified access to the IJCLab and LLR resources. This talk gives a quick overview of the HTCondor installations at GRIF and some feedback from the GRIF grid administrators.

        Speaker: Andrea Sartirana (Centre National de la Recherche Scientifique (FR))
    • 16:25 16:40
      Break 15m
    • 16:40 18:00
      Workshop session https://cern.zoom.us/j/97987309455

      Conveners: Christoph Beyer, Michel Jouvin (Université Paris-Saclay (FR)), Todd Tannenbaum (Univ of Wisconsin-Madison, Wisconsin, USA)
      • 16:40
        Archival, anonymization and presentation of HTCondor logs with GlideinMonitor 20m

        GlideinWMS is a pilot framework to provide uniform and reliable HTCondor clusters using heterogeneous and unreliable resources. The Glideins are pilot jobs that are sent to the selected nodes, test them, set them up as desired by the user jobs, and ultimately start an HTCondor startd to join an elastic pool. These Glideins collect information that is very useful to evaluate the health and efficiency of the worker nodes, and invaluable for troubleshooting when something goes wrong. This includes local stats, the results of all the tests, and the HTCondor log files; it is packed up and sent to the GlideinWMS Factory.
        Access to these logs for developers requires a lengthy back and forth with Factory operators and manual digging into files. Furthermore, these files contain information such as user IDs, email addresses and IP addresses that we want to protect and limit access to.
        GlideinMonitor is a web application that makes these logs more accessible and useful:
        - it organizes the logs in an efficient compressed archive
        - it allows users to search, unpack, and inspect them, all in a convenient and secure web interface
        - via plugins like the log anonymizer, it can redact protected information while preserving the parts useful for troubleshooting

        Speaker: Marco Mambelli (University of Chicago (US))
      • 17:00
        Status and Plans of HTCondor Usage in CMS 20m

        The resource needs of high-energy physics experiments such as CMS at the LHC are expected to grow in terms of the amount of data collected and the computing resources required to process these data. Computing needs in CMS are addressed through the "Global Pool", a vanilla dynamic HTCondor pool created through the glideinWMS software. With over 250k cores, the CMS Global Pool is the biggest HTCondor pool in the world, living at the forefront of HTCondor's limits and facing unique challenges. In this contribution, we will give an overview of the Global Pool, focusing on the workflow managers connected to it and the unique HTCondor features they use. We will then describe the monitoring tools developed to make sure the pool works correctly, and analyze the efficiency and scalability challenges faced by the CMS experiment. Finally, plans and challenges for the future will be addressed.

        Speaker: Marco Mascheroni (Univ. of California San Diego (US))
      • 17:20
        Classified Ads in HTCondor 20m
        Speaker: Jaime Frey (University of Wisconsin Madison (US))
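
        ClassAds are the schema-free attribute and expression language HTCondor uses to describe jobs, machines and matchmaking. A minimal sketch with the classad module from the HTCondor Python bindings (the attribute names and values are invented for illustration):

        import classad

        # A machine-like ad plus a requirements-style expression evaluated against it.
        machine = classad.ClassAd({"Cpus": 8, "Memory": 16384, "HasSingularity": True})
        machine["MeetsRequest"] = classad.ExprTree("Cpus >= 4 && Memory >= 2048")

        print(machine.eval("MeetsRequest"))   # -> True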
      • 17:40
        Job Submission Transformations 20m
        Speaker: John Knoeller (University of Wisconsin-Madison)
    • 18:00 19:00
      Office hour
      • 18:00
        Administrating HTCondor at a local site 1h https://cern.zoom.us/j/92420227039

        For system admins installing and/or configuring an HTCondor pool on their campus

      • 18:00
        General Office Hour Lobby 1h https://cern.zoom.us/j/97987309455

        For general questions, open discussions, getting started

      • 18:00
        HTCondor-CE, Grid, and Federation 1h https://cern.zoom.us/j/98439799794

        Questions about grid/cloud: CE, OSG, WLCG, EGI, bursting to HPC/Cloud, etc.

      • 18:00
        Using HTCondor 1h https://cern.zoom.us/j/94530716058

        For people who want to submit workflows and have questions about using the command line tools or developer APIs (Python, REST)

    • 14:00 14:50
      Hallway time 50m https://cern.zoom.us/j/94530716058

    • 14:50 16:25
      Workshop session https://cern.zoom.us/j/97987309455

      Conveners: Catalin Condurache (EGI Foundation), Todd Tannenbaum (Univ of Wisconsin-Madison, Wisconsin, USA), Jose Flix Molina (Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas)
      • 14:50
        HTCondor-CE Overview 35m
        Speaker: Brian Hua Lin (University of Wisconsin - Madison)
      • 15:25
        Replacing CREAM-CE with HTCondor-CE: the INFN-T1 experience 20m

        CNAF started working with the HTCondor Computing Element from May
        2018, planning to move its Tier-1 Grid Site based on CREAM-CE and LSF
        Batch System to use HTCondor-CE and HTCondor. The phase out of CREAM
        and LSF was completed by spring 2020. This talk describes our
        experience with the new system, with particular focus on HTCondor-CE.

        Speaker: Stefano Dal Pra (Universita e INFN, Bologna (IT))
      • 15:45
        HTCondor-CE Configuration 20m
        Speaker: Brian Hua Lin (University of Wisconsin - Madison)
      • 16:05
        How I Learned to Stop Worrying and Love the HTCondor-CE 20m

        This contribution provides firsthand experience of adopting HTCondor-CE at the German WLCG sites DESY and KIT. Covering two sites plus a remote setup for RWTH Aachen, we share our lessons learned in pushing HTCondor-CE to production. With a comprehensive recap ranging from the technical setup, via a detour into surviving the ecosystem and accounting, to the practical Dos and Don'ts, this contribution is suitable for everyone who is considering, struggling with, or already successful in adopting HTCondor-CE.

        Speaker: Max Fischer (Karlsruhe Institute of Technology)
    • 16:25 16:40
      Break 15m
    • 16:40 18:00
      Workshop session https://cern.zoom.us/j/97987309455

      Conveners: Gregory Thain (University of Wisconsin-Madison), Jose Flix Molina (Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas), Christoph Beyer
      • 16:40
        HTCondor-CE Live Installation 15m
        Speaker: Brian Hua Lin (University of Wisconsin - Madison)
      • 16:55
        HTCondor-CE Troubleshooting 15m
        Speaker: Brian Hua Lin (University of Wisconsin - Madison)
      • 17:10
        What is next for the HTCondor-CE? 10m
        Speaker: Brian Hua Lin (University of Wisconsin - Madison)
      • 17:20
        Running a large multi-purpose HTCondor pool at CERN 20m

        A review of how we run and operate a large multi-purpose HTCondor pool with grid submission, local submission and dedicated resources. We use grid and local submission to drive the utilisation of shared resources, and transforms and routers to ensure that jobs end up on the correct resources and are accounted correctly. We will review our automation and monitoring tools, together with the integration of externally hosted and opportunistic resources.

        Speaker: Ben Jones (CERN)
      • 17:40
        Challenge of the Migration of the RP-Coflu-Cluster @ CERN 20m

        The Coflu Cluster, also known as the Radio-Protection (RP) Cluster, started as an experimental project at CERN in 2007, involving a few standard desktop computers. It was envisaged to have a job scheduling system and a common storage space so that multiple Fluka simulations could be run in parallel and monitored, using a custom-built and easy-to-use web interface.

        The infrastructure is composed of approximately 500 cores and relies on HTCondor as an open-source high-throughput computing software framework for the execution of Fluka simulation jobs. Before the migration that was carried out over the last three months, the nodes were running Scientific Linux 6 and, mostly, the latest HTCondor 7 version. The web interface (based on JavaScript and PHP) that allows job submission relied heavily on the Quill database hosted in CERN's "database on demand" infrastructure.

        In this talk, we discuss the migration of HTCondor to its latest version on our infrastructure, which required solving several challenges: replacing the Quill database used intensively by the web interface for the submission and management of jobs, and updating the whole system with the least interruption of production by gradually migrating its components to both the latest version of HTCondor and CentOS 7.

        We conclude the presentation with the project of migrating this infrastructure to the CERN HTCondor pool.

        Speaker: Xavier Eric Ouvrard (CERN)
    • 18:00 18:30
      Hallway time 30m https://cern.zoom.us/j/94530716058

    • 14:00 14:50
      Hallway time 50m https://cern.zoom.us/j/94530716058

    • 14:50 16:30
      Workshop session https://cern.zoom.us/j/97987309455

      Conveners: Todd Tannenbaum (Univ of Wisconsin-Madison, Wisconsin, USA), Chris Brew (Science and Technology Facilities Council STFC (GB)), Helge Meinhard (CERN)
      • 14:50
        HTCondor with Containers and Kubernetes 25m
        Speaker: Gregory Thain (University of Wisconsin-Madison)
      • 15:15
        Combining cloud-native workflows with HTCondor jobs 20m

        The majority of physics analysis jobs at CERN are run on high-throughput computing batch systems such as HTCondor. However, not everyone has access to computing farms, e.g. theorists wanting to make use of CMS Open Data, and for reproducible workflows more backend-agnostic approaches are desirable. The industry standard here is containers orchestrated with Kubernetes, for which computing resources can easily be acquired on demand using public cloud offerings. This causes a disconnect between how current HEP physics analyses are performed and how they could be reused: when developing a fully "cloud-native" computing approach for physics analysis, one still needs access to the tens of thousands of cores available on classical batch systems to have sufficient resources for the data processing.

        In this presentation, I will demonstrate how complex physics analysis workflows that are written and scheduled using a rather small Kubernetes cluster can make use of CERN's HTCondor installation. An "operator" is used to submit jobs to HTCondor and, once they have completed, collect the results and continue the workflow in the cloud. The audience will also learn about the important role that software containers and Kubernetes play in the context of open science.

        Speaker: Clemens Lange (CERN)
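
        The submit-and-wait core of such an operator could look roughly like the sketch below, using the HTCondor Python bindings; this is not the speaker's actual implementation, and the executable and file names are placeholders.

        import time
        import htcondor

        submit = htcondor.Submit({
            "executable": "/bin/hostname",   # placeholder for the real analysis payload
            "output": "step.out",
            "error": "step.err",
            "log": "step.log",
        })

        schedd = htcondor.Schedd()
        with schedd.transaction() as txn:
            cluster_id = submit.queue(txn)

        # Poll the queue; once no jobs of this cluster remain, the Kubernetes
        # workflow can move on to collecting the outputs.
        while schedd.query(f"ClusterId == {cluster_id}", ["JobStatus"]):
            time.sleep(30)
        print(f"cluster {cluster_id} finished")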
      • 15:35
        HTCondor in Production: Seamlessly automating maintenance, OS and HTCondor updates, all integrated with HTCondor's scheduling 20m

        Our HTC cluster using HTCondor was set up at Bonn University in 2017/2018. All infrastructure is fully puppetised, including the HTCondor configuration.

        OS updates are fully automated, and the reboots necessary for security patches are scheduled in a staggered fashion, backfilling all draining nodes with short jobs to maximize throughput. Additionally, draining can also be scheduled for planned maintenance periods (with optional backfilling), and tasks to be executed before a machine is rebooted or shut down can be queued. This is combined with a series of automated health checks with broad coverage of temporary and long-term machine failures or overloads, and monitoring performed using Zabbix.

        In the last year, heterogeneous resources with different I/O capabilities have been integrated and MPI support has been added. All jobs run inside Singularity containers, also allowing for interactive, graphical sessions with GPU access.

        Speaker: Oliver Freyermuth (University of Bonn (DE))
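
        One building block of such automation could be sketched as follows with the HTCondor Python bindings; NODE_HEALTH is a hypothetical, site-defined startd attribute, and the actual Bonn setup may differ.

        import subprocess
        import htcondor

        coll = htcondor.Collector()
        unhealthy = coll.query(htcondor.AdTypes.Startd,
                               constraint='NODE_HEALTH =!= undefined && NODE_HEALTH != "ok"',
                               projection=["Machine", "NODE_HEALTH"])

        # Gracefully drain each affected machine: stop matching new jobs and let
        # the running jobs finish.
        for machine in sorted({ad["Machine"] for ad in unhealthy}):
            subprocess.run(["condor_drain", "-graceful", machine], check=False)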
      • 15:55
        HTCondor Annex: Bursting into Clouds 20m
        Speaker: Todd Lancaster Miller (University of Wisconsin Madison (US))
      • 16:15
        CHTC Partners with Google Cloud to Make HTCondor Available on the Google Cloud Marketplace 15m

        We're excited to share the launch of the HTCondor offering on the Google Cloud Marketplace, built by Google software engineer Cheryl Zhang with advice and support from the experts at the CHTC. Come see how quickly and easily you can start using HTCondor on Google Cloud with this new solution.

        Speaker: Cheryl Zhang (Google Cloud)
    • 16:30 16:45
      Break 15m
    • 16:45 18:00
      Workshop session https://cern.zoom.us/j/97987309455

      Conveners: Catalin Condurache (EGI Foundation), Helge Meinhard (CERN), Gregory Thain (University of Wisconsin-Madison)
      • 16:45
        HTCondor Offline: Running on isolated HPC Systems 20m
        Speaker: Jaime Frey (University of Wisconsin Madison (US))
      • 17:05
        HEPCloud use of HTCondor to access HPC Centers 15m

        HEPCloud is working to integrate isolated HPC Centers, such as Theta at Argonne
        National Laboratory, into the pool of resources made available to its user
        community. Major obstacles to using these centers include limited or no outgoing
        networking and restrictive security policies. HTCondor has provided a mechanism
        to execute jobs in a manner that satisfies the constraints and policies. In
        this talk we will discuss the various ways we use HTCondor to collect and execute
        jobs on Theta.

        Speaker: Anthony Richard Tiradani (Fermi National Accelerator Lab. (US))
      • 17:20
        HPC backfill with HTCondor at CERN 20m

        The bulk of computing at CERN consists of embarrassingly parallel HTC use cases (Jones, Fernandez-Alvarez et al.); however, for MPI applications, e.g. for accelerator physics and engineering, a dedicated HPC cluster running SLURM is used. In order to optimize the utilization of the HPC cluster, idle nodes in the SLURM cluster are backfilled with Grid HTC workloads. This talk will detail the HTCondor-CE setup that enables backfilling the SLURM HPC cluster with pre-emptible Grid jobs.

        Speaker: Pablo Llopis Sanmillan (CERN)
      • 17:40
        HTCondor monitoring at ScotGrid Glasgow 20m

        Our Tier-2 cluster (ScotGrid, Glasgow) uses HTCondor as its batch system, combined with ARC-CE as the front-end for job submission and ARGUS for authentication and user mapping.
        On top of this, we have built a central monitoring system based on Prometheus that collects, aggregates and displays metrics on custom Grafana dashboards. In particular, we extract job information by regularly parsing the output of 'condor_status' on the condor_manager, scheduler and worker nodes.
        A collection of graphs gives a quick overview of cluster performance and helps identify emerging issues. Logs from all nodes and services are also collected on a central Loki server and retained over time.

        Speaker: Emanuele Simili (University of Glasgow)
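
        As a rough equivalent of parsing condor_status output, slot-state counts can also be gathered directly with the HTCondor Python bindings; the metric name below is invented for illustration and the actual ScotGrid exporter may work differently.

        from collections import Counter
        import htcondor

        coll = htcondor.Collector()
        slots = coll.query(htcondor.AdTypes.Startd,
                           projection=["Name", "State", "Activity"])

        # Count slots per state and print them in a Prometheus-style text format.
        states = Counter(ad.get("State", "Unknown") for ad in slots)
        for state, count in sorted(states.items()):
            print(f'condor_slot_state{{state="{state}"}} {count}')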
    • 18:00 19:00
      Office hour
      • 18:00
        Administrating HTCondor at a local site 1h https://cern.zoom.us/j/92420227039

        For system admins installing and/or configuring an HTCondor pool on their campus

      • 18:00
        General Office Hour Lobby 1h https://cern.zoom.us/j/97987309455

        For general questions, open discussions, getting started

      • 18:00
        HTCondor-CE, Grid, and Federation 1h https://cern.zoom.us/j/98439799794

        Questions about grid/cloud: CE, OSG, WLCG, EGI, bursting to HPC/Cloud, etc.

      • 18:00
        Using HTCondor 1h https://cern.zoom.us/j/94530716058

        For people who want to submit workflows and have questions about using the command line tools or developer APIs (Python, REST)

    • 14:00 14:50
      Hallway time 50m https://cern.zoom.us/j/94530716058

    • 14:50 16:15
      Workshop session https://cern.zoom.us/j/97987309455

      Conveners: Chris Brew (Science and Technology Facilities Council STFC (GB)), Gregory Thain (University of Wisconsin-Madison), Catalin Condurache (EGI Foundation)
      • 14:50
        HTCondor at Nikhef 20m

        The Physics Data Processing group at Nikhef is developing a Condor-based cluster, after a 19-year absence from the HTCondor community. This talk will discuss why we are developing this cluster, and present our plans and the results so far. It will also spend a slide or two on the potential to use HTCondor for other services we provide.

        Speaker: Jeff Templon (Nikhef National institute for subatomic physics (NL))
      • 15:10
        HTCondor's Python API - The Python Bindings 25m
        Speaker: Jason Patton (UW Madison)
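
        A minimal taste of the bindings, assuming the htcondor module and access to a schedd (the attributes used are standard job-ad attributes):

        import htcondor

        schedd = htcondor.Schedd()
        # JobStatus == 1 means Idle; request only the attributes we need.
        idle = schedd.query("JobStatus == 1",
                            ["ClusterId", "ProcId", "Owner", "RequestCpus"])

        for job in idle:
            print(f'{job["ClusterId"]}.{job["ProcId"]}  {job["Owner"]}  '
                  f'cpus={job.get("RequestCpus", 1)}')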
      • 15:35
        HTMap: Pythonic High Throughput Computing 10m
        Speaker: Todd Tannenbaum (Univ of Wisconsin-Madison, Wisconsin, USA)
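
        The flavour of HTMap, as a minimal sketch assuming the htmap package is installed and configured against an HTCondor pool:

        import htmap

        def double(x):
            return 2 * x

        doubled = htmap.map(double, range(10))   # each call becomes an HTCondor job
        print(list(doubled))                     # waits for the jobs and gathers the outputs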
      • 15:45
        Lightweight Site-Specific Dask Integration for HTCondor at CHTC 20m

        Dask is an increasingly popular tool for both low-level and high-level parallelism in the Scientific Python ecosystem. I will discuss efforts at the Center for High Throughput Computing at UW-Madison to enable users to run Dask-based work on our HTCondor pool. In particular, we have developed a "wrapper package" based on existing work in the Dask ecosystem that lets Dask spawn workers in the CHTC pool without users needing to be aware of the infrastructure constraints we are operating under. We believe this approach is useful as a lightweight alternative to dedicated, bespoke infrastructure like Dask Gateway.

        Speaker: Mr Matyas Selmeci (University of Wisconsin - Madison)
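
        For comparison, the generic dask_jobqueue route is sketched below; this is not the CHTC-specific wrapper described above, and it assumes the dask and dask-jobqueue packages plus submit access to an HTCondor pool.

        from dask.distributed import Client
        from dask_jobqueue import HTCondorCluster

        # Each Dask worker runs as an HTCondor job with these resource requests.
        cluster = HTCondorCluster(cores=1, memory="2 GB", disk="1 GB")
        cluster.scale(jobs=4)

        client = Client(cluster)
        futures = client.map(lambda x: x ** 2, range(100))
        print(sum(client.gather(futures)))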
      • 16:05
        REST API to HTCondor 10m
        Speaker: Matyas Selmeci (University of Wisconsin - Madison)
    • 16:15 16:30
      Break 15m
    • 16:30 17:55
      Workshop session https://cern.zoom.us/j/97987309455

      Conveners: Jose Flix Molina (Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas), Christoph Beyer, Chris Brew (Science and Technology Facilities Council STFC (GB))
      • 16:30
        HTCondor Security: Philosophy and Administration Changes 30m
        Speakers: Zach Miller, Brian Paul Bockelman (University of Wisconsin Madison (US))
      • 17:00
        From Identity-Based Authorization to Capabilities: SciTokens, JWTs, and OAuth 20m

        In this presentation, I will introduce the SciTokens model (https://scitokens.org/) for federated capability-based authorization in distributed scientific computing. I will compare the OAuth and JWT security standards with X.509 certificates, and I will discuss ongoing work to migrate HTCondor use cases from certificates to tokens.

        Speaker: Jim Basney (University of Illinois)
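
        To make the capability idea concrete: a SciToken is a JWT whose scope claim lists what the bearer may do rather than who they are. The sketch below decodes a token payload for inspection only (no signature verification); the example claims are invented.

        import base64
        import json

        def peek_claims(token: str) -> dict:
            # Return the JWT payload as a dict, without verifying the signature.
            payload = token.split(".")[1]
            payload += "=" * (-len(payload) % 4)   # restore base64 padding
            return json.loads(base64.urlsafe_b64decode(payload))

        # A decoded payload might look like:
        # {"iss": "https://cms.example.org", "sub": "workflow-123",
        #  "scope": "compute.create read:/store/user/wf123 write:/store/user/wf123",
        #  "exp": 1600000000}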
      • 17:20
        Allow HTCondor jobs to securely access services via OAuth token workflow 20m
        Speakers: Jason Patton (UW Madison), Zach Miller
      • 17:40
        Workshop wrap-up 15m
        Speaker: Helge Meinhard (CERN)
    • 17:55 18:30
      Hallway time 35m https://cern.zoom.us/j/94530716058