Conveners

Workshop presentations
- Helge Meinhard (CERN)
- Todd Tannenbaum (Univ of Wisconsin-Madison, Wisconsin, USA)
- Christoph Beyer
- Antonio Puertas Gallardo (European Commission)
- Catalin Condurache (Science and Technology Facilities Council STFC (GB))
- Chris Brew (Science and Technology Facilities Council STFC (GB))
- Jose Flix Molina (Centro de Investigaciones Energéticas Medioambientales y Tecno)

Workshop logistics
HTCondor uses the ClassAd language in three different ways. This tutorial will cover the full syntax of the ClassAd language, its uses in HTCondor, and advanced topics in ClassAd usage for system administration and monitoring.
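As a flavor of the language, here is a small machine-ad sketch in new-ClassAd syntax (the attribute values are illustrative, not taken from the talk):

```
[
    /* A machine ad fragment: attributes describing this resource */
    Memory = 2048;
    OpSys  = "LINUX";
    /* Requirements is an expression matched against a candidate job ad:
       TARGET refers to the job ad, MY to this machine ad */
    Requirements = (TARGET.RequestMemory <= MY.Memory)
]
```

The same expression syntax appears in configuration files, submit files, and the ads exchanged during matchmaking, which is what makes the language reusable across the three contexts.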
Clusters running differently sized jobs can easily suffer from fragmentation: Large chunks of free resources are required to run larger jobs, but smaller jobs can block parts of these chunks, making the remainder too small. For example, clusters in the WLCG must provide space for 8-core jobs, while there is a constant pressure of 1-core jobs. Common approaches to this issue are the DEFRAG...
This tutorial covers the basic installation and configuration of the HTCondor system. Theory of operation and system architecture are also covered.
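For orientation, a minimal configuration sketch for a single-node setup (the hostname is a placeholder; a real installation would use its own central manager):

```
# Point this node at the pool's central manager
CONDOR_HOST = central-manager.example.org

# Run an execute slot (STARTD) and a submit point (SCHEDD)
# alongside the master on this node
DAEMON_LIST = MASTER, STARTD, SCHEDD
```

These two knobs are the usual starting point; the tutorial covers the many further policy and security settings layered on top.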
Distinguishing characteristics of High Throughput Computing (HTC), including how it contrasts with High Performance Computing (HPC). When is HTC appropriate, and when is HPC? The talk also covers lessons and best practices learned from running the Open Science Grid, a 100+ institution distributed HTC environment.
The University of Oxford Tier-2 Grid cluster converted to using HTCondor in 2014. At that time, there was no suitable monitoring tool available. The Oxford team developed a command line tool, written in Python, that displays snapshot information about the running jobs. The tool provides the capability of reporting on the number of jobs running on a given node and the efficiency of each job....
An overview of recent developments and future plans in HTCondor.
The HTCondor-CE provides a remote API on top of a local site batch system.
HTCondor has been the primary production batch service at CERN for the last couple of years, passing the 100k core mark last year. The challenge has been to scale the service, not only in terms of the number of resources, but also in terms of the number of heterogeneous use cases. The use cases involve dedicated LHC Tier-0 pools, dedicated resources within standard pools, special CE routes to...
This tutorial covers HTCondor's "Fair Share" mechanisms for assigning resources to users, configuring groups of users with quotas, and other aspects of global policy via the HTCondor negotiator.
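A sketch of what group-quota configuration can look like on the negotiator (group names and quota values are illustrative):

```
# Define two accounting groups with static quotas
GROUP_NAMES = group_atlas, group_cms
GROUP_QUOTA_group_atlas = 400
GROUP_QUOTA_group_cms   = 600

# Let groups use idle surplus beyond their own quota
GROUP_ACCEPT_SURPLUS = True
```

Within each group, HTCondor's fair-share mechanism still arbitrates among individual users based on their recent usage.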
The talk provides some details of special DESY configurations. It focuses on features we need for user registry integration, node maintenance operations, and fair share / quota handling. With the help of job transforms defining job classes, and proper job duration and memory settings, we set up a smooth and transparent operating model.
Haggis is an information system used to map CERN users to HTCondor accounting groups, and to hold information about quota and priority allocation per accounting group, along with information relevant to resource usage accounting. It enforces a tree-like domain model that supports resource mapping under different compute pools. All the data stored in Haggis is completely manageable by the...
How HTCondor deals with network architecture difficulties.
Introduction to the HTCondor python bindings and their use to query HTCondor.
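A minimal query sketch using the bindings (assumes the `htcondor` Python module is installed and a pool collector is reachable; the hostname is a placeholder):

```python
import htcondor

# Contact the pool collector and fetch machine (startd) ads,
# projecting only the attributes we care about
coll = htcondor.Collector("cm.example.org")
ads = coll.query(htcondor.AdTypes.Startd,
                 projection=["Name", "State", "Activity", "Memory"])

for ad in ads:
    print(ad.get("Name"), ad.get("State"), ad.get("Memory"))
```

Projection keeps the query cheap on large pools, since the collector only serializes the requested attributes.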
Configuring an HTCondor cluster and keeping the configuration synchronised can be quite the chore. For this purpose, under the umbrella of HEP-Puppet, sysadmins have gathered to create a simple-to-use Puppet module. With just a few lines of YAML (hiera), you can configure your own HTCondor cluster within minutes (Puppet infrastructure provided). This talk will showcase the module with snippets...
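To illustrate the idea, a hypothetical hiera fragment for a worker node; these key names are purely illustrative and not necessarily the module's actual interface:

```yaml
# Hypothetical hiera data for an HTCondor worker node
# (key names are illustrative, not the real HEP-Puppet parameters)
htcondor::is_worker: true
htcondor::condor_host: 'cm.example.org'
htcondor::number_of_cpus: 16
```

The appeal of the hiera approach is that the per-role differences live in small data files, while the module handles package installation, templating, and daemon restarts.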
Tutorial on using python to submit jobs to HTCondor, concentrating on the 8.7 series improvements in the HTCondor python bindings.
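A submission sketch using the Submit class from the 8.7-series bindings (requires the `htcondor` module and a running schedd; the job description is illustrative):

```python
import htcondor

# Describe the job with the Submit class introduced in the 8.7 series
sub = htcondor.Submit({
    "executable": "/bin/sleep",
    "arguments": "60",
    "request_memory": "128MB",
})

# Queue one instance of the job inside a schedd transaction
schedd = htcondor.Schedd()
with schedd.transaction() as txn:
    cluster_id = sub.queue(txn)

print("submitted cluster", cluster_id)
```

The Submit object accepts the same key/value pairs as a submit description file, which makes it easy to port existing submit files into Python.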
Miron Livny would like to lead a discussion on how to best interface with HTCondor when working inside a Python environment, especially an interactive science-based environment such as Jupyter Notebook / Lab. We have been experimenting with some approaches at UW-Madison that we can share, but what we are looking for is an open discussion of ideas, feedback, and suggestions.
Learn how the Annex allows you to seamlessly expand your HTCondor pool using machines from Amazon EC2.
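A command-line sketch of the idea (the annex name and sizing are illustrative; AWS credentials and the one-time annex setup must already be in place):

```
# Add 10 on-demand EC2 instances to the pool for two hours
condor_annex -count 10 -annex-name MyAnnex -duration 2
```

The acquired instances join the existing pool and can be targeted (or avoided) by jobs via their annex name.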
Based on current trends and past experience, this talk will identify and discuss six key challenge areas that will continue to drive innovation in High Throughput Computing technologies in the years to come.
RAL Tier-1 originally used the PBS batch system for its Grid-related activities. Increased LHC operation requirements exposed scalability problems, so other batch systems were taken into consideration.
In this presentation we review the history of HTCondor at RAL and detail how it evolved from an initial conventional setup with cgroups for resource control to the current use of Docker...
HTCondor is a product, but it is not an application. Like operating systems, networks, database management systems, and security infrastructures, HTCondor is a general system, upon which other applications may be built.
Extra work is needed to create something useful from HTCondor. The extra work depends on the goals of the designer. This talk identifies a few general areas that need to be...
Access to both HTC and HPC facilities is vitally important to the fusion community, not only for plasma modelling but also for advanced engineering and design, materials research, rendering, uncertainty quantification and advanced data analytics for engineering operations. The computing requirements are expected to increase as the community prepares for the next generation facility, ITER....
Discussion of the language used by HTCondor for configuration and job submit files.
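For reference, a minimal submit description file showing the basic shape of the language (file names and resource requests are illustrative):

```
# A minimal submit description file
executable = analyze.sh
arguments  = input_$(Process).dat

output = out.$(Process)
error  = err.$(Process)
log    = job.log

request_cpus   = 1
request_memory = 1GB

# Queue ten instances, numbered 0..9 via $(Process)
queue 10
```

The same `$(macro)` substitution style also appears in HTCondor configuration files, which is part of what the talk covers.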
Request 30 Minute time slot.
DAGMan lets you manage large, complex workflows in HTCondor.
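A small DAG description illustrating the idea (submit file names are illustrative); a four-node "diamond" where B and C run after A, and D runs after both:

```
# diamond.dag -- a four-node diamond workflow
JOB  A  a.sub
JOB  B  b.sub
JOB  C  c.sub
JOB  D  d.sub

PARENT A CHILD B C
PARENT B C CHILD D

# Re-run the final node up to 3 times if it fails
RETRY D 3
```

`condor_submit_dag diamond.dag` then submits a DAGMan job that enforces the dependencies and handles retries.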
We believe that the distributed scientific computing community has unique authorization needs that can be met by utilizing common web technologies, such as OAuth 2.0 and JSON Web Tokens (JWT). The SciTokens team, a collaboration between technology providers including the HTCondor Project and domain scientists, is working to build and demonstrate a new authorization approach at scale.
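To make the token idea concrete, a small stdlib-only sketch of how JWT-style claims travel as base64url-encoded JSON; the claim values are illustrative, not normative SciTokens content:

```python
import base64
import json

# A hypothetical SciToken-style claims payload (values are illustrative)
claims = {
    "iss": "https://demo.scitokens.org",
    "scope": "read:/data write:/data/user",
    "exp": 1700000000,
}

# JWTs carry such claims as base64url-encoded JSON segments
segment = base64.urlsafe_b64encode(json.dumps(claims).encode()).rstrip(b"=")

# A recipient restores the padding and decodes the segment
decoded = json.loads(
    base64.urlsafe_b64decode(segment + b"=" * (-len(segment) % 4))
)
assert decoded == claims
```

In a real token this payload is signed by the issuer, so the capability statement in `scope` can be verified without a callback to a central service.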
In recent times, the CMS HTCondor Global Pool, which unifies access and management to all CPU resources available to the experiment, has been growing in size and evolving in its complexity, as new resources and job submit nodes are being added to the design originally conceived to serve the collaboration during the LHC Run 2. Having achieved most of our milestones for this period, the pool...
Nowadays computational resources come in a wide variety of forms: pilots running on sites, cloud resources, and spare cycles on desktops, laptops and even phones through volunteer computing. Our duty, as the Submission Infrastructure team at CMS, is to be able to use them all.
When it comes to integrating these different models into a single pool of resources, different challenges arise....
Geospatial data are one of the core data sources for scientific and technical support to the European Commission (EC) policies. For instance, the Copernicus programme of the European Union provides a vast amount of Earth Observation (EO) data for monitoring the environment through the Sentinel satellites operated by the European Space Agency. In terms of data management and processing, big...
Discussion of policy expressions available to users when they submit their HTCondor jobs, and expressions available to administrators when they configure HTCondor execute nodes. Time permitting, there will be a demonstration of special-purpose execution slots.
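A sketch of the two sides of the policy language (the expressions are illustrative examples, not recommended defaults):

```
# Submit-side (in the job's submit description file):
# where the job is willing to run, and how to rank matches
requirements = (OpSys == "LINUX") && (Memory >= 2048)
rank         = KFlops

# Execute-side (in condor_config on the worker):
# only start jobs outside 08:00-18:00 local time
# (ClockMin counts minutes since midnight)
START = (ClockMin < 480) || (ClockMin > 1080)
```

Both sides are ordinary ClassAd expressions, so the same operators and attribute references work in each.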
Request 60 Minute slot.
In 2013 the RAL Tier-1 switched its batch farm to using HTCondor. In the years following, several more UK sites have made the switch. The RAL Tier-1 batch farm now provides well over 20,000 job slots, and HTCondor is a key service delivering our pledged resources to the WLCG, now and for the foreseeable future.
New funding opportunities are available to provide computing in the UK to the "long tail"...
An overview of monitoring an HTCondor pool.
Overview of HTCondor's mechanisms in support of job isolation, including Docker, Singularity, cgroups, and namespace mounts.
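As one example of these mechanisms, a submit file sketch for the docker universe (the image and command are illustrative):

```
# Run the job inside a container via the docker universe
universe     = docker
docker_image = centos:7

executable = /bin/cat
arguments  = /etc/os-release
output     = docker.out
error      = docker.err
log        = docker.log

queue
```

From the user's point of view the job behaves like any other; the startd handles pulling the image and confining the job inside the container.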
A setup to share clusters that used to be owned and operated by experimental and theory sub-groups in the Physics Department of the University of Milan is described. Each sub-cluster is configured as a separate Condor Pool, reporting to one additional 'super'-collector. With a few assumptions on the available execution environment, plus mutually agreed priorities for 'local' jobs, this allows...
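A hedged configuration sketch of such a multi-pool arrangement (hostnames are placeholders; the actual Milan setup may differ):

```
# On each sub-pool's collector: forward ads to the shared
# 'super'-collector so the whole federation is visible in one place
CONDOR_VIEW_HOST = super-collector.example.org

# On submit nodes: allow local jobs to flock to the other sub-pools
# when the home pool is full
FLOCK_TO = pool-theory.example.org, pool-exp.example.org
```

Keeping each group's pool separate preserves local ownership and priorities, while the shared collector and flocking provide the federation.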
All members of the LIGO Scientific Collaboration have access to a handful of dedicated LIGO Data Grid clusters which feature HTCondor, system-installed software, the LIGO and Virgo data, and other standard components. Cardiff University also hosts a LIGO Data Grid site, but this is built on top of the shared institutional HPC cluster. In this talk I describe how I used HTCondor, Spack,...
Discussion of the Job Transform language in the HTCondor Schedd.
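A transform sketch in the later native syntax (8.9+; the 8.7 series expressed the same idea in an equivalent ClassAd-based form). It defaults RequestMemory for jobs that did not set one; the value is illustrative:

```
# Schedd-side job transform: give memory-less jobs a default request
JOB_TRANSFORM_NAMES = DefaultMem
JOB_TRANSFORM_DefaultMem @=end
   # DEFAULT only sets the attribute if the job left it undefined
   DEFAULT RequestMemory 2048
@end
```

Because transforms run in the schedd at submit time, site policy is applied uniformly without requiring users to change their submit files.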
Request 30 Minute time slot.