4–7 Sept 2018
RAL
Europe/London timezone

Session

Workshop presentations

4 Sept 2018, 14:00
CR12, R68 (RAL)


Science and Technology Facilities Council, Rutherford Appleton Laboratory, Harwell Campus, Didcot OX11 0QX, United Kingdom. Tel: +44 (0)1235 445 000. Fax: +44 (0)1235 445 808. Coordinates: N 51° 34' 27.6", W 1° 18' 52.6" (51.57433, -1.31462)

Conveners

Workshop presentations

  • Helge Meinhard (CERN)
  • Todd Tannenbaum (Univ of Wisconsin-Madison, Wisconsin, USA)
  • Christoph Beyer
  • Antonio Puertas Gallardo (European Commission)
  • Catalin Condurache (Science and Technology Facilities Council STFC (GB))
  • Chris Brew (Science and Technology Facilities Council STFC (GB))
  • Jose Flix Molina (Centro de Investigaciones Energéticas Medioambientales y Tecno)

  1. Andrew Sansum (STFC)
    04/09/2018, 14:00

    Short introduction to UKRI, STFC, RAL

  2. Miron Livny (University of Wisconsin-Madison)
    04/09/2018, 14:15
  3. Catalin Condurache (Science and Technology Facilities Council STFC (GB))
    04/09/2018, 14:25

    Workshop logistics

  4. Gregory Thain (University of Wisconsin - Madison)
    04/09/2018, 14:35
    HTCondor presentations and tutorials

    HTCondor uses the ClassAd language in three different ways. This tutorial will cover the full syntax of the ClassAd language, the uses in HTCondor, and advanced topics in ClassAd usages for system administration and monitoring.

  5. Max Fischer (GSI - Helmholtzzentrum für Schwerionenforschung GmbH (DE))
    04/09/2018, 15:10
    HTCondor presentations and tutorials

    Clusters running differently sized jobs can easily suffer from fragmentation: Large chunks of free resources are required to run larger jobs, but smaller jobs can block parts of these chunks, making the remainder too small. For example, clusters in the WLCG must provide space for 8-core jobs, while there is a constant pressure of 1-core jobs. Common approaches to this issue are the DEFRAG...

  6. Gregory Thain (University of Wisconsin - Madison)
    04/09/2018, 16:05
    HTCondor presentations and tutorials

    This tutorial covers the basic installation and configuration of the HTCondor system. Theory of operation and system architecture are also covered.
  7. Miron Livny (University of Wisconsin-Madison)
    04/09/2018, 17:10
    HTCondor presentations and tutorials

    Distinguishing characteristics of High Throughput Computing (HTC), including how it contrasts with High Performance Computing (HPC). When is HTC appropriate, when is HPC appropriate? Also lessons and best practices learned from experiences running the Open Science Grid, a 100+ institution distributed HTC environment.

  8. Mr Davda Vipul (University of Oxford)
    05/09/2018, 09:00
    HTCondor presentations and tutorials

    The University of Oxford Tier-2 Grid cluster converted to using HTCondor in 2014. At that time, there was no suitable monitoring tool available. The Oxford team developed a command line tool, written in Python, that displays snapshot information about the running jobs. The tool provides the capability of reporting on the number of jobs running on a given node and the efficiency of each job....

  9. Todd Tannenbaum (Univ of Wisconsin-Madison, Wisconsin, USA)
    05/09/2018, 09:25
    HTCondor presentations and tutorials

    An overview of recent developments and future plans in HTCondor.

  10. James Frey (University of Wisconsin Madison (US))
    05/09/2018, 10:00
    HTCondor presentations and tutorials

    The HTCondor-CE provides a remote API on top of a local site batch system.

  11. Ben Jones (CERN)
    05/09/2018, 11:05
    HTCondor presentations and tutorials

    HTCondor has been the primary production batch service at CERN for the last couple of years, passing the 100k core mark last year. The challenge has been to scale the service, not only in the number of resources but also in the number of heterogeneous use cases. The use cases involve dedicated LHC Tier-0 pools, dedicated resources within standard pools, special CE routes to...
  12. Gregory Thain (University of Wisconsin - Madison)
    05/09/2018, 11:30
    HTCondor presentations and tutorials

    This tutorial covers HTCondor's "Fair Share" mechanisms for assigning resources to users, configuring groups of users with quotas, and other aspects of global policy via the HTCondor negotiator.

  13. Thomas Finnern (DESY)
    05/09/2018, 14:00
    HTCondor presentations and tutorials

    The talk provides some details of special DESY configurations. It focuses on features we need for user registry integration, node maintenance operations and fair share / quota handling. With the help of job transformations defining job classes, and proper job duration and memory settings, we set up a smooth and transparent operating model.
  14. Mr Nikolaos Petros Triantafyllidis (CERN)
    05/09/2018, 14:25
    HTCondor presentations and tutorials

    Haggis is an information system used to map CERN users to HTCondor accounting groups. It also holds quota and priority allocations per accounting group, along with information relevant to resource usage accounting. It enforces a tree-like domain model that supports resource mapping under different compute pools. All the data stored in Haggis is completely manageable by the...
  15. James Frey (University of Wisconsin Madison (US))
    05/09/2018, 14:50
    HTCondor presentations and tutorials

    How HTCondor deals with network architecture difficulties.

  16. John Knoeller (University of Wisconsin-Madison)
    05/09/2018, 15:25
    HTCondor presentations and tutorials

    Introduction to the HTCondor Python bindings and their use to query HTCondor.
  17. Dr Lukasz Kreczko (University of Bristol (GB))
    05/09/2018, 16:30
    HTCondor presentations and tutorials

    Configuring a condor cluster and keeping the configuration synchronised can be quite the chore. For this purpose, under the umbrella of HEP-Puppet, sysadmins have gathered to create a simple-to-use Puppet module. With just a few lines of YAML (hiera) you can configure your own HTCondor cluster within minutes (Puppet infrastructure provided). This talk will showcase the module with snippets...

  18. John Knoeller (University of Wisconsin-Madison)
    05/09/2018, 16:55
    HTCondor presentations and tutorials

    Tutorial on using Python to submit jobs to HTCondor, concentrating on the 8.7 series improvements in the HTCondor Python bindings.
  19. Miron Livny (University of Wisconsin-Madison)
    05/09/2018, 17:30
    HTCondor presentations and tutorials

    Miron Livny would like to lead a discussion on how best to interface with HTCondor when working inside a Python environment, especially an interactive science-based environment such as Jupyter Notebook / Lab. We have been experimenting with some approaches at UW-Madison that we can share, but what we are looking for is an open discussion of ideas, feedback, and suggestions.
  20. James Frey (University of Wisconsin Madison (US))
    06/09/2018, 09:00
    HTCondor presentations and tutorials

    Learn how the Annex allows you to seamlessly expand your HTCondor pool using machines from Amazon EC2.
  21. Todd Tannenbaum (Univ of Wisconsin-Madison, Wisconsin, USA)
    06/09/2018, 09:35
    HTCondor presentations and tutorials

    Based on current trends and past experience, this talk will identify and discuss six key challenge areas that will continue to drive High Throughput Computing technologies innovation in the years to come.

  22. John Kelly (S)
    06/09/2018, 10:10
    HTCondor presentations and tutorials

    RAL Tier-1 originally used the PBS batch system for its Grid-related activities. Increased LHC operation requirements exposed scalability problems, so other batch systems were taken into consideration.

    In this presentation we review the history of HTCondor at RAL and detail how it evolved from an initial conventional setup with cgroups for resource control to the current use of Docker...
  23. Mr Stephen Jones (GridPP/Liverpool)
    06/09/2018, 11:05
    HTCondor presentations and tutorials

    HTCondor is a product, but it is not an application. Like operating systems, networks, database management systems, and security infrastructures, HTCondor is a general system, upon which other applications may be built.

    Extra work is needed to create something useful from HTCondor. The extra work depends on the goals of the designer. This talk identifies a few general areas that need to be...

  24. Dr Andrew Lahiff (UKAEA)
    06/09/2018, 11:30
    HTCondor presentations and tutorials

    Access to both HTC and HPC facilities is vitally important to the fusion community, not only for plasma modelling but also for advanced engineering and design, materials research, rendering, uncertainty quantification and advanced data analytics for engineering operations. The computing requirements are expected to increase as the community prepares for the next generation facility, ITER....

  25. John Knoeller (University of Wisconsin-Madison)
    06/09/2018, 11:55
    HTCondor presentations and tutorials

    Discussion of the language used by HTCondor for configuration and job submit files.

  26. James Frey (University of Wisconsin Madison (US))
    06/09/2018, 14:00
    HTCondor presentations and tutorials

    DAGMan lets you manage large, complex workflows in HTCondor.

  27. Todd Tannenbaum (University of Wisconsin Madison (US))
    06/09/2018, 14:35
    HTCondor presentations and tutorials

    We believe that the distributed scientific computing community has unique authorization needs that can be met by utilizing common web technologies, such as OAuth 2.0 and JSON Web Tokens (JWT). The SciTokens team, a collaboration between technology providers including the HTCondor Project and domain scientists, is working to build and demonstrate a new authorization approach at scale.
  28. Antonio Perez-Calero Yzquierdo (Centro de Investigaciones Energéticas Medioambientales y Tecno)
    06/09/2018, 15:10
    HTCondor presentations and tutorials

    In recent times, the CMS HTCondor Global Pool, which unifies access and management to all CPU resources available to the experiment, has been growing in size and evolving in its complexity, as new resources and job submit nodes are being added to the design originally conceived to serve the collaboration during the LHC Run 2. Having achieved most of our milestones for this period, the pool...

  29. Diego Davila Foyo (Autonomous University of Puebla (MX))
    06/09/2018, 15:35
    HTCondor presentations and tutorials

    Nowadays computational resources come in a wide variety of forms, from pilots running on sites to cloud resources and spare cycles on desktops, laptops and even phones through volunteer computing. Our duty, as the Submission Infrastructure team at CMS, is to be able to use them all.
    When it comes to integrating these different models into a single pool of resources, different challenges arise....
  30. Dr Dario Rodriguez Aseretto (European Commission)
    06/09/2018, 16:30
    HTCondor presentations and tutorials

    Geospatial data are one of the core data sources for scientific and technical support to the European Commission (EC) policies. For instance, the Copernicus programme of the European Union provides a vast amount of Earth Observation (EO) data for monitoring the environment through the Sentinel satellites operated by the European Space Agency. In terms of data management and processing, big...

  31. John Knoeller (University of Wisconsin-Madison)
    06/09/2018, 16:55
    HTCondor presentations and tutorials

    Discussion of policy expressions available to users when they submit their HTCondor jobs, and expressions available to administrators when they configure HTCondor execute nodes. Time permitting, there will be a demonstration of special purpose execution slots.
  32. Alastair Dewhurst (Science and Technology Facilities Council STFC (GB))
    06/09/2018, 17:40
    HTCondor presentations and tutorials

    In 2013 the RAL Tier-1 switched its batch farm to using HTCondor. In the years following, several more UK sites have made the switch. The RAL Tier-1 batch farm is now well over 20000 job slots and HTCondor is a key service delivering our pledged resources to the WLCG, now and for the foreseeable future.

    New funding opportunities are available to provide computing in the UK to the "long tail"...
  33. Todd Tannenbaum (University of Wisconsin Madison (US))
    07/09/2018, 09:00
    HTCondor presentations and tutorials

    An overview of monitoring an HTCondor pool.
  34. Gregory Thain (University of Wisconsin - Madison)
    07/09/2018, 09:25
    HTCondor presentations and tutorials

    Overview of HTCondor's mechanisms in support of job isolation, including Docker, Singularity, cgroups, and namespace mounts.

  35. Francesco Prelz (Università degli Studi e INFN Milano (IT))
    07/09/2018, 10:00
    HTCondor presentations and tutorials

    A setup to share clusters that used to be owned and operated by experimental and theory sub-groups in the Physics Department of the University of Milan is described. Each sub-cluster is configured as a separate Condor Pool, reporting to one additional 'super'-collector. With a few assumptions on the available execution environment, plus mutually agreed priorities for 'local' jobs, this allows...

  36. Dr Paul Hopkins (Cardiff University)
    07/09/2018, 10:55
    HTCondor presentations and tutorials

    All members of the LIGO Scientific Collaboration have access to a handful of dedicated LIGO Data Grid clusters which feature HTCondor, system-installed software, the LIGO and Virgo data, and other standard components. Cardiff University also host a LIGO Data Grid Site, but this is built on top of the shared institutional HPC cluster. In this talk I describe how I used HTCondor, Spack,...

  37. John Knoeller (University of Wisconsin-Madison)
    07/09/2018, 11:20
    HTCondor presentations and tutorials

    Discussion of the Job Transform language in the HTCondor Schedd.

  38. Helge Meinhard (CERN)
    07/09/2018, 11:55
    HTCondor presentations and tutorials