The workshop, the third in Europe after the successful workshops at CERN in December 2014 and ALBA in February 2016, was an opportunity for novice and experienced users of HTCondor to learn, get help, and exchange ideas with each other and with the HTCondor developers and experts. It was aimed primarily at users from EMEA. The workshop consisted of presentations, tutorials and "office hours" for consultancy. The HTCondor CE (Compute Element) was covered as well.
The workshop was hosted by DESY (Deutsches Elektronen-Synchrotron) on its site in Hamburg, Germany.
tbd
Usual welcome speech with some overview of DESY's activities
Some points worth emphasising
A brief introduction to HTCondor from a user perspective, covering how to submit and manage jobs and workflows.
tbd
An overview of how to administer an HTCondor pool.
In the context of the European Union (EU) Regulation on Genetically Modified Organisms (GMOs), validation of Polymerase Chain Reaction (PCR)-based detection methods is a fundamental task of the European Union Reference Laboratory for Genetically Modified Food and Feed (EU-RL GMFF). This task requires combining the experimental approach with bioinformatics analyses for sequence similarity searches. In addition, the EU-RL GMFF may be requested by Commission Services to evaluate the specificity level of methods developed to detect GMOs other than those submitted under the ‘Food and Feed’ regulations, such as GMOs intended for release into the environment or unapproved GMOs for which emergency measures are issued.
Within this framework, we present the implementation of METSCAN, a tool for performing complex bioinformatics analyses of the specificity of PCR-based detection methods. Through a simple user interface, METSCAN draws on the power of a High Performance Computing cluster running the HTCondor scheduler. METSCAN aims to make in silico predictions of the detection methods’ specificity, to guide recommendations on whether experimental testing of method specificity is needed and, if so, to define which sources of DNA (e.g. vectors, plant species or other GMOs) can cross-react with each detection method.
Monitoring of the food chain to fight fraud and protect consumer health relies on the availability of methods to correctly identify the species present in samples, for which DNA barcoding is a promising candidate. The nuclear genome is a rich potential source of barcode targets, but has been relatively unexploited until now. We have developed a CPU-intensive bioinformatics pipeline that processes available genome sequences to automatically screen large numbers of input candidates, identify novel nuclear barcode targets and design associated primer pairs according to a specific set of requirements. Using a High Performance Computing cluster running the HTCondor scheduler, we have achieved fast execution of this complex task, applied specifically to tackling fish fraud. The results have been analysed in silico and tested in the laboratory to efficiently identify flatfishes of the Pleuronectidae family. In addition, in silico methods have been used to generate a dataset of fish barcode reference sequences from the ever-growing wealth of publicly available sequence information, to facilitate and speed up labour-intensive laboratory work. The short length of these new barcode target regions makes their analysis ideally suited to next-generation sequencing techniques, allowing characterisation of multiple fish species in mixed and processed samples. Their location in the nucleus also improves on currently used methods by allowing the identification of hybrid individuals.
tbd
Learn how to configure job- and machine-specific policy.
This talk describes how IceCube uses HTCondor and an in-house glidein system to consolidate computing resources from a diverse set of facilities.
Veselin Vasilev, Dario Rodriguez, Armin Burger and Pierre Soille
European Commission, Joint Research Centre (JRC)
Directorate I. Competences. Unit I.3 Text and Data Mining
Via E. Fermi 2749, I-21027 Ispra (Va), Italy
The Copernicus programme of the European Union is delivering massive amounts of satellite image data of interest to a range of European policies supported by activities of the JRC. In this context, the JRC faces the challenges of storing and processing Earth Observation data at petabyte scale. This led to the design and development of the JRC Earth Observation Data and Processing Platform (JEODPP) [1]. To address the needs for high data throughput and scalability and for multi-purpose usage, as well as budgetary and data accessibility constraints, an implementation based on commodity hardware and open source solutions was developed. The infrastructure consists of a processing cluster, built upon a scalable set of processing nodes, and a storage cluster which sits on top of Just a Bunch Of Disks (JBOD) enclosures attached to dedicated storage nodes. HTCondor was chosen as the processing workload manager for the maturity of its Docker universe. As the storage layer, EOS, the storage solution developed in-house at CERN, was chosen. It fits the requirements for scientific data processing in a cluster environment, scales well and integrates into an existing Kerberos realm for data access management. In the current set-up, with a gross capacity of 1.8 PB, the processing nodes (37 nodes with a total of 952 CPUs) mount the storage using EOS’s own FUSE client as a wrapper around the XRootD software framework. This provides unified POSIX-like data access that lets clients run applications in a performant cluster environment without any special modification. Using the Docker universe in HTCondor allows flexible handling of the very diverse processing environments developed over time by various JRC projects. Using HTCondor as a job scheduler for interactive web processing requests is among the future challenges for the set-up.
[1] P. Soille, A. Burger, D. Rodriguez, V. Syrris, and V. Vasilev. Towards a JRC Earth observation data and processing platform. In P. Soille and P.G. Marchetti, editors, Proc. of the 2016 Conference on Big Data from Space (BiDS'16), pages 65-68. Publications Office of the European Union, 2016. doi: 10.2788/854791
----------------------------------
tbd
Learn how HTCondor matches jobs to machines, and how to control it.
See how Azure and HTCondor are being used by TIFR for CMS.
tbd
After running Grid resources in production on HTCondor for more than
one year, we present our experiences with the HTCondor + ARC CE ecosystem.
Migrating existing resources from the previous batch system proved
easy to handle and reassures us about the scaling properties of
HTCondor.
In the last year, HTCondor has been enhanced with mechanisms that allow Kerberos tokens to be renewed. This is achieved in HTCondor with hooks for security credential producers and security credential monitors. I propose to talk about using these methods, what we integrate with to handle the monitoring and renewal of Kerberos tokens at CERN, our operational experience in doing so, what we actually need and use it for, and how we might further develop this feature in the future.
Local batch at DESY today relies heavily on AFS $HOME directories. For a transparent migration from SGE to HTCondor, we want to make use of the Kerberos integration provided through HTCondor hooks in the current developer release of HTCondor. This talk will describe the tools we use to monitor and renew the Kerberos tokens in our HTCondor environment.
This talk will detail the current status of the CERN migration from LSF to HTCondor. We will cover the grid and the HTCondor-CE, local submissions, monitoring/dashboards, feedback from users and the current progress.
Walk in and talk to the HTCondor and HTCondor-CE developers
tbd
Covers how to modify jobs during submission and deny unsuitable jobs.
A hands-on introduction to using HTCondor in Python.
An overview of recent developments and future plans in HTCondor.
tbd
Singularity Linux containerization in HTCondor
A collection of little-known features added to HTCondor recently.
At the RAL Tier-1 we have been running HTCondor in production for 4 years. Currently the pool contains over 22000 cores. This talk will present an update on recent activity, including migration to the Docker universe, and future plans.
Purpose:
The objective of this project is to optimize Big Data (BD) workload scheduling, using a hybrid framework (dedicated and non-dedicated) that blends the best of both Hadoop YARN and HTCondor worlds in a single analytical environment.
Method:
The proposed OPERA-P, short for OPportunistically, Elastically Resource Allocation and Provisioning scheduler, is a new hybrid BD platform that combines High-Throughput and High-Performance Computing, i.e., HTCondor and YARN (see Figure 1). With OPERA-P, an opportunistic HTCondor pool and a dedicated Apache YARN cluster can collaborate to achieve enhanced task throughput for BD applications at minimal deployment cost. This model is very similar to how multiple applications run concurrently on a laptop or smartphone: new threads are spawned and more resources are requested as they are needed, and the OS arbitrates among all of the requests. OPERA-P plays the role of the OS: it keeps spawning new Docker containers on idle HTCondor workstations (creating an opportunistic container-based cluster on the HTCondor pool) and ensures efficient on-demand provisioning for the dedicated Hadoop cluster.
Conclusion:
OPERA-P is an enabling technology that can be used to leverage all of the resources within an enterprise or cloud as a single pool, achieving fully flexible, scalable and elastic on-demand provisioning. OPERA-P provides a seamless bridge from the pool of resources available in HTCondor to the YARN tasks that want those resources. In the presentation, we will discuss our project and the ongoing efforts behind it in more detail. We will also discuss the OPERA-P design, its challenges, and the opportunities opened up by the prototype.
tbd
The Open Science Grid (OSG) is a partnership for data-intensive research, focusing on the technique of distributed high-throughput computing (DHTC). At its core, the OSG often utilizes HTCondor to meet its distributed computing needs.
Within the OSG, HTCondor is used as a compute element, as a piece of the monitoring system, for pilot submission, for an information service -- and yes, for scheduling job workflows!
In this talk, we will provide an overview of OSG's technology offerings and how the OSG - and its sites, VOs, and users - utilize HTCondor.
Advanced features in HTCondor for job submission.
A review of the CMS experiment’s HTCondor Global Pool operations during the past year, including the challenges encountered and the solutions, integrated into new HTCondor features, that allowed us to reach scales of over 200K CPU cores worldwide.
The GlideinWMS factory is the part of the CMS global pool at CERN that submits glideins to the grid sites. Glideins work as placeholders, and the GlideinWMS factory runs on top of HTCondor: once glideins have reserved resources, HTCondor schedules jobs onto them at the grid sites.
tbd
How HTCondor deals with network architecture difficulties.
The HTCondor-CE provides a remote API on top of a local site batch system. The HTCondor-CE works best when paired with HTCondor, but it can provide a generic interface to other batch systems such as SLURM, PBS, or LSF. Originally developed within the Open Science Grid, the CE is now in use in sites around the world - from the smallest to largest of scales!
In this presentation, we will give an update on the project's progress and goals from the past year.
Dario Rodriguez, Veselin Vasilev, and Pierre Soille
European Commission, Joint Research Centre (JRC)
Directorate I. Competences. Unit I.3 Text and Data Mining
Via E. Fermi 2749, I-21027 Ispra (Va), Italy
Scientific and technical support to policies ranging from environment to emergency situations often requires the analysis of large amounts of Earth Observation data. For instance, the Copernicus programme of the European Union, with its fleet of satellites managed by the European Space Agency, will generate about 10 terabytes of satellite image data per day once it reaches full operational capacity. Numerous projects at the Joint Research Centre rely on the processing of large amounts of Earth Observation data. This motivated the development of a flexible and cost-effective petabyte-scale data and processing platform called the JRC Earth Observation Data and Processing Platform (JEODPP) [1]. It is based on commodity hardware and a CERN EOS storage backend and is detailed in [2]. Scientific workflows for geospatial data analysis are based on a variety of software, libraries, and tools that are often applied to numerous input data sets. A workload manager is therefore essential to manage multiple jobs at the same time. There are several open source queuing programs that could be used to meet various scheduling and networking needs. While HTCondor manages large environments of heterogeneous resources very well, it was selected for the JEODPP for its ability to support many different types of execution environment, so that each job can run in the environment that suits it best. The choice of environment depends essentially on the type of runtime, the available resources, and the inter-dependencies between processes. Currently, HTCondor provides more than nine different environments, called 'universes', each of which enables users to take advantage of the scheduling in a unique way. In this presentation, the HTCondor universes Vanilla, Docker, and Parallel are explored and illustrated for three different Earth Observation applications, briefly described hereafter:
The Global Human Settlement (GHS) framework produces global spatial information about the human presence on the planet over time [3]. This information takes the form of built-up maps, population density maps and settlement maps. The input satellite images, covering most of the landmass, are processed fully automatically using the HTCondor Vanilla and Docker universes, with a single core per job. Each job executes a MATLAB program in a runtime environment that generates analytics and knowledge, reporting objectively and systematically on the presence of population and built-up infrastructure [4]. This kind of application is a perfect candidate for HTCondor because its tasks are loosely coupled (embarrassingly parallel): there is no dependency or communication between tasks. Many of the applications running on our platform share this characteristic. The experiments conducted for this application show that the Docker universe introduces very little overhead compared to the Vanilla universe. However, the Docker universe has the advantage of allowing multiple isolated user spaces (Docker containers) to coexist, which copes with possibly conflicting software library requirements.
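As a rough illustration of the one-job-per-image pattern described above, the following Python sketch builds an HTCondor submit description that queues one single-core job per input image, in either the Vanilla or the Docker universe. The wrapper script `run_ghs.sh`, the image name `ghs/matlab-runtime`, the log file name and the input file names are hypothetical placeholders, not the actual JEODPP configuration.

```python
def build_submit(images, universe="vanilla", docker_image=None):
    """Return an HTCondor submit description with one single-core job per image."""
    if universe not in ("vanilla", "docker"):
        raise ValueError("universe must be 'vanilla' or 'docker'")
    lines = [f"universe = {universe}"]
    if universe == "docker":
        if not docker_image:
            raise ValueError("the docker universe needs a docker_image")
        # Only the Docker universe names a container image.
        lines.append(f"docker_image = {docker_image}")
    lines += [
        "executable = run_ghs.sh",   # hypothetical wrapper around the MATLAB program
        "arguments = $(image)",      # each job receives one input image
        "request_cpus = 1",          # single core per job
        "output = $(image).out",
        "error = $(image).err",
        "log = ghs.log",             # hypothetical shared job event log
        "queue image from (",        # one job per listed item
    ]
    lines += [f"    {name}" for name in images]
    lines.append(")")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical input files, for illustration only.
    print(build_submit(["tile_001.tif", "tile_002.tif"],
                       universe="docker",
                       docker_image="ghs/matlab-runtime"))
```

Because the jobs are independent, switching between the two universes changes only the first lines of the description; the per-image queue statement stays the same.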
SUMO (Search for Unidentified Maritime Objects) is automatic ship detection software for SAR imagery, written in Java [5]. For this application, all the Sentinel-1 satellite images acquired over the Mediterranean Sea over a period of one year were processed [6]. A Docker image containing all the software dependencies needed to run SUMO was built and served as the basis for launching the jobs through HTCondor with the Docker universe. In addition, this application takes advantage of HTCondor dynamic slots, which allow multi-threaded execution given SUMO's high random-access memory requirements. Basically, one multicore application scales up to the maximum number of physical CPUs on a worker host (hyperthreading is disabled on our platform) if the host does not have enough memory to accommodate more than one process.
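The sizing rule in the last sentence can be sketched in a few lines of Python. This is one assumed reading of the policy, not the platform's actual configuration: memory determines how many processes fit on a host, and the physical cores are then divided among them. The function names and the example figures are hypothetical.

```python
def plan_host(physical_cpus, host_memory_gb, job_memory_gb):
    """Return (jobs_per_host, cpus_per_job) for a memory-bound multi-threaded job."""
    if physical_cpus < 1 or job_memory_gb <= 0:
        raise ValueError("need at least one CPU and a positive memory request")
    if job_memory_gb > host_memory_gb:
        raise ValueError("a single job does not fit into the host memory")
    # Memory limits how many processes can run side by side on the host.
    jobs = min(physical_cpus, int(host_memory_gb // job_memory_gb))
    # The physical cores are shared evenly among those processes.
    cpus_per_job = physical_cpus // jobs
    return jobs, cpus_per_job


def request_lines(cpus, memory_gb):
    """Resource requests for a submit description (request_memory defaults to MiB)."""
    return [f"request_cpus = {cpus}", f"request_memory = {memory_gb * 1024}"]


if __name__ == "__main__":
    # Hypothetical worker host: 24 physical cores, 64 GB of memory.
    jobs, cpus = plan_host(24, 64, 48)
    print(jobs, cpus)
    print("\n".join(request_lines(cpus, 48)))
```

For this hypothetical 24-core host with 64 GB of memory, a job needing 48 GB gets all 24 cores because no second process fits, whereas jobs needing 16 GB would run four at a time with six cores each.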
Hydrodynamic and ecosystem simulations over the Mediterranean Sea are used to assess the marine environment in the EU, to set baselines, identify data gaps and simulate scenarios. The hydrodynamic and ecosystem models used in this application are GETM, GOTM, FABM and ERGOM. They are based on numerical methods that handle the spatial domain by separating it into numerous components through a discretization process that produces a model grid [7]. These models are implemented as MPI applications written in Fortran and run in the parallel universe of HTCondor. One job is executed in two steps: first, it triggers the startup of a virtual HPC cluster based on Docker containers; second, the job runs the working script on the virtual HPC cluster using OpenMPI. In this way, we can deploy a virtual HPC cluster on demand under the umbrella of HTCondor.
In the near future, the potential of the HTCondor grid universe will be investigated to benefit from other platforms holding complementary resources.
[1] P. Soille, A. Burger, D. Rodriguez, V. Syrris, and V. Vasilev. Towards a JRC Earth observation data and processing platform. In Proc. of the 2016 Conference on Big Data from Space (BiDS'16), pages 65-68, 2016. doi: 10.2788/854791
[2] V. Vasilev, D. Rodriguez, A. Burger, and P. Soille. Flexible and cost-effective petabyte-scale architecture with HTCondor processing and EOS storage backend for Earth observation applications. In Abstract book of the 3rd European HTCondor Workshop, Hamburg, Germany, 2017. DESY. Submitted.
[3] M. Pesaresi, G. Huadong, X. Blaes, D. Ehrlich, S. Ferri, L. Gueguen, M. Halkia, M. Kauffmann, T. Kemper, L. Lu, M. Marin-Herrera, G. Ouzounis, M. Scavazzon, P. Soille, V. Syrris, and L. Zanchetta. A global human settlement layer from optical HR/VHR RS data: concepts and results. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 6(5):2102-2131, 2013. doi: 10.1109/JSTARS.2013.2271445
[4] C. Corbane, M. Pesaresi, V. Syrris, T. Kemper, P. Politis, P. Soille, A. Florczyk, F. Sabo, D. Rodriguez, L. Maffenini, and S. Ferri. Global mapping of human settlements with Sentinel-1 and Sentinel-2 data: Recent developments in the Global Human Settlement Layer. Slides of presentation at WorldCover'2017, ESA, Frascati, Italy, March 2017. Available from http://worldcover2017.esa.int/files/2.2-p1.pdf
[5] C. Santamaria, M. Stasolla, V. Fernandez Arguedas, P. Argentieri, M. Alvarez, and H. Greidanus. Sentinel-1 Maritime Surveillance: Testing and Experiences with Long-term Monitoring. Technical Report EUR 27591 EN, 2015.
[6] C. Santamaria, M. Alvarez, H. Greidanus, V. Syrris, P. Soille, and P. Argentieri. Mass processing of Sentinel-1 images for maritime surveillance. Remote Sensing, 2017. Submitted.
[7] D. Macias, E. Garcia-Gorriz, A. Dosio, A. Stips, and K. Keuler. Obtaining the correct sea surface temperature: bias correction of regional climate model data for the Mediterranean Sea. Climate Dynamics, pages 1-23, 2016. doi: 10.1007/s00382-016-3049-z
----------------------------------
This talk summarizes how the LIGO Scientific and Virgo collaborations
hunt for gravitational waves - tiny ripples in space-time created by
some of the most energetic events in the Universe. We will discuss
high-power lasers, extremely precise phase/distance measurements,
Einstein's equations, data analysis models and workflows, largish
computing sites, and the Open Science Grid (OSG), and finally arrive at
the answer of how HTCondor's 8.6 release made it a lot easier for the
admins to figure out the individual pipelines, the computing sites used
and the computing power requirements.
Directions: https://indico.cern.ch/event/611296/attachments/1470190/2274605/WorkshopDinnerThuJune_8th.pdf
Walk in and talk to the HTCondor and HTCondor-CE developers
tbd
An overview of authentication and authorization in HTCondor.
A tool to easily expand an HTCondor pool via the Cloud.
A look at how the HTCondor pool at CHTC is transitioning from RHEL6 to RHEL7 with minimal user headaches.
tbd
A look at the scheduling policies used by CHTC at UW-Madison.
How to keep an eye on your HTCondor pool.
How to identify and fix things that frequently go wrong.
Usual closing speech
Walk in and talk to the HTCondor and HTCondor-CE developers