15–19 Sept 2025
FZU - Institute of Physics of the Czech Academy of Sciences
Europe/Prague timezone

Operating HTCondor for CPU and GPU Workloads at IFIC

17 Sept 2025, 14:40
20m
FZU - Institute of Physics of the Czech Academy of Sciences

Pod Vodárenskou věží 2531/3 Praha 8 Czechia
HTCondor user presentations
Workshop Session

Speaker

Miguel Folgado (IFIC / CSIC-UV)

Description

The Instituto de Física Corpuscular (IFIC) is a joint research center of the Spanish National Research Council (CSIC) and the University of Valencia, focused on fundamental physics, from particle physics to cosmology. It hosts over 400 researchers, engineers, and technical staff working on national and international projects.

In this talk, we will present how IFIC manages two compute clusters, GLUON (CPU) and Artemisa (GPU), using HTCondor. These clusters serve both internal users and external collaborators and support a wide range of workloads, from classical simulations to deep learning applications. We will describe the general architecture of each pool, our strategies for efficient GPU and CPU resource allocation, the management of usage policies and priorities, and some lessons learned from operating a hybrid infrastructure.

Additionally, we will describe how we handle parallel jobs over InfiniBand in GLUON alongside traditional serial jobs through HTCondor’s vanilla universe.

Desired slot length: 20 minutes
Speaker release: Yes

Authors

Javier Sanchez (Universidad de Valencia (ES))
Mr Matias Salinero Delgado (IFIC / CSIC-UV)
Miguel Folgado (IFIC / CSIC-UV)
