9-13 July 2018
Sofia, Bulgaria
Europe/Sofia timezone

Producing Madgraph5_aMC@NLO gridpacks and using TensorFlow GPU resources in the CMS HTCondor Global Pool

10 Jul 2018, 16:00
1h
National Culture Palace, Boulevard "Bulgaria", 1463 NDK, Sofia, Bulgaria
Poster Track 3 – Distributed computing Posters

Speaker

Kenyi Paolo Hurtado Anampa (University of Notre Dame (US))

Description

The CMS experiment has an HTCondor Global Pool composed of more than 200K CPU cores available for Monte Carlo production and data analysis. User jobs are submitted to this pool either through CRAB3, the standard workflow management tool used by CMS users to submit analysis jobs that require event processing of large amounts of data, or through CMS Connect, a service focused on final-stage, condor-like analysis jobs and on applications that already have a workflow job manager in place. In the latter scenario, workflows may need further adjustments in order to run efficiently in a globally distributed pool of resources. For instance, the generation of matrix elements for high energy physics processes via Madgraph5_aMC@NLO and the use of tools not (yet) fully supported by the CMS software, such as TensorFlow with GPU support, are tasks with particular requirements. A special adaptation, either at the pool factory level (advertising GPU resources) or at the execute level (e.g., handling special parameters that describe certain needs for the remote execute nodes during submission), is needed for these workflows to run adequately in the CMS Global Pool. This contribution describes the challenges encountered and the work done to adapt such workflows so that they can properly benefit from the Global Pool via CMS Connect.
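
As a rough illustration of what "special parameters ... during submission" can look like on the user side, the sketch below renders a minimal HTCondor submit description that requests a GPU slot. It is not taken from the contribution: the helper name build_submit_description, the executable name train_model.sh, and the default resource values are assumptions made for this example. Only standard HTCondor submit commands (universe, executable, request_cpus, request_memory, request_gpus, queue) are used, and any CMS Connect specific attributes are deliberately left out.

```python
def build_submit_description(executable, cpus=1, memory_mb=2048, gpus=0):
    """Render a minimal HTCondor submit description as text.

    Hypothetical helper for illustration only; the defaults are assumptions,
    not values used by CMS Connect.
    """
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        f"request_cpus = {cpus}",
        # A bare number in request_memory is interpreted as MiB by HTCondor.
        f"request_memory = {memory_mb}",
    ]
    if gpus > 0:
        # request_gpus restricts matching to slots that advertise GPUs.
        lines.append(f"request_gpus = {gpus}")
    lines.append("queue 1")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: a TensorFlow training job that needs one GPU.
    print(build_submit_description("train_model.sh", gpus=1))
```

The resulting text could be written to a file and passed to condor_submit. A CPU-only job, such as a gridpack production step, would simply omit the gpus argument, so no GPU request is emitted.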

Primary authors

Kenyi Paolo Hurtado Anampa (University of Notre Dame (US))
Edgar Fajardo Hernandez (Univ. of California San Diego (US))

Co-authors

Amjad Kotobi (University of Malaya (MY))
Antonio Perez-Calero Yzquierdo (Centro de Investigaciones Energéticas Medioambientales y Tecno)
Brian Paul Bockelman (University of Nebraska Lincoln (US))
David Alexander Mason (Fermi National Accelerator Lab. (US))
Diego Davila Foyo (Autonomous University of Puebla (MX))
Farrukh Aftab Khan (Fermi National Accelerator Lab. (US))
James Letts (Univ. of California San Diego (US))
Krista Larson (Fermi National Accelerator Lab. (US))
Marco Mascheroni (Univ. of California San Diego (US))
Todor Trendafilov Ivanov (University of Sofia (BG))
