Optimizing CMS build infrastructure via Apache Mesos

Not scheduled
15m
OIST

1919-1 Tancha, Onna-son, Kunigami-gun Okinawa, Japan 904-0495
Poster presentation
Track 4: Middleware, software development and tools, experiment frameworks, tools for distributed computing

Speaker

Mr Giulio Eulisse (Fermi National Accelerator Lab. (US))

Description

The Offline Software of the CMS Experiment at the Large Hadron Collider (LHC) at CERN consists of 6M lines of in-house code, developed over a decade by nearly 1000 physicists, as well as a comparable amount of general-use open-source code. A critical ingredient in the success of the construction and early operation of the Worldwide LHC Computing Grid (WLCG) was the convergence, around the year 2000, on a homogeneous environment of commodity x86-64 processors and Linux. Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, Jenkins, Spark, Aurora, and other applications on a dynamically shared pool of nodes. We present how we migrated our continuous integration system to schedule jobs on a relatively small Apache Mesos-enabled cluster, and how this resulted in better resource usage, higher peak performance, and lower latency thanks to the dynamic scheduling capabilities of Mesos.

Primary author

Mr Giulio Eulisse (Fermi National Accelerator Lab. (US))

Co-authors

Alessandro Degano (Universita e INFN (IT))
David Abdurachmanov (Vilnius University (LT))
David Gonzalo Mendez Lopez (Universidad de los Andes (CO))
Dr Peter Elmer (Princeton University (US))
Shahzad Malik Muzaffar (Fermi National Accelerator Lab. (US))
