CHEP 2016 Conference, San Francisco, October 8-14, 2016

Name: CHEP 2016 Conference, San Francisco, October 8-14, 2016
Start: 2016-10-10T08:00:00-07:00
End: 2016-10-14T18:00:00-07:00
Location: San Francisco Marriott Marquis

America/Los_Angeles timezone

Software and Experience with Managing Workflows for the Computing Operation of the CMS Experiment

11 Oct 2016, 11:00

15m

GG C2 (San Francisco Marriott Marquis)

Oral, Track 3: Distributed Computing

Dr Jean-Roch Vlimant (California Institute of Technology (US))

We present a system, deployed in the summer of 2015, for the automatic assignment of production and reprocessing workflows for simulation and detector data within the Computing Operation of the CMS experiment at the CERN LHC. Processing a request involves a number of steps in daily operation, including transferring and monitoring input datasets where relevant, assigning work to computing resources available on the CMS grid, and delivering the output to the physics groups. Automation becomes critical above a certain number of requests, especially with a view to using computing resources more efficiently and reducing latencies. An effort to automate the necessary steps for production and reprocessing recently started, and a new system to handle workflows has been developed. The state-machine system described consists of a set of modules whose key feature is the automatic placement of input datasets, balancing the load across multiple sites. By reducing the operational overhead, these agents enable the utilization of more than double the amount of resources with a robust storage system. Additional functionality was added after months of successful operation to further balance the load on the computing system using remote reads and additional resources. This system contributed to reducing the delivery time of datasets, a crucial aspect of the analysis of CMS data. We report on lessons learned from operation towards increased efficiency in using a largely heterogeneous distributed system of computing, storage and network elements.
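The idea of placing input datasets automatically while balancing the load across sites can be illustrated with a minimal sketch. This is not the system presented in the talk: the `Site` class, the `place_datasets` function, the site and dataset names, and the greedy "most free space first" rule are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical computing site with a storage quota, in TB."""
    name: str
    capacity_tb: float
    used_tb: float = 0.0
    datasets: list = field(default_factory=list)

    @property
    def free_tb(self) -> float:
        return self.capacity_tb - self.used_tb


def place_datasets(datasets: dict, sites: list) -> dict:
    """Greedily place each dataset on the site with the most free space.

    `datasets` maps a dataset name to its size in TB. Larger datasets are
    placed first. Returns a mapping of dataset name to site name; a dataset
    that fits on no site maps to None.
    """
    placement = {}
    for name, size_tb in sorted(datasets.items(), key=lambda kv: -kv[1]):
        # Only sites with enough remaining quota are candidates.
        candidates = [s for s in sites if s.free_tb >= size_tb]
        if not candidates:
            placement[name] = None
            continue
        # Assumed balancing rule: pick the least-filled candidate site.
        target = max(candidates, key=lambda s: s.free_tb)
        target.used_tb += size_tb
        target.datasets.append(name)
        placement[name] = target.name
    return placement


# Illustrative inputs only; names and sizes are made up.
sites = [Site("SITE_A", 100.0), Site("SITE_B", 60.0), Site("SITE_C", 40.0)]
datasets = {
    "/data/run1": 50.0,
    "/mc/sample1": 30.0,
    "/mc/sample2": 30.0,
    "/data/run2": 45.0,
}
placement = place_datasets(datasets, sites)
for dataset, site in placement.items():
    print(dataset, "->", site)
```

A production system would presumably also weigh CPU availability, network transfer cost and existing replicas; the sketch only shows the balancing step in isolation.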

Primary Keyword (Mandatory)	Data processing workflows and frameworks/pipelines
Secondary Keyword (Optional)	Distributed workload management

Dr Jean-Roch Vlimant (California Institute of Technology (US))

Highlights-396.pdf

Oral-396.pdf
