Dr Domenico Giordano (CERN), Fernando Harald Barreiro Megino (Universidad Autonoma de Madrid (ES))
During the first two years of data taking, the CMS experiment has collected over 20 petabytes of data and processed and analyzed it on the distributed, multi-tiered computing infrastructure of the Worldwide LHC Computing Grid. Given the increasing volume of data that has to be stored and efficiently analyzed, it is a challenge for several LHC experiments to optimize and automate their data placement strategies in order to fully profit from the available network and storage resources and to facilitate daily computing operations. Building on experience previously acquired by ATLAS, we have developed the CMS Popularity Service, which tracks file accesses and user activity on the grid and will serve as the foundation for the evolution of CMS data placement. A fully automated, popularity-based site-cleaning agent has been deployed to scan Tier-2 sites that are approaching their space quota and to suggest obsolete, unused data that can be safely deleted without disrupting analysis activity. Future work will demonstrate dynamic data placement functionality based on this popularity service and integrate it into the data and workload management systems; as a consequence, the pre-placement of data will be minimized and additional replicas of hot datasets will be requested automatically. This paper gives an insight into the development, validation and production process and analyzes how the framework has influenced resource optimization and daily operations in CMS.
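The site-cleaning logic described above can be illustrated with a minimal sketch. This is not the actual CMS implementation; the dataset fields, the quota threshold, and the use of a 90-day access count as the popularity metric are all illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Dataset:
    name: str
    size_tb: float
    accesses_90d: int  # popularity metric (assumed): accesses in the last 90 days

def suggest_deletions(datasets, quota_tb, used_tb, target_fraction=0.9):
    """Suggest unused datasets to delete until site usage falls below
    target_fraction of the quota. Custodial/pinned data handling omitted."""
    suggestions = []
    # Consider the least-accessed datasets first as deletion candidates.
    for ds in sorted(datasets, key=lambda d: d.accesses_90d):
        if used_tb <= target_fraction * quota_tb:
            break  # site no longer near its quota
        if ds.accesses_90d == 0:  # only truly unused data is suggested
            suggestions.append(ds.name)
            used_tb -= ds.size_tb
    return suggestions
```

For example, on a 200 TB site holding 190 TB, the agent would suggest unaccessed datasets in order until usage drops below 180 TB, leaving recently accessed ("hot") data untouched.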
Daniele Spiga (CERN), Dr Domenico Giordano (CERN), Dr Edward Karavakis (CERN), Fernando Harald Barreiro Megino (Universidad Autonoma de Madrid (ES)), Dr Maria Girone (CERN), Mattia Cinquilli (Univ. of California San Diego (US)), Nicolo Magini (CERN), Valentina Mancinelli (Sezione di Perugia (INFN)-Universita e INFN)