Speaker
Manuel Giffels
(CERN)
Description
The Data Bookkeeping Service 3 (DBS 3) provides an improved event metadata catalog for Monte Carlo and recorded data of the CMS (Compact Muon Solenoid) experiment at the Large Hadron Collider (LHC). It provides the information needed to track datasets, such as the data processing history and the files and runs associated with a given dataset, on a scale of about 10^5 datasets and more than 10^7 files. All kinds of data processing in CMS rely on the information stored in DBS. It is widely used within CMS: in Monte Carlo production, in the processing of recorded data, and in physics analysis done by users.
DBS 3 has been completely redesigned and reimplemented in Python using a CherryPy-based environment and RESTful (Representational State Transfer) web services, an approach commonly used within the data management and workload management (DMWM) group of CMS. DBS 3 uses the JavaScript Object Notation (JSON) data format for exchanging information and Oracle as its database backend. The main focuses during development were adapting the database schema to better match the evolving CMS data processing model; the introduction of the Data Aggregation System in CMS, which combines the information from a variety of database services (PhEDEx, SiteDB, DBS, etc.) in one user interface; and achieving better scalability to meet growing demands in the future.
This contribution covers the design of the service, the results of recent stress and scale testing, as well as first experiences with the system during daily operations.
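As a rough illustration of the REST-plus-JSON pattern described above, the following minimal sketch answers a GET-style request for the files of a dataset and returns the result as JSON. It is not the DBS 3 implementation: it uses only the Python standard library instead of CherryPy, an in-memory dictionary instead of Oracle, and the resource path `/files`, the `dataset` parameter, the `file_name` field, and all dataset and file names are hypothetical.

```python
import json
from urllib.parse import urlparse, parse_qs

# Hypothetical in-memory stand-in for the database backend.
CATALOG = {
    "/Example/Run-v1/RAW": {
        "files": ["file_a.root", "file_b.root"],
        "runs": [1, 2],
    },
}


def handle_get(url):
    """Return (status_code, json_body) for a GET request on `url`."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)

    # Only one (hypothetical) resource is exposed in this sketch.
    if parsed.path != "/files":
        return 404, json.dumps({"error": "unknown resource"})

    # parse_qs maps each parameter to a list of values; take the first.
    dataset = query.get("dataset", [None])[0]
    record = CATALOG.get(dataset)
    if record is None:
        return 404, json.dumps({"error": "unknown dataset"})

    # Serialize the file list as a JSON array of objects.
    body = json.dumps([{"file_name": name} for name in record["files"]])
    return 200, body


if __name__ == "__main__":
    status, body = handle_get("/files?dataset=/Example/Run-v1/RAW")
    print(status, body)
```

A client would decode the body with `json.loads` and iterate over the returned records. In a real service, the lookup would be a database query and the handler would be exposed through the web framework's routing.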
Primary authors
Manuel Giffels
(CERN)
Yuyi Guo
(Fermi National Accelerator Lab. (US))