The CMS Dataset Bookkeeping Service

Sep 3, 2007, 3:40 PM
Dr Lee Lueking (FERMILAB)


The CMS Dataset Bookkeeping Service (DBS) has been developed to catalog all CMS event data from Monte Carlo and detector sources. It includes the ability to identify the MC or trigger source, track data provenance, construct datasets for analysis, and discover interesting data. CMS requires processing and analysis activities at various service levels, and the system supports localized processing and private analysis as well as global access for CMS users at large. Catalog entries can be moved among the various service levels with a simple set of migration tools, thus forming a loose federation of databases. DBS is available to CMS users via a Python API, a command-line tool, and a Discovery web page interface. The system is built as a multi-tier web application with Java servlets running under Tomcat and connecting via JDBC to Oracle or MySQL database backends. Clients connect to the service through HTTP or HTTPS, with authentication provided by Grid certificates and authorization through VOMS. DBS is an integral part of the overall CMS Data Management and Workflow Management systems. The system has been in operation since March 2007. An overview of the schema, functionality, deployment details, operational statistics, and experience will be presented.
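The idea of moving catalog entries between service levels while keeping provenance intact can be illustrated with a small toy model. The sketch below is not the DBS Python API or its migration tools: the `DatasetEntry` and `Catalog` classes, the `migrate` function, the "local" and "global" scope labels, and the dataset names are all hypothetical, chosen only to show how an entry might be copied together with its ancestry.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetEntry:
    """Hypothetical catalog record: a dataset name, its source, and its parents."""
    name: str
    source: str          # e.g. "MC" or a trigger stream (illustrative values)
    parents: tuple = ()  # names of the datasets this one was derived from


@dataclass
class Catalog:
    """Hypothetical stand-in for one catalog instance at a given service level."""
    scope: str           # illustrative label such as "local" or "global"
    entries: dict = field(default_factory=dict)

    def add(self, entry):
        self.entries[entry.name] = entry


def migrate(name, src, dst):
    """Copy an entry and its full ancestry from src to dst; return the names copied."""
    copied = []
    stack = [name]
    while stack:
        current = stack.pop()
        if current in dst.entries:
            continue                  # already present at the destination
        entry = src.entries[current]  # raises KeyError if src lacks an ancestor
        dst.add(entry)
        copied.append(current)
        stack.extend(entry.parents)   # follow provenance links upward
    return copied


# Made-up dataset names, for illustration only.
local = Catalog("local")
global_ = Catalog("global")
local.add(DatasetEntry("/Example/Gen/RAW", source="MC"))
local.add(DatasetEntry("/Example/Gen/RECO", source="MC",
                       parents=("/Example/Gen/RAW",)))

moved = migrate("/Example/Gen/RECO", local, global_)
print(moved)
```

In this sketch, migrating the derived dataset also carries its parent along, so the destination catalog never holds an entry whose provenance chain is broken. Repeating the same migration copies nothing, since entries already present at the destination are skipped.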


Dr Andrew Dolgert (Cornell University), Anzar Afaq (Fermilab), Dr Chris Jones (Cornell University), Dr Dan Riley (Cornell University), Sergey Kosyakov (Fermilab), Dr Valentin Kuznetsov (Cornell University), Vijay Sekhri (Fermilab), Yuyi Guo (Fermilab)

