21–25 May 2012
New York City, NY, USA
US/Eastern timezone

Web enabled data management with DPM & LFC

22 May 2012, 13:30
4h 45m
Rosenthal Pavilion (10th floor) (Kimmel Center)

Rosenthal Pavilion (10th floor)

Kimmel Center

Poster Distributed Processing and Analysis on Grids and Clouds (track 3) Poster Session

Speakers

Alejandro Alvarez Ayllon (University of Cadiz) Ricardo Brito Da Rocha (CERN)

Description

The Disk Pool Manager (DPM) and LCG File Catalog (LFC) are two grid data management components currently used in production at more than 240 sites. Together with a set of grid client tools they give the users a unified view of their data, hiding most details concerning data location and access. Recently we've put a lot of effort in developing a reliable and high performance HTTP/WebDAV frontend to both our grid catalog and storage components, exposing the existing functionality to users accessing the services via standard clients - e.g. web browsers, curl - present in all operating systems, giving users a simple and straigh-forward way of interaction. In addition, as other relevant grid storage components (like dCache) expose their data using the same protocol, for the first time we had the opportunity of attempting a unified view of all grid storage using HTTP. We describe the mechanism used to integrate the grid catalog(s) with the multiple storage components - HTTP redirection -, including details on some assumptions made to allow integration with other implementations. We describe the way we hide the details regarding site availability or catalog inconsistencies, by switching the standard HTTP client automatically between multiple replicas. We also present measurements of access performance, and the relevant factors regarding replica selection - current throughput and load, geographic proximity, etc. Finally, we report on some additional work done to have this system as a viable alternative to GridFTP, providing multi-stream transfers and exploiting some additional features of WebDAV to enable third party copies - essential for managing data movements between storage systems - with equivalent performance.

Primary authors

Presentation materials