21–25 May 2012
New York City, NY, USA
US/Eastern timezone

Using Xrootd to Federate Regional Storage

22 May 2012, 14:20
25m
Room 914 (Kimmel Center)

Parallel: Computer Facilities, Production Grids and Networking (track 4)

Speaker

Brian Paul Bockelman (University of Nebraska (US))

Description

While the LHC data movement systems have demonstrated the ability to move data at the necessary throughput, we have identified two weaknesses: the latency for physicists to access data and the complexity of the tools involved. To address these, both ATLAS and CMS have begun to federate regional storage systems using Xrootd. Xrootd, which refers to both a protocol and its implementation, allows us to provide access to all disk-resident data from a single virtual endpoint. This "redirector" endpoint (which may actually be multiple physical hosts) discovers the actual location of the data and redirects the client to the appropriate site. The approach is particularly advantageous since the redirection typically takes much less than 500 milliseconds (bounded by the network round-trip time) and the Xrootd client is conveniently built into LHC physicists' analysis tools.

Currently, there are three regional storage federations: a US ATLAS region, a European CMS region, and a US CMS region. The US ATLAS and US CMS regions include their respective Tier 1 and Tier 2 facilities, meaning a large percentage of experimental data is available via the federation. There are plans for federating storage globally, so studies of the peering between the regional federations are of particular interest.

From the base idea of federating storage behind an endpoint, the implementations and use cases diverge. For example, the CMS software framework is capable of efficiently processing data over high-latency connections, so reading from a remote site directly is comparable to accessing local data. ATLAS's processing model is currently less resilient to latency, and they are particularly focused on the physics n-tuple analysis use case; accordingly, the US ATLAS region relies more heavily on caching in the Xrootd server to provide data locality. Both VOs use GSI security: ATLAS has developed a mapping of VOMS roles to specific filesystem authorizations, while CMS has developed callouts to the site's mapping service. Each federation presents a global namespace to users. For ATLAS, the global-to-local name mapping is based on a heuristic lookup in the site's local file catalog, while CMS performs the mapping based on translations given in a configuration file.

We will also cover the latest usage statistics and interesting use cases that have developed over the previous 18 months.
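As a rough illustration of the access model described above, the sketch below opens a file through a federation redirector from a ROOT session, the way an analysis job might. It is a minimal example only; the redirector hostname, file path, and tree name are placeholder assumptions, not the actual federation endpoints discussed in the talk.

    // Minimal ROOT macro: read a file via an Xrootd redirector.
    // Hostname, path, and tree name below are placeholders.
    #include "TFile.h"
    #include "TTree.h"
    #include <iostream>

    void open_via_redirector() {
        // The redirector locates the file within the federation and
        // redirects the client to the site that actually hosts it.
        TFile *f = TFile::Open(
            "root://redirector.example.org//store/user/example/ntuple.root");
        if (!f || f->IsZombie()) {
            std::cerr << "Could not open file via the federation" << std::endl;
            return;
        }
        // After redirection the file behaves like any local ROOT file.
        TTree *t = nullptr;
        f->GetObject("Events", t);
        if (t) std::cout << "Entries: " << t->GetEntries() << std::endl;
        f->Close();
    }

The same root:// URL can be passed directly to the experiments' analysis frameworks, which is what makes the single virtual endpoint convenient for end users.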

Primary authors

Brian Paul Bockelman (University of Nebraska (US)), Robert Gardner (University of Chicago)

Co-authors

Andrew Hanushevsky (Stanford Linear Accelerator Center), Prof. Avi Yagil (Univ. of California San Diego (US)), Daniel Charles Bradley (High Energy Physics), Mr David Lesny (Univ. Illinois at Urbana-Champaign (US)), Doug Benjamin (Duke University (US)), Frank Wurthwein (UCSD), Giacinto Donvito (Universita e INFN (IT)), Hironori Ito (Brookhaven National Laboratory (US)), Dr Horst Severini (University of Oklahoma (US)), Mr Igor Sfiligoi (University of California San Diego), Kenneth Bloom (University of Nebraska (US)), Dr Lothar Bauerdick (Fermilab), Matevz Tadel (Univ. of California San Diego (US)), Michael Ernst (Unknown), Dr Ofer Rind (Brookhaven National Laboratory), Patrick Mcguigan (University of Texas at Arlington (US)), Sarah Williams (Indiana University (US)), Dr Shawn McKee (University of Michigan (US)), Prof. Sridhara Dasu (University of Wisconsin (US)), Wei Yang (SLAC National Accelerator Laboratory (US))

Presentation materials