10–14 Oct 2016
San Francisco Marriott Marquis
America/Los_Angeles timezone

Building a Regional Computing Grid for the University of California at 100gbps

13 Oct 2016, 14:00
15m
Sierra B (San Francisco Marriott Marquis)

Oral Track 6: Infrastructures

Speaker

Jeffrey Michael Dost (Univ. of California San Diego (US))

Description

The Pacific Research Platform is an initiative to interconnect Science DMZs between campuses across the West Coast of the United States over a 100 Gbps network. The LHC @ UC is a proof-of-concept pilot project that focuses on interconnecting six University of California campuses. It is spearheaded by computing specialists from the UCSD Tier 2 Center in collaboration with the San Diego Supercomputer Center (SDSC).

A machine has been shipped to each campus, extending the concept of the Data Transfer Node (DTN) to a “cluster in a box” that is fully integrated into the local compute, storage, and networking infrastructure. The node contains a full HTCondor batch system and an XRootD proxy cache. User jobs routed to the DTN can run on 40 additional slots provided by the machine, and can also flock to a common GlideinWMS pilot pool, which sends jobs out to any of the participating UCs as well as to Comet, the new supercomputer at SDSC.

In addition, a common XRootD federation has been created to interconnect the UCs. It allows data to be exported from the home university and made available wherever the jobs run. The UC-level federation also statically redirects to either the ATLAS FAX or the CMS AAA federation, depending on the end user's VO membership credentials, to make globally published datasets available. XRootD read operations from the federation pass through the nearest DTN proxy cache, located at the site where the jobs run. This reduces wide area network overhead for subsequent accesses and improves overall read performance.

Details on the technical implementation, challenges faced and overcome in setting up the infrastructure, and an analysis of usage patterns and system scalability will be presented.
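The data path described above (a read goes through the site's proxy cache, which falls back to the UC federation or to a VO-specific global federation) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the project's implementation: the class names `UCFederation` and `ProxyCache`, the file path, and the in-memory dictionaries are all hypothetical, and no real XRootD or HTCondor API is used.

```python
from dataclasses import dataclass, field

# Hypothetical labels: the abstract names the two global federations
# but gives no hostnames or redirector addresses.
GLOBAL_FEDERATION_BY_VO = {"atlas": "ATLAS FAX", "cms": "CMS AAA"}


@dataclass
class UCFederation:
    """Toy stand-in for the UC-level federation (path -> exported bytes)."""
    exported: dict = field(default_factory=dict)

    def locate(self, path: str, vo: str) -> str:
        # Data exported by a UC campus is served from inside the UC federation.
        if path in self.exported:
            return "UC federation"
        # Otherwise redirect statically, based on the user's VO membership.
        try:
            return GLOBAL_FEDERATION_BY_VO[vo.lower()]
        except KeyError:
            raise LookupError(f"no federation configured for VO {vo!r}")


@dataclass
class ProxyCache:
    """Toy read-through cache standing in for one site's DTN proxy cache."""
    federation: UCFederation
    store: dict = field(default_factory=dict)
    wan_reads: int = 0

    def read(self, path: str, vo: str) -> bytes:
        # Cache hit: served locally, no wide area traffic.
        if path in self.store:
            return self.store[path]
        # Cache miss: one wide area read, then keep a local copy.
        source = self.federation.locate(path, vo)
        self.wan_reads += 1
        placeholder = f"<{path} from {source}>".encode()
        data = self.federation.exported.get(path, placeholder)
        self.store[path] = data
        return data


if __name__ == "__main__":
    fed = UCFederation(exported={"/store/user/ntuple.root": b"uc-data"})
    cache = ProxyCache(fed)
    cache.read("/store/user/ntuple.root", "cms")
    cache.read("/store/user/ntuple.root", "cms")  # second read hits the cache
    print("wide area reads:", cache.wan_reads)
```

In this model only the first access to a file counts as a wide area read; repeated accesses from jobs at the same site are served from the local copy, which is the effect the abstract attributes to the proxy cache.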

Primary Keyword: Computing facilities
Secondary Keyword: Computing middleware
Tertiary Keyword: Distributed data handling

Primary author

Jeffrey Michael Dost (Univ. of California San Diego (US))

Co-authors

Alja Mrak Tadel (Univ. of California San Diego (US))
Edgar Fajardo Hernandez (Univ. of California San Diego (US))
Frank Wurthwein (UCSD)
Matevz Tadel (Univ. of California San Diego (US))
Terrence Martin (UCSD)
Tyson Jones (Monash)
