10–14 Oct 2016
San Francisco Marriott Marquis
America/Los_Angeles timezone

One network metric datastore to track them all: The OSG Network Service

11 Oct 2016, 11:45
15m
Sierra B (San Francisco Marriott Marquis)

Oral Track 6: Infrastructures

Speaker

Robert Quick (Indiana University)

Description

The Open Science Grid (OSG) relies upon the network as a critical part of the distributed infrastructures it enables. In 2012 OSG added a new focus area in networking with a goal of becoming the primary source of network information for its members and collaborators. This includes gathering, organizing and providing network metrics to guarantee effective network usage and prompt detection and resolution of any network issues, including connection failures, congestion and traffic routing.

This service was deployed into the OSG production environment in September 2015. We will report on the creation, implementation, testing and deployment of the OSG Networking Service. All aspects of the implementation will be reviewed, from organizing the deployment of perfSONAR toolkits within OSG and its partners, through the challenges of orchestrating regular testing between sites, to reliably gathering the resulting network metrics and making them available to users, virtual organizations and higher-level services. In particular, several higher-level services were developed to bring the OSG network service to its full potential. These include a web-based mesh configuration system, which allows central scheduling and management of all the network tests performed by the instances; a set of probes that continually gather metrics from the remote instances and publish them to different sources; a central network datastore (Esmond), which provides interfaces to access the network monitoring information both in close to real time and historically (up to a year), giving the state of the tests; and perfSONAR infrastructure monitoring, which ensures that the current perfSONAR instances are correctly configured and operating as intended.
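As a rough illustration of how a client might ask such a central datastore for historical metrics, the sketch below only builds a query URL and does not contact any server. The host name, the test endpoints and the helper function are hypothetical, and the path and query parameter names are assumptions modelled on the Esmond perfSONAR archive REST interface. They should be checked against the documentation of the deployed service.

```python
from urllib.parse import urlencode

# Hypothetical host: the abstract does not give the address of the OSG datastore.
ARCHIVE_BASE = "http://datastore.example.org/esmond/perfsonar/archive/"

# "Up to a year" of history, expressed in seconds.
ONE_YEAR_SECONDS = 365 * 24 * 3600


def build_metadata_query(source, destination, event_type, seconds_back):
    """Build a URL that filters archived tests by endpoints, metric and age.

    The parameter names are assumed, not taken from the abstract.
    """
    params = {
        "source": source,
        "destination": destination,
        "event-type": event_type,
        "time-range": seconds_back,
        "format": "json",
    }
    return ARCHIVE_BASE + "?" + urlencode(params)


# Example: throughput tests between two hypothetical perfSONAR instances.
url = build_metadata_query(
    "ps1.example.org", "ps2.example.org", "throughput", ONE_YEAR_SECONDS
)
print(url)
```

Keeping URL construction separate from the HTTP request makes the filter logic easy to check offline. A real client would then fetch the URL and follow the links in the returned metadata to retrieve the time series.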

We will also describe the challenges we encountered in ongoing operations of the network service and how we have evolved our procedures to address them. Finally, we will describe our plans for future extensions and improvements to the service.

Primary Keyword (Mandatory) Collaborative tools
Secondary Keyword (Optional) Network systems and solutions
Tertiary Keyword (Optional) Monitoring

Primary author

Robert Quick (Indiana University)

Co-authors

Christopher Pipes (Indiana University)
Edgar Fajardo Hernandez (Univ. of California San Diego (US))
Scott Werner Teige (Indiana University (US))
Shawn Mc Kee (University of Michigan (US))
Mr Soichi Hayashi (Indiana University)
Mr Thomas Lee (Indiana University)
