Oct 10 – 14, 2016
San Francisco Marriott Marquis
America/Los_Angeles timezone

jade: An End-To-End Data Transfer and Catalog Tool

Oct 12, 2016, 12:00 PM
15m
GG C3 (San Francisco Mariott Marquis)

GG C3

San Francisco Mariott Marquis

Oral Track 4: Data Handling Track 4: Data Handling

Speaker

PATRICK MEADE (University of Wisconsin-Madison)

Description

The IceCube Neutrino Observatory is a cubic kilometer neutrino telescope located at the Geographic South Pole. IceCube collects 1 TB of data every day. An online filtering farm processes this data in real time and selects 10% to be sent via satellite to the main data center at the University of Wisconsin-Madison. IceCube has two year-round on-site operators. New operators are hired every year, due to the hard conditions of wintering at the South Pole. These operators are tasked with the daily operations of running a complex detector in serious isolation conditions. One of the systems they operate is the data archiving and transfer system. Due to these challenging operational conditions, the data archive and transfer system must above all be simple and robust. It must also share the limited resource of satellite bandwidth, and collect and preserve useful metadata. The original data archive and transfer software for IceCube was written in 2005. After running in production for several years, the decision was taken to fully rewrite it, in order to address a number of structural drawbacks. The new data archive and transfer software (JADE2) has been in production for several months providing improved performance and resiliency. One of the main goals for JADE2 is to provide a unified system that handles the IceCube data end-to-end: from collection at the South Pole, all the way to long-term archive and preservation in dedicated repositories at the North. In this contribution, we describe our experiences and lessons learned from developing and operating the data archive and transfer software for a particle physics experiment in extreme operational conditions like IceCube.

Primary Keyword (Mandatory) Data processing workflows and frameworks/pipelines
Secondary Keyword (Optional) Software development process and tools

Primary author

PATRICK MEADE (University of Wisconsin-Madison)

Presentation materials