27 September 2004 to 1 October 2004
Interlaken, Switzerland
Interactive Data Analysis on the Grid using Globus 3 and JAS3

30 Sep 2004, 17:30
T. Johnson (SLAC)


The aim of the service is to allow fully distributed analysis of large volumes of data while maintaining true (sub-second) interactivity. All Grid-related components are based on OGSA-style Grid services and, to the maximum extent possible, use existing Globus Toolkit 3.0 (GT3) services. All transactions are authenticated and authorized using the Grid Security Infrastructure (GSI) mechanism, which is part of GT3. JAS3, an experiment-independent data analysis tool, is used as the interactive analysis client. The system consists of three main service components:

Dataset Catalog Service: The Dataset Catalog supports browsing for an interesting dataset, or searching for data using a query language that operates on metadata stored in the catalog. The catalog makes few assumptions about this metadata, except that it consists of key-value pairs stored in a hierarchical tree. The Dataset Catalog Service is designed to allow easy interfacing to existing data catalog back-ends.

Dataset Analysis Grid Service: This service is responsible for resolving the dataset id from the catalog service and transferring chunks of data to worker nodes for analysis processing. It also manages the worker nodes, distributes analysis code to them, and retrieves their intermediate results before sending merged results back to the analysis client.

Worker Execution Services: This service runs on each worker node and is responsible for processing analysis requests.

In this presentation we will demonstrate the current system and describe some of the choices made in architecting it, in particular the challenges of obtaining interactive response times from GT3.
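
The chunk, analyze, and merge flow described for the Dataset Analysis Grid Service can be illustrated with a minimal Python sketch. It is not the actual GT3 or JAS3 implementation: the function names (split_into_chunks, analyze_chunk, merge_results, run_analysis), the use of local threads in place of worker nodes, and the choice of a histogram as the intermediate result are all assumptions made for illustration only.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(events, n_chunks):
    """Split a list of events into at most n_chunks contiguous, near-equal chunks."""
    size, extra = divmod(len(events), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are few events
            chunks.append(events[start:end])
        start = end
    return chunks


def analyze_chunk(chunk, bin_width=10.0):
    """Hypothetical worker-side step: fill a partial histogram (bin index -> count)."""
    return Counter(int(value // bin_width) for value in chunk)


def merge_results(partials):
    """Hypothetical service-side step: sum partial histograms into one result."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts
    return merged


def run_analysis(events, n_workers=4):
    """Scatter chunks to workers (threads stand in for worker nodes), then merge."""
    chunks = split_into_chunks(events, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(analyze_chunk, chunks))
    return merge_results(partials)
```

Because the partial histograms are simply summed, the merged result does not depend on how the data was chunked. In this sketch, that property is what lets the service pass intermediate results on to the client as they arrive.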

Primary authors

B. Ananthan (Tech-X Corporation), D. Alexander (Tech-X Corporation), M. Turri (SLAC), T. Johnson (SLAC), V. Serbo (SLAC)

