28–29 May 2013
CERN
Europe/Zurich timezone

SciQL: Array Data Processing Inside an RDBMS

28 May 2013, 16:20
20m
60/6-015 - Room Georges Charpak (Room F) (CERN)


Speaker

Ying Zhang (C)

Description

Scientific discoveries increasingly rely on the ability to efficiently process massive amounts of experimental data using database technologies. To bridge the gap between the needs of data-intensive research fields and current DBMS technologies, we have introduced SciQL (pronounced ‘cycle’) in [15]. SciQL is the first SQL-based declarative query language for scientific applications with both tables and arrays as first-class citizens. It provides a seamless symbiosis of array, set, and sequence interpretations. A key innovation is the extension of the value-based grouping of SQL:2003 with structural grouping, i.e., grouping array elements based on their positions. This leads to a generalisation of window-based query processing with wide applicability in science domains. In this demo, we showcase a proof-of-concept implementation of SciQL in the relational database system MonetDB. First, with Conway’s Game of Life implemented purely as SciQL queries, we demonstrate the storage of arrays in MonetDB as first-class citizens and the execution of a comprehensive set of basic operations on arrays. Then, to show the usefulness of SciQL for real-world array data processing use cases, we demonstrate how various common image processing and remote sensing operations are executed as SciQL queries. The audience is invited to challenge SciQL with their own use cases.
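
To make the idea of structural grouping concrete, here is a minimal sketch in plain Python, not in SciQL syntax. It assumes a 3x3 window anchored on each cell and a grid with dead cells beyond its borders. The helper names `structural_group_sum` and `life_step` are hypothetical and chosen only for illustration. Each cell is grouped with its neighbours by position rather than by value, and one Game of Life generation follows from an aggregate over that group.

```python
from typing import List

Grid = List[List[int]]


def structural_group_sum(grid: Grid, x: int, y: int) -> int:
    """Sum the cells in the 3x3 window centred on (x, y).

    The group is defined purely by position; cells outside the
    grid are skipped (assumption: they count as dead).
    """
    rows, cols = len(grid), len(grid[0])
    total = 0
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            i, j = x + dx, y + dy
            if 0 <= i < rows and 0 <= j < cols:
                total += grid[i][j]
    return total


def life_step(grid: Grid) -> Grid:
    """Compute one Game of Life generation from positional groups."""
    rows, cols = len(grid), len(grid[0])
    nxt = [[0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            # The window sum includes the cell itself, so subtract it.
            neighbours = structural_group_sum(grid, x, y) - grid[x][y]
            if grid[x][y] == 1:
                nxt[x][y] = 1 if neighbours in (2, 3) else 0
            else:
                nxt[x][y] = 1 if neighbours == 3 else 0
    return nxt


# A "blinker": a horizontal bar that flips to a vertical bar and back.
blinker = [
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]

for row in life_step(blinker):
    print(row)
```

In a declarative array language the double loop would be replaced by a single query that groups each cell with its positional neighbourhood; the sketch only illustrates the grouping semantics, not how MonetDB evaluates it.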

Author

Ying Zhang (C)

Co-authors

Prof. Martin Kersten (Centrum Wiskunde & Informatica)
Dr Stefan Manegold (Centrum Wiskunde & Informatica)
