SciQL: array data processing inside an RDBMS

Year: 
2013
Publication Date: 
Monday, 17 June, 2013
Published in: 
The 2013 ACM SIGMOD/PODS Conference
Authors: 
Ying Zhang, Martin L. Kersten, Stefan Manegold
Abstract: 

Scientific discoveries increasingly rely on the ability to efficiently grind through massive amounts of experimental data using database technologies. To bridge the gap between the needs of data-intensive research fields and current DBMS technologies, we have introduced SciQL (pronounced 'cycle'). SciQL is the first SQL-based declarative query language for scientific applications with both tables and arrays as first-class citizens. It provides a seamless symbiosis of array, set, and sequence interpretations. A key innovation is the extension of the value-based grouping of SQL:2003 with structural grouping, i.e., grouping array elements based on their positions. This leads to a generalisation of window-based query processing with wide applicability in science domains.
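
The difference between value-based and structural grouping can be illustrated with a small sketch. This is plain Python, not SciQL syntax; the function names, the dictionary representation of an array, and the 2x2 tile size are all assumptions made for illustration only.

```python
from collections import defaultdict


def value_group_count(cells):
    """Value-based grouping: cells with equal values form one group,
    as in an SQL GROUP BY on the value column. Returns value -> count."""
    groups = defaultdict(int)
    for _position, value in cells.items():
        groups[value] += 1
    return dict(groups)


def structural_group_sum(cells, tile=2):
    """Structural grouping: cells are grouped by their position,
    here into non-overlapping tile x tile blocks. Returns block -> sum."""
    groups = defaultdict(int)
    for (x, y), value in cells.items():
        groups[(x // tile, y // tile)] += value
    return dict(groups)


# Hypothetical 4x4 array keyed by (x, y); the cell value is x + y.
cells = {(x, y): x + y for x in range(4) for y in range(4)}

by_value = value_group_count(cells)
by_tile = structural_group_sum(cells)
```

In the first function the group membership depends only on cell contents; in the second it depends only on cell coordinates, which is the property the abstract refers to as structural grouping.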

In this demo, we showcase a proof-of-concept implementation of SciQL in the relational database system MonetDB. First, with Conway's Game of Life implemented purely in SciQL queries, we demonstrate the storage of arrays in MonetDB as first-class citizens, and the execution of a comprehensive set of basic operations on arrays. Then, to show the usefulness of SciQL for real-world array data processing use cases, we demonstrate how various common image processing and remote sensing operations are executed as SciQL queries. The audience is invited to challenge SciQL with their own use cases.
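
To show why Conway's Game of Life is a natural fit for position-based grouping, the following is a minimal Python sketch of one generation step. It is not the SciQL implementation from the demo; the function name `life_step`, the list-of-lists grid, and the choice of treating cells outside the grid as dead are assumptions for illustration.

```python
def life_step(grid):
    """Compute one Game of Life generation on a fixed-size grid of 0/1 cells.
    Cells outside the grid are treated as dead (an assumption of this sketch)."""
    rows, cols = len(grid), len(grid[0])

    def live_neighbours(r, c):
        # Sum the 3x3 window around (r, c), excluding the centre cell.
        total = 0
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if (dr, dc) == (0, 0):
                    continue
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    total += grid[rr][cc]
        return total

    new_grid = []
    for r in range(rows):
        new_row = []
        for c in range(cols):
            n = live_neighbours(r, c)
            alive = grid[r][c] == 1
            # A cell lives on with 2 or 3 neighbours; a dead cell is born with 3.
            new_row.append(1 if n == 3 or (alive and n == 2) else 0)
        new_grid.append(new_row)
    return new_grid


# A horizontal "blinker", which oscillates with period 2.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
next_gen = life_step(blinker)
```

Each cell's next state depends on an aggregate over the 3x3 window around its position, which is the kind of window-based computation that structural grouping is meant to express declaratively.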

Attachment: p1049-zhang.pdf (PDF, 1.88 MB)