Accelerating Raw Data Analytics (to be rescheduled)

Name: Accelerating Raw Data Analytics (to be rescheduled)
Start: 2019-11-11T17:30:00+01:00
End: 2019-11-11T18:35:00+01:00
Location: CERN

Monday 11 Nov 2019, 17:30 → 18:35 Europe/Zurich

40/R-B10 (CERN)

Description

Driven by a rapid increase in the quantity, types, and number of sources of data, many data analytics approaches use the data lake model. Data pieces in the lake vary in size, encoding, age, quality, etc., and they often lack consistency or any common schema. Analytics applications pull data from the lake as they see fit, digesting and processing it on demand for the analytics task at hand. The applications of data lakes are vast, including machine learning, data mining, artificial intelligence, ad hoc querying, and data visualization. However, the data transformation required by traditional analytical systems is a bottleneck that poses great challenges to the fast processing of raw data, which is critical for many of these applications.

In this presentation, we will discuss how ACCORDA addresses the data transformation bottleneck by applying acceleration, and how it avoids disrupting existing analytics software through a uniform worker model. This model is enabled by the in-memory integration of our small but highly efficient unstructured data processor, an application-specific instruction-set processor (ASIP). We will also briefly cover how these insights on accelerating analytics could apply to scientific computing, especially to analyses in high-energy physics.

- 17:30 → 18:00
  
  Accelerating Raw Data Analytics 30m
  
  Speaker: Chen Zou (University of Chicago)