COLUMBIA UNIVERSITY DSI W4121

Primer on classic database execution

Query Execution

Query compilation into plans

Query Optimization

Traditional optimizations

One size doesn’t fit all

Column Stores (OLAP)

Motivation

Overview

Physical layout

Why not implement in a row store?

Isn’t storing all these projections blowing up disk costs by several X?

Query execution walkthrough

    select avg(price)
    from data
    where symbol = 'GM' AND date = xxx
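
One way to picture how a column store might evaluate the query above is the minimal Python sketch below. It is illustrative only, not any particular system's implementation. The table contents, the `avg_price` helper, and the date literal standing in for the unspecified date are all assumptions made for this example. Each column is stored as its own array. The predicates are applied one column at a time to produce a list of matching positions, and the price column is read only at those positions.

```python
from statistics import mean

# Hypothetical column-oriented table: one list per column, aligned by position.
# All values are made up for illustration.
data = {
    "symbol": ["GM", "IBM", "GM", "GM", "AAPL"],
    "date": ["2020-01-02", "2020-01-02", "2020-01-02", "2020-01-03", "2020-01-02"],
    "price": [30.0, 135.0, 32.0, 31.0, 75.0],
}


def avg_price(table, symbol, date):
    """Evaluate: select avg(price) from data where symbol = ? AND date = ?"""
    # Scan only the symbol column and record the matching positions.
    positions = [i for i, s in enumerate(table["symbol"]) if s == symbol]
    # Probe the date column only at the positions that survived.
    positions = [i for i in positions if table["date"][i] == date]
    # Read the price column last, and only for qualifying positions.
    prices = [table["price"][i] for i in positions]
    # SQL's avg over an empty input is NULL; None stands in for it here.
    return mean(prices) if prices else None


# The date literal is an assumed stand-in for the unspecified value in the query.
result = avg_price(data, "GM", "2020-01-02")
print(result)
```

The sketch never touches columns the query does not mention, which is the main I/O saving relative to reading whole rows.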

Highlights:

Query Compilation

How is a SQL query executed?

Why?

Spark blog post