DuckDB has become well known as a lightweight, portable, and fast OLAP database.
While it excels as an embedded engine, could we push its boundaries further?
Could we build an actual data platform centered around DuckDB?
This is the idea behind Duckhouse.
Check the full article here
uv sync
uv run iceberg_over_flight.py serve -w warehouse -p 8816
curl https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2023-01.parquet -o /tmp/yellow_tripdata_2023-01.parquet
uv run ingestion/ingestion.py
cd dbt_xorq_project
export PYTHONPATH="$PWD:$PYTHONPATH"
dbt run
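The download step above fetches a single month of trip data with curl. As a minimal sketch, the same step could be scripted in Python so that the month is a parameter. The helper names (`trip_data_filename`, `trip_data_url`, `download_trip_data`) are hypothetical and not part of this repository, and the sketch assumes other months follow the same `yellow_tripdata_YYYY-MM.parquet` naming as the file used here.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL copied from the curl command in the setup steps.
BASE_URL = "https://d37ci6vzurychx.cloudfront.net/trip-data"


def trip_data_filename(year: int, month: int) -> str:
    """Build a file name such as yellow_tripdata_2023-01.parquet.

    Assumption: every month follows the naming of the file used in this README.
    """
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    return f"yellow_tripdata_{year:04d}-{month:02d}.parquet"


def trip_data_url(year: int, month: int) -> str:
    """Return the download URL for one month of trip data."""
    return f"{BASE_URL}/{trip_data_filename(year, month)}"


def download_trip_data(year: int, month: int, dest_dir: str = "/tmp") -> Path:
    """Download one month of trip data into dest_dir and return the local path."""
    target = Path(dest_dir) / trip_data_filename(year, month)
    urlretrieve(trip_data_url(year, month), str(target))
    return target
```

Calling `download_trip_data(2023, 1)` would mirror the curl command above and write to the same path under `/tmp`. Whether `ingestion/ingestion.py` can pick up a different month is not stated here, so treat other values as untested.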
- Reading and writing Iceberg tables with Flight Server
- `dbt run` using Flight plugin
- Filtering and column selection