DuckDB for Elixir applications, over DuckDB's experimental Quack protocol.
QuackDB gives Elixir applications an OTP-supervised DuckDB process, a DBConnection client, an Ecto adapter and query DSL for analytical DuckDB workflows, native append APIs, Explorer dataframe writes, Geo/WKB spatial integration, Table.Reader results, telemetry, and a managed DuckDB binary installer.
Warning
QuackDB targets DuckDB's experimental Quack protocol and is not production-ready yet. Public APIs, result shapes, Ecto adapter behavior, and supported protocol coverage may still change as DuckDB and QuackDB evolve. Validate behavior against your DuckDB version before relying on it for critical workloads.
defmodule MyApp.Analytics do
use QuackDB.Ecto
alias QuackDB.Source
def category_latency do
source = Source.parquet("s3://bucket/events/*.parquet", hive_partitioning: true)
from event in source,
group_by: event.category,
select: %{
category: event.category,
p95: quantile_cont(event.duration_ms, 0.95),
median: median(event.duration_ms),
events: count()
}
end
endDuckDB is already excellent at analytical SQL. QuackDB focuses on the Elixir side:
- run DuckDB as a supervised process during development, tests, notebooks, examples, or local apps;
- use DBConnection semantics for pooled sessions, transactions, streams, and query lifecycle;
- compose DuckDB analytical queries with Ecto instead of assembling SQL strings;
- use Elixir-native values such as
Duration,%Geo.*{},Date.Range, maps, lists, structs, and Explorer dataframes where possible; - append rows, columns, and Explorer dataframes through DuckDB's native append path;
- expose results to Livebook and dataframe tooling through
Table.Reader; - keep raw SQL available when DuckDB-specific syntax is clearer or not represented by Ecto.
| Elixir layer | QuackDB integration |
|---|---|
| OTP | supervised local DuckDB server, managed binary, restartable child specs |
| DBConnection | pooled Quack sessions, queries, streams, transactions |
| Ecto | adapter, query DSL, analytical helpers, migrations, writes |
| Explorer | dataframe append and dataframe-friendly results |
| Geo | %Geo.*{} params and WKB/GeoJSON workflows |
| Table.Reader | Livebook/dataframe-friendly result consumption |
| Telemetry | query, append, and fetch spans |
| Mix | quackdb.install task for managed DuckDB binaries |
Add QuackDB to your dependencies:
def deps do
[
{:quackdb, "~> 0.3.0"}
]
endOptional integrations are enabled when their packages are present:
def deps do
[
{:quackdb, "~> 0.3.0"},
{:ecto_sql, "~> 3.13"},
{:explorer, "~> 0.11"},
{:geo, "~> 4.1"}
]
endDuckDB's Quack protocol is experimental. For local development, use DuckDB 1.5.3 or newer with the quack extension.
For development, tests, examples, and notebooks, QuackDB can supervise DuckDB's CLI process and start a matching client pool. child_specs/1 generates one random token and injects it into both children.
children =
QuackDB.Server.child_specs(
server: [
name: MyApp.DuckDB,
duckdb: :managed,
endpoint: "quack:localhost:9494"
],
client: [
name: MyApp.QuackDB,
pool_size: System.schedulers_online()
]
)duckdb: :managed downloads DuckDB's official CLI binary on first use, verifies known checksums for QuackDB's pinned DuckDB version, and caches it. QuackDB never downloads DuckDB during dependency compilation. Use QUACKDB_BINARY_PATH, QUACKDB_BINARY_CACHE_DIR, duckdb: "/path/to/duckdb", or run the quackdb.install Mix task when you want explicit control. See the managed DuckDB guide.
You can also start DuckDB manually:
duckdb -interactive -init /dev/null \
-cmd "LOAD quack; CALL quack_serve('quack:localhost', token='super_secret');"quack:localhost often binds on IPv6 localhost, so examples use http://[::1]:9494.
QuackDB can be used directly through its DBConnection-backed client.
{:ok, conn} =
QuackDB.start_link(
uri: "http://[::1]:9494",
token: "super_secret"
)
{:ok, result} = QuackDB.query(conn, "SELECT ? AS name, ? AS n", ["duck", 42])
result.columns
#=> ["name", "n"]
result.rows
#=> [["duck", 42]]Use QuackDB.stream/4 for large result sets, or QuackDB.columns/4 when a column-oriented shape is more convenient for analytics tooling.
QuackDB exposes common DuckDB analytical workflows as Ecto-compatible helpers so they compose with normal queries.
defmodule MyApp.Analytics do
use QuackDB.Ecto
def category_scores do
from event in "events",
group_by: event.category,
select: %{
category: event.category,
p95: quantile_cont(event.duration_ms, 0.95),
median: median(event.duration_ms),
precise_sum: fsum(event.duration_ms),
mode: mode(event.duration_ms),
weighted_average: weighted_avg(event.duration_ms, event.weight),
values: list(event.duration_ms, order_by: [desc_nulls_last: event.duration_ms]),
slow_events: filter(count(event.id), event.duration_ms > 1_000),
distinct_users: count(event.user_id, :distinct),
average_duration: coalesce(avg(event.duration_ms), 0),
events: count()
}
end
enduse QuackDB.Ecto
from day in series(Date.range(~D[2024-01-01], ~D[2024-01-31])),
left_join: event in "events",
on: event.occurred_on == day.value,
group_by: day.value,
order_by: day.value,
select: %{
day: day.value,
events: count(event.id)
}Timestamp series use Duration steps:
from bucket in series(
~N[2024-01-01 00:00:00],
~N[2024-01-02 00:00:00],
step: Duration.new!(hour: 1)
),
select: bucket.valueDuckDB can query data where it already lives. QuackDB source helpers can be used directly as Ecto sources.
use QuackDB.Ecto
alias QuackDB.Source
source = Source.parquet("s3://bucket/events/*.parquet", hive_partitioning: true)
from event in source,
group_by: event.category,
select: %{
category: event.category,
events: count(),
avg_score: avg(event.score)
}QuackDB does not upload local files for you. The DuckDB server must be able to see the path, URL, object store, or lakehouse catalog. See the sources guide.
External data can be materialized with CREATE TABLE AS, indexed with DuckDB FTS, and queried with BM25 from Ecto.
use QuackDB.Ecto
alias QuackDB.{DDL, FTS, Source}
query =
from doc in Source.parquet("s3://bucket/docs/*.parquet"),
select: %{
id: doc.id,
title: doc.title,
body: doc.body
}
MyApp.AnalyticsRepo.query!(DDL.create_table("docs", as: query, temporary: true))
MyApp.AnalyticsRepo.query!(FTS.create_index("docs", :id, [:title, :body], overwrite: true))
schema = FTS.schema_name("main.docs")
search = "duckdb analytics"
from doc in "docs",
where: bm25(^schema, doc.id, ^search) > 0,
order_by: [desc: bm25(^schema, doc.id, ^search)],
limit: 10,
select: %{
id: doc.id,
title: doc.title,
score: bm25(^schema, doc.id, ^search)
}See the full-text search guide.
DuckDB text and RE2 regular-expression helpers compose with Ecto filters and aggregate FILTER clauses. Shared contains/2 dispatches obvious string calls to DuckDB contains and spatial expressions to ST_Contains; contains_text/2 and st_contains/2 are available when you want to be explicit.
use QuackDB.Ecto
from event in "events",
where: contains(event.name, "duck") and regexp_matches(event.name, ~r/^duck/i),
select: %{
slug: regexp_replace(event.name, ~r/\s+/, "-", "g"),
parts: string_split(event.tags, ",")
}DuckDB regexes use RE2, so ~r literals are intended for the syntax subset shared with Elixir regexes.
DuckDB Spatial works with Ecto queries and %Geo.*{} structs when the optional :geo package is installed.
use QuackDB.Ecto
import QuackDB.Ecto.Spatial
alias QuackDB.Spatial
MyApp.AnalyticsRepo.query!(Spatial.load())
point = %Geo.Point{coordinates: {13.405, 52.52}, srid: nil}
from place in "places",
where: intersects(place.geom, ^point) and distance(place.geom, ^point) < 1_000,
select: %{
id: place.id,
name: place.name,
wkt: as_text(place.geom)
}GEOMETRY values decode as WKB-compatible bytes for tested DuckDB Spatial values. QuackDB.Geometry can convert to/from %Geo.*{} structs when the optional :geo package is installed. See the spatial guide and the Spatial WMS example.
QuackDB can write through DuckDB's native append protocol instead of generating huge INSERT VALUES statements.
QuackDB.insert_rows!(conn, "events", [
[id: 1, name: "duck", tags: ["bird", "wetland"]],
[id: 2, name: "goose", tags: ["bird", "loud"]]
])
QuackDB.insert_columns!(conn, "measurements", [
id: [1, 2, 3],
temperature: [12.5, 13.0, 12.8]
])When Explorer is installed, dataframes can be appended directly:
alias Explorer.DataFrame
alias QuackDB.Explorer, as: QuackExplorer
frame = DataFrame.new(id: [1, 2], name: ["duck", "goose"])
QuackExplorer.insert_dataframe!(conn, "events", frame)Enumerable rows can be streamed into native append batches:
File.stream!("events.ndjson")
|> Stream.map(&Jason.decode!/1)
|> QuackDB.insert_stream!(conn, "events", chunk_every: 10_000)Any Table.Reader-compatible data can be appended through the same column append path:
QuackDB.insert_table!(conn, "events", %{id: [1, 2], name: ["duck", "goose"]})Append supports explicit types, batching, scalar DuckDB values, and nested LIST, STRUCT, ARRAY, and MAP values. See the Explorer guide.
QuackDB.Result and QuackDB.Columns implement Table.Reader, so they can be consumed by Livebook and other Table-aware tooling. When Explorer is installed, query results can be turned into dataframes:
result = QuackDB.query!(conn, "SELECT * FROM events")
Explorer.DataFrame.new(result)QuackDB emits telemetry spans for query, append, and fetch operations:
[:quackdb, :query, :start | :stop][:quackdb, :append, :start | :stop][:quackdb, :fetch, :start | :stop]
Metadata includes connection/session information, command details, append batch counts, and client query IDs. Params are not included unless you opt in with telemetry_params: true. See the telemetry guide.
QuackDB includes an optional Ecto SQL adapter for applications that want Ecto query composition, schema-based reads/writes, migrations, and raw SQL through Repo.query/3.
defmodule MyApp.AnalyticsRepo do
use Ecto.Repo,
otp_app: :my_app,
adapter: Ecto.Adapters.QuackDB
endThe adapter currently covers:
- raw SQL via
Repo.query/3; - schema-backed full selects and
Repo.get!/2; - analytical reads with joins, filters, grouping, windows, CTEs, combinations, locks, fragments, and QuackDB helper macros;
Repo.insert/2,Repo.insert_all/3,RETURNING,ON CONFLICT DO NOTHING, and commonDO UPDATEupserts;- explicit native append fast path via
insert_method: :append; update_all,delete_all, schemaupdate/delete, and transaction usage;Ecto.Adapters.SQL.explain/4;- basic migration DDL through Ecto migrator: create/drop/alter tables, columns, references, indexes, primary keys, check constraints, and renames.
DuckDB-specific SQL that Ecto cannot model cleanly should still use Repo.query/3. See the Ecto coverage matrix.
The repository includes runnable scripts, a Livebook notebook, and a small WMS app:
examples/ecto_analytics.exs— Ecto analytical aggregates, text predicates, and DuckDBSUMMARIZEprofiling.examples/dataframe_analytics.exs— derive DDL from an Ecto schema, append an Explorer dataframe, query with Ecto DSL, and return a dataframe.examples/full_text_search.exs— materialize a source scan, build a DuckDB FTS index, and query BM25 search through direct helpers and Ecto.examples/spatial_wms/— an Ash + Ecto + Plug/Bandit app serving DuckDB Spatial rows through a WMS-like GeoJSON endpoint.examples/query_observability.exs— attach telemetry handlers and print query, append, and fetch timings.examples/append_benchmark.exs— compares SQL inserts, native row/column append, Explorer append, and Ecto insert paths.examples/livebook_analytics.livemd— interactive analytics with DuckDB SQL, Explorer, Table.Reader, VegaLite, and telemetry.
Run scripts from outside the Mix project so Mix.install/2 can load the local package:
cd /tmp
elixir /path/to/quackdb/examples/dataframe_analytics.exsQuackDB is intentionally focused on DuckDB analytics over Quack:
- the Quack wire protocol is experimental and may change;
- unsupported vector/logical types raise explicit protocol errors;
- Ecto coverage focuses on analytical workflows and common write/setup paths, not every adapter edge case;
- QuackDB does not stage/upload local files to a remote server;
- Arrow IPC / zero-copy columnar handoff is research for now;
- managed DuckDB binary downloads currently cover Linux/macOS targets, not Windows.
- Getting started
- Type support
- Examples
- Managed DuckDB
- Explorer
- Sources
- Spatial
- Full-text search
- Telemetry
- Protocol coverage
- Ecto coverage
mix deps.get
mix ciSee CONTRIBUTING.md for local checks, example smoke tests, package audit steps, and release dry-run notes.
Integration tests are skipped by default. To run them against a Quack server:
QUACKDB_TEST_URI='http://[::1]:9494' \
QUACKDB_TEST_TOKEN=super_secret \
mix test --include integration