Featured image of post DuckDB Delta + Unity Catalog Complete Guide: Writing, Time Travel & Unified Governance

DuckDB Delta + Unity Catalog Complete Guide: Writing, Time Travel & Unified Governance

DuckDB's Delta extension now supports INSERT writes, time travel queries, and Unity Catalog integration. This guide covers everything with runnable code examples and monetization advice.

DuckDB Delta + Unity Catalog Complete Guide: Writing, Time Travel & Unified Governance

DuckDB Delta + Unity Catalog Architecture

Key Takeaway: DuckDB’s Delta and Unity Catalog extensions have officially graduated from experimental status. You can now perform INSERT writes, time travel queries, and unified governance across engines — all within DuckDB.

Why Should You Care About This Update?

In the data engineering world, “read-only analysis” has always been DuckDB’s biggest limitation. Traditionally, DuckDB excelled at rapidly scanning and analyzing Parquet, CSV, and JSON files — but when it came to data modifications (inserting new records, updating existing rows, deleting stale data), you had to switch to a different system entirely.

In May 2026, the DuckDB team announced two major milestones:

  1. Delta extension fully matured: Upgraded from experimental to production-ready, supporting INSERT, time travel, and bulk operations.
  2. Unity Catalog extension also matured: Providing unified data governance for Delta tables across engines.

The implication? You can now complete the entire data pipeline — write, version, govern, and analyze — within a single DuckDB session.

Delta Writes: From “Read-Only Analytics” to “Writable Data Lake”

Basic INSERT Operations

Assuming you already have a Delta table, you can insert data directly:

-- Attach the Delta table
ATTACH './path/to/my_delta_table' AS my_table (TYPE delta);

-- Single-row insert
INSERT INTO my_table
VALUES ('Question 2', 2), ('The Answer', 42);

-- Bulk insert from a query
INSERT INTO my_table
FROM (SELECT text || ' (copy)', code + 100 FROM my_table);

Transactional Writes

Multiple INSERT operations within a BEGIN/COMMIT block are merged into a single Delta version, ensuring atomicity:

BEGIN;

INSERT INTO my_table VALUES ('Alice', 30);
INSERT INTO my_table VALUES ('Bob', 25);
INSERT INTO my_table VALUES ('Charlie', 35);

COMMIT;
-- All three INSERTs are combined into one Delta commit

Comparison with Iceberg Write Capabilities

FeatureDelta ExtensionIceberg Extension
INSERT Support✅ Mature✅ Mature
UPDATE Support❌ Planned✅ Supported
DELETE Support❌ Planned✅ Supported
MERGE INTO❌ Planned✅ Supported
Time Travel✅ Supported✅ Supported
Bulk Write Optimization✅ Filter Pushdown✅ Filter Pushdown

Note: The Delta extension currently only supports INSERT. UPDATE/MERGE/DELETE are on the roadmap. If you need full CRUD capabilities, consider the Iceberg extension.

Time Travel: Query Any Historical Version

Delta’s time travel feature lets you query any historical version of a table. This is invaluable for auditing, data recovery, and version comparison scenarios.

Inline Version Queries

-- Attach the table normally
ATTACH './path/to/my_table' AS my_table (TYPE delta);

-- Query version 0 (initial state)
SELECT count() FROM my_table AT (VERSION => 0);
-- Returns 1 row (only Question 1)

-- Query version 1 (after first insert)
SELECT count() FROM my_table AT (VERSION => 1);
-- Returns 3 rows

-- Query the latest version
SELECT count() FROM my_table;
-- Returns 6 rows

Pinned Version Attachment

When you need a stable reference point, lock the version at attachment time:

-- Always points to version 1, unaffected by future writes
ATTACH './path/to/my_table' AS my_table_v1
    (TYPE delta, VERSION 1);

SELECT count() FROM my_table_v1;
-- Always returns 3

-- Pin to the snapshot at attach time
ATTACH './path/to/my_table' AS my_table_pinned
    (TYPE delta, PIN_SNAPSHOT);

SELECT count() FROM my_table_pinned;
-- Locks the latest version at attach time

Real-World Scenario: Data Audit Trail

-- Assume your transactions table receives incremental data daily
-- For month-end audits, you can precisely backtrack to any date's state

ATTACH './transactions' AS txn (TYPE delta);

-- Compare beginning-of-month vs. current totals
SELECT 
    (SELECT SUM(amount) FROM txn AT (VERSION => 100)) AS month_start_total,
    (SELECT SUM(amount) FROM txn) AS current_total;

Unity Catalog: Cross-Engine Unified Data Governance

What Is Unity Catalog?

Unity Catalog is an open standard for managing data assets (tables, volumes, models, functions) in a data lake. It transforms your data lake into a Lakehouse architecture, providing:

  • Unified Discovery: Find data assets across all engines from one place
  • Centralized Access Control: Fine-grained permission management
  • Audit Trail: Track who accessed what data and when
  • Cross-Engine Collaboration: DuckDB, Spark, Trino share the same metadata

Quick Start: OSS Unity Catalog

First, you need to spin up a Unity Catalog server. The official approach uses Docker:

# Clone the playground repo
git clone https://github.com/benfleis/duckdb-unitycatalog-playground.git
cd duckdb-unitycatalog-playground

# Build and start (auto-creates schema and test tables)
docker build -t duckdb-uc-playground .
docker run -p 8080:8080 duckdb-uc-playground

Connecting to Unity Catalog from DuckDB

-- Load the Unity Catalog extension
LOAD unity_catalog;

-- Create a Secret for the Unity Catalog server connection
CREATE SECRET (
    TYPE     unity_catalog,
    TOKEN    'demo-ignored-token',
    ENDPOINT 'http://localhost:8080'
);

-- Attach as a catalog in DuckDB
ATTACH 'unity' AS my_catalog
    (TYPE unity_catalog, DEFAULT_SCHEMA 'my_schema');

-- Query tables managed by Unity Catalog directly
SELECT name, age, adopted 
FROM my_catalog.pets 
ORDER BY name;

Databricks Unity Catalog Integration

If you’re using the managed Databricks version, the setup is similar:

CREATE SECRET (
    TYPE     unity_catalog,
    TOKEN    'databricks-pat-token',  -- Personal Access Token
    ENDPOINT 'https://your-workspace.cloud.databricks.com/'
);

ATTACH 'unity' AS dbx_catalog
    (TYPE unity_catalog, DEFAULT_SCHEMA 'analytics');

SELECT * FROM dbx_catalog.default.sales_data;

Complete Workflow: From Data Writing to Governed Analytics

Here’s a full example showing how to integrate Delta writes, time travel, and Unity Catalog:

-- ============================================
-- Step 1: Write data via Delta extension
-- ============================================
ATTACH './customer_data' AS customers (TYPE delta);

INSERT INTO customers VALUES ('C001', 'Zhang San', 25, TRUE);
INSERT INTO customers VALUES ('C002', 'Li Si', 30, FALSE);
INSERT INTO customers VALUES ('C003', 'Wang Wu', 28, TRUE);

-- ============================================
-- Step 2: Register governance via Unity Catalog
-- ============================================
LOAD unity_catalog;

CREATE SECRET (
    TYPE     unity_catalog,
    TOKEN    'demo-token',
    ENDPOINT 'http://localhost:8080'
);

ATTACH 'unity' AS uc (TYPE unity_catalog, DEFAULT_SCHEMA 'analytics');

-- ============================================
-- Step 3: Federated analysis across Delta + UC
-- ============================================
-- Query historical version of Delta table
SELECT count() FROM customers AT (VERSION => 0);

-- Execute permission-aware queries via UC
SELECT * FROM uc.analytics.customer_profiles;

Comparison with Traditional Solutions

DimensionDuckDB + Delta + UCSpark + Delta + UCTraditional (PostgreSQL)
Deployment Complexity⭐ Embedded, zero deps⭐⭐ Requires cluster⭐⭐ Requires setup
Write Performance⭐⭐ Medium scale OK⭐⭐⭐ Large-scale clusters⭐⭐ Limited on single node
Analytical Performance⭐⭐⭐ Columnar optimized⭐⭐ JVM overhead⭐ Row-store limitations
Time Travel✅ Native✅ Native❌ Needs extra tools
Data Governance✅ Unity Catalog✅ Unity Catalog❌ Self-built required
Learning Curve⭐ Low, SQL only⭐⭐⭐ High⭐⭐ Moderate
Best Use CaseSmall-medium analyticsLarge-scale processingOLTP workloads

Monetization Advice

1. Data Platform Productization

Using the DuckDB + Delta + Unity Catalog combination, you can build a lightweight data platform product:

  • Target Customers: SMEs, startups
  • Core Value: No Spark cluster needed — single-node data lake management and analytics
  • Pricing: Basic ¥2,000/month, Enterprise ¥8,000/month

2. Data Audit & Compliance Services

Time travel + Unity Catalog’s audit trail capability is perfect for regulated industries:

  • Target Customers: Finance, healthcare, e-commerce
  • Core Value: Precise data state rollback to any point in time for compliance audits
  • Delivery Format: Compliance toolkit + customized reporting

3. Real-Time Data Pipelines

Although Delta currently only supports INSERT, it’s sufficient for incremental data pipelines:

  • Target Customers: Content platforms, IoT data collection
  • Core Value: Low-cost data collection + version management solution
  • Tech Stack: DuckDB CLI + crontab + Delta writes

4. Consulting & Training

As Delta and Unity Catalog mature, market demand will grow rapidly:

  • Training Content: Lakehouse architecture design, Delta best practices, Unity Catalog permission models
  • Market Position: Help traditional enterprises migrate from data warehouses to data lakes
  • Revenue Streams: Corporate training ¥50,000+/day, consulting projects ¥200,000+/project

Summary

DuckDB’s Delta and Unity Catalog extensions reaching maturity marks a significant milestone — DuckDB is evolving from a pure “analytics engine” into a fully writable Lakehouse platform. For data engineers, this means:

  1. Write Capability: Incremental data pipelines via INSERT support
  2. Version Management: Time travel for data auditing and rollback
  3. Unified Governance: Unity Catalog for cross-engine data discovery and access control

While UPDATE/MERGE/DELETE are not yet available, the INSERT + time travel + Unity Catalog combination already covers approximately 80% of data lake use cases.


Based on DuckDB official blog post Delta Grows Up: Writes, Unity Catalog and Time Travel.

📺 Watch video tutorials → Olap Studio YouTube

Subscribe for more DuckDB & AI automation tutorials

Built with Hugo
Theme Stack designed by Jimmy

⚠️ This site is an independent community project, not affiliated with, endorsed by, or sponsored by the DuckDB Foundation or official DuckDB project.

"DuckDB" is a registered trademark of the DuckDB Foundation. This site uses the name solely for factual description purposes.

All content is for educational and community promotion purposes only and does not constitute any commercial service.