Featured image of post DuckDB CONNECT URI Syntax: Ephemeral Attach Makes Cross-Database Queries Elegant

DuckDB CONNECT URI Syntax: Ephemeral Attach Makes Cross-Database Queries Elegant

DuckDB introduces CONNECT 'URI' syntax for ephemeral database attachments, simplifying cross-database queries without manual ATTACH/DISCONNECT management. Learn usage, scenarios, and best practices.

DuckDB Connect URI Architecture

Introduction

In daily data analysis work, we often need to retrieve data from multiple sources — business tables in PostgreSQL, log files in Parquet format, remote databases via the Quack protocol, or even object storage on S3. DuckDB has always connected to these external data sources through the ATTACH statement, but the traditional approach has a pain point: you need to explicitly attach the database, execute queries, and then explicitly disconnect.

With the merge of PR #23555, DuckDB introduces a new syntax sugar — CONNECT 'URI' — making cross-database queries more elegant and concise.

Traditional Approach vs New Syntax

The Traditional ATTACH/DISCONNECT Pattern

In the past, to connect to a Quack remote database and query it, you would write:

-- Attach the remote database
ATTACH 'quack:localhost' AS remote_db (TOKEN 'my-secret-token');

-- Switch to the remote database context
CONNECT remote_db;

-- Execute queries
SELECT * FROM orders WHERE amount > 1000;

-- Switch back to the main database
CONNECT main;

-- Finally, remember to disconnect
DETACH remote_db;

While this code is fully functional, it’s verbose. Each connection requires four to five steps, and it’s easy to forget the final DETACH, leading to connection leaks.

The New CONNECT ‘URI’ Syntax

Now, you can accomplish the same thing with a single statement:

-- One-time connect, query, disconnect
CONNECT 'quack:localhost' (TOKEN 'my-secret-token');

-- Execute queries (automatically in the remote database context)
SELECT * FROM orders WHERE amount > 1000;

-- Done
DISCONNECT;

Or even further, if you only need to execute a single query and disconnect:

-- Connect, query immediately, disconnect
CONNECT 'quack:localhost' (TOKEN 'my-secret-token');
SELECT COUNT(*) FROM users WHERE active = true;
DISCONNECT;

The key insight is that CONNECT 'URI' is ephemeral — executing DISCONNECT not only closes the connection but also automatically cleans up all resources, including the context state that might be left behind by manual ATTACH.

How It Works Under the Hood

The CONNECT 'URI' syntax is internally equivalent to the following two-step operation:

CONNECT 'quack:localhost' (TOKEN 'qwerty')
-- Equivalent to
ATTACH 'quack:localhost' AS <auto_name> (TOKEN 'qwerty');
CONNECT <auto_name>;

And DISCONNECT is equivalent to:

DISCONNECT
-- Equivalent to
DETACH <auto_name>;

This design allows DuckDB to provide a cleaner interface for users while maintaining backward compatibility. For developers, this means:

  1. No need to manage database aliases: The system automatically generates temporary names
  2. Automatic cleanup: DISCONNECT performs DETACH simultaneously, preventing connection leaks
  3. Transparent context switching: After CONNECT, you’re automatically switched to the target database

Supported Database Types

Currently, CONNECT 'URI' primarily supports the following two protocols:

ProtocolURI FormatUse Case
Quackquack:host:portDuckDB remote protocol, cross-instance queries
PostgreSQLpostgres://host/dbDirect PostgreSQL database connection

Quack Protocol Example

-- Connect to a remote Quack service
CONNECT 'quack:***@db.example.com/analytics' ();

-- Query a PostgreSQL table
SELECT 
    p.product_name,
    SUM(o.quantity) AS total_sold
FROM products p
JOIN orders o ON p.id = o.product_id
GROUP BY p.product_name
ORDER BY total_sold DESC
LIMIT 10;

-- Disconnect
DISCONNECT;

Comparison with Traditional ATTACH

FeatureATTACH + CONNECTCONNECT 'URI'
Lines of code4-5 lines2-3 lines
Manage aliases neededYesNo
Automatic resource cleanupNo (requires manual DETACH)Yes (DISCONNECT auto-cleans)
Connection leak riskHighLow
ReadabilityAverageExcellent
Backward compatibilityFully compatibleNew syntax
Best use caseLong-term multi-db sessionsShort-lived single queries

Real-World Application Scenarios

Scenario 1: Temporary Data Export

When you need to export data from a remote database for local analysis:

-- Connect to the remote database
CONNECT 'quack:remote-server:5432' (TOKEN 'export-token');

-- Export data to Parquet
COPY (SELECT * FROM transactions WHERE year = 2025) 
TO '/tmp/transactions_2025.parquet' (FORMAT PARQUET);

-- Disconnect
DISCONNECT;

-- Local analysis
SELECT 
    region,
    AVG(amount) AS avg_transaction,
    MAX(amount) AS max_transaction
FROM read_parquet('/tmp/transactions_2025.parquet')
GROUP BY region;

Scenario 2: Cross-Database Join Queries

-- Connect to external data source
CONNECT 'quack:inventory-server:5432' (TOKEN 'inv-token');

-- Join query (external data + local data)
SELECT 
    p.product_name,
    i.stock_quantity,
    s.recent_sales
FROM products p
LEFT JOIN (
    SELECT product_id, SUM(quantity) AS recent_sales
    FROM local_sales
    GROUP BY product_id
) s ON p.id = s.product_id
LEFT JOIN inventory i ON p.id = i.product_id
WHERE i.stock_quantity < 10;

DISCONNECT;

Scenario 3: Automated Report Generation

import duckdb

conn = duckdb.connect(':memory:')

# One-time connection to remote database for data retrieval
result = conn.execute("""
    CONNECT 'quack:report-server:5432' (TOKEN 'report-token');
    SELECT * FROM daily_metrics WHERE date = CURRENT_DATE - INTERVAL '1' DAY;
    DISCONNECT;
""").fetchdf()

# Local processing
print(result.describe())

Performance Considerations

The performance of CONNECT 'URI' is essentially the same as traditional ATTACH, since they execute the same underlying operations. However, here are some optimization suggestions:

  1. Avoid frequent connect/disconnect cycles: If you need to query the same remote database multiple times, use the traditional ATTACH approach to reuse the connection for better efficiency.

  2. Connection timeouts: Establishing a remote connection takes network time. Set reasonable timeout values:

CONNECT 'quack:server:5432' (
    TOKEN 'my-token',
    TIMEOUT 30
);
  1. Batch operations first: If you need to perform extensive operations on remote data, consider copying it locally first:
CONNECT 'quack:big-data:5432' (TOKEN 'bulk-token');
COPY (SELECT * FROM huge_table) TO '/tmp/huge_table.parquet';
DISCONNECT;

-- Ultra-fast local analysis
SELECT * FROM read_parquet('/tmp/huge_table.parquet') WHERE ...;

Comparison with Other Database Tools

ToolCross-Database Query MethodEphemeral ConnectAuto Cleanup
DuckDB (New)CONNECT 'URI'
DuckDB (Old)ATTACH + CONNECT
PostgreSQL dblinkdblink_connect()
MySQL FEDERATEDTable-level
Spark SQLspark.read.jdbc()
TrinoCREATE TABLE ... AS

DuckDB’s CONNECT 'URI' syntax maintains SQL standard compatibility while offering a more streamlined ephemeral connection experience compared to traditional tools.

Monetization Ideas

Mastering DuckDB’s new connection syntax opens up several business opportunities:

  1. Data Integration Consulting ($300-800/project): Help enterprises build multi-source data integration solutions based on DuckDB, leveraging CONNECT 'URI' for rapid connection to various databases.

  2. Automated Reporting Products ($50-200/month subscription): Develop daily/weekly report auto-generation tools based on DuckDB that connect multiple data sources and aggregate outputs — ideal for small and medium businesses.

  3. Data Pipeline Template Library ($20-100/template): Encapsulate common CONNECT + COPY + analysis patterns into reusable templates and sell on tech communities or knowledge-sharing platforms.

  4. Enterprise Training ($500-1,500/day): Conduct advanced DuckDB training for data teams, focusing on real-world applications of the new connection syntax in business scenarios.

  5. SaaS Analytics Platform: Build a lightweight BI tool based on DuckDB, using CONNECT 'URI' to implement multi-tenant data isolation and cross-database analysis.

Summary

The CONNECT 'URI' syntax is an important usability improvement for DuckDB. It condenses what used to be a multi-step connection process into a single statement, while the automatic cleanup mechanism prevents connection leaks. For data analysts and engineers who frequently perform cross-database queries, this new feature can significantly boost productivity.

As the DuckDB ecosystem continues to evolve, we can expect more SQL standard compatibility improvements and user experience enhancements to follow.


References

📺 Watch video tutorials → Olap Studio YouTube

Subscribe for more DuckDB & AI automation tutorials

Built with Hugo
Theme Stack designed by Jimmy

⚠️ This site is an independent community project, not affiliated with, endorsed by, or sponsored by the DuckDB Foundation or official DuckDB project.

"DuckDB" is a registered trademark of the DuckDB Foundation. This site uses the name solely for factual description purposes.

All content is for educational and community promotion purposes only and does not constitute any commercial service.