Featured image of post DuckDB Java Integration Guide: JDBC, Maven/Gradle, and CRUD Examples

DuckDB Java Integration Guide: JDBC, Maven/Gradle, and CRUD Examples

Complete DuckDB Java integration tutorial covering Maven/Gradle setup, JDBC driver connection, CRUD operations, and DuckDB vs H2 comparison. Learn to use DuckDB efficiently in Java projects.

DuckDB Java Integration Overview

DuckDB provides a native JDBC driver that enables seamless integration with Java projects. Whether you’re doing data analysis, ETL processing, or building embedded analytics applications, DuckDB’s Java integration delivers an excellent development experience with minimal boilerplate code.

Key advantages of DuckDB Java:

  • Zero configuration: No database server setup, embed directly in Java applications
  • Standard JDBC: Fully compliant with JDBC 4.0 specification, minimal learning curve
  • Columnar storage: 10-100× faster than H2 and SQLite for analytical queries
  • Full SQL support: Window functions, CTEs, PIVOT, and more advanced analytics features

Step 1: Maven / Gradle Configuration

Maven Dependency

<dependency>
    <groupId>org.duckdb</groupId>
    <artifactId>duckdb_jdbc</artifactId>
    <version>1.2.0</version>
</dependency>

Gradle Dependency

implementation 'org.duckdb:duckdb_jdbc:1.2.0'

Verify Dependencies

# Maven
mvn dependency:tree | grep duckdb

# Gradle
gradle dependencies | grep duckdb

Step 2: JDBC Connection and Basic Operations

Establish a Connection

DuckDB Java supports two connection modes: in-memory and persistent file database.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DuckDBConnect {
    public static void main(String[] args) throws Exception {
        // Mode 1: In-memory database (data not persisted)
        Connection inMemConn = DriverManager.getConnection("jdbc:duckdb:");

        // Mode 2: Persistent file database
        Connection fileConn = DriverManager.getConnection(
            "jdbc:duckdb:/path/to/mydb.duckdb"
        );

        System.out.println("DuckDB Java connected successfully!");
    }
}

Create Table and Insert Data

try (Connection conn = DriverManager.getConnection("jdbc:duckdb:");
     Statement stmt = conn.createStatement()) {

    // Create table
    stmt.execute("CREATE TABLE employees (" +
        "id INTEGER, " +
        "name VARCHAR, " +
        "department VARCHAR, " +
        "salary DECIMAL(10,2)" +
    ")");

    // Insert data
    stmt.executeUpdate("INSERT INTO employees VALUES " +
        "(1, 'Alice', 'Engineering', 120000), " +
        "(2, 'Bob', 'Marketing', 95000), " +
        "(3, 'Charlie', 'Engineering', 110000), " +
        "(4, 'Diana', 'Sales', 85000)");

    System.out.println("Data inserted successfully!");
}

Query Data

try (Connection conn = DriverManager.getConnection("jdbc:duckdb:");
     Statement stmt = conn.createStatement()) {

    // Create test data
    stmt.execute("CREATE TABLE sales AS " +
        "SELECT * FROM (VALUES " +
        "('2026-01-01'::DATE, 'Product A', 1200.00), " +
        "('2026-01-02'::DATE, 'Product B', 850.00), " +
        "('2026-01-03'::DATE, 'Product A', 1500.00)" +
        ") AS t(sale_date, product, amount)");

    // Run analytical query
    ResultSet rs = stmt.executeQuery(
        "SELECT product, COUNT(*) AS orders, SUM(amount) AS total " +
        "FROM sales GROUP BY product ORDER BY total DESC"
    );

    while (rs.next()) {
        System.out.printf(
            "Product: %s, Orders: %d, Total: $%.2f%n",
            rs.getString("product"),
            rs.getInt("orders"),
            rs.getDouble("total")
        );
    }
}

Step 3: Advanced JDBC Usage

Using PreparedStatement

String sql = "SELECT department, AVG(salary) AS avg_salary " +
             "FROM employees WHERE salary > ? GROUP BY department";

try (Connection conn = DriverManager.getConnection("jdbc:duckdb:");
     PreparedStatement pstmt = conn.prepareStatement(sql)) {

    pstmt.setDouble(1, 90000);
    ResultSet rs = pstmt.executeQuery();

    while (rs.next()) {
        System.out.printf("%s: $%.2f%n",
            rs.getString("department"),
            rs.getDouble("avg_salary"));
    }
}

Query Files Directly

DuckDB Java’s greatest advantage — query external files without importing.

// Query CSV files
ResultSet rs = stmt.executeQuery(
    "SELECT region, SUM(revenue) AS total " +
    "FROM read_csv_auto('/data/sales_2026.csv') " +
    "GROUP BY region"
);

// Query Parquet files
ResultSet rs2 = stmt.executeQuery(
    "SELECT date_trunc('month', order_date) AS month, " +
    "       COUNT(*) AS orders " +
    "FROM '/data/orders.parquet' " +
    "GROUP BY month"
);

Batch Insert

conn.setAutoCommit(false);
PreparedStatement pstmt = conn.prepareStatement(
    "INSERT INTO logs VALUES (?, ?, ?)"
);

for (LogEntry entry : logBatch) {
    pstmt.setInt(1, entry.getId());
    pstmt.setString(2, entry.getLevel());
    pstmt.setString(3, entry.getMessage());
    pstmt.addBatch();
}

int[] results = pstmt.executeBatch();
conn.commit();

Step 4: DuckDB vs H2 Comparison

For Java developers, DuckDB and H2 are the two most common embedded databases. Here’s a head-to-head comparison:

Performance

FeatureDuckDBH2
Storage engineColumnarRow-based
Analytical queries⚡ 10-100× fasterSlower
Single-row queriesSlower⚡ Fast
Concurrent writesSingle writerMulti-writer
File queriesNative CSV/Parquet/JSONRequires import

Use Case Comparison

// DuckDB: analytical queries, large file processing
String analyticsSQL = """
    SELECT category,
           SUM(amount) AS total,
           AVG(amount) AS avg,
           PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY amount) AS median
    FROM read_csv_auto('sales_large.csv')
    GROUP BY category
    """;

// H2: OLTP workloads, web app transactions
String oltpSQL = """
    UPDATE users SET last_login = NOW()
    WHERE user_id = ?
    """;

Migration Guide

  • Complex analytics, large file processing, columnar storage → Choose DuckDB
  • High-concurrency transactions, row-level updates, web app backend → Choose H2
  • Best practice: H2 for transactions + DuckDB for analytics — they work great together

Complete Example: Analytics Application

import java.sql.*;
import java.util.Properties;

public class DuckDBAnalytics {
    public static void main(String[] args) throws Exception {
        // Configure DuckDB
        Properties props = new Properties();
        props.setProperty("threads", "4"); // parallelism

        try (Connection conn = DriverManager.getConnection(
                "jdbc:duckdb:", props);
             Statement stmt = conn.createStatement()) {

            // 1. Load extensions
            stmt.execute("INSTALL httpfs");
            stmt.execute("LOAD httpfs");

            // 2. Query Parquet files from S3
            ResultSet rs = stmt.executeQuery("""
                SELECT 
                    date_trunc('month', order_date) AS month,
                    product_category,
                    SUM(order_amount) AS revenue,
                    COUNT(*) AS transactions
                FROM read_parquet('s3://my-bucket/orders/*.parquet')
                WHERE order_date >= '2026-01-01'
                GROUP BY ALL
                ORDER BY month, revenue DESC
                LIMIT 20
            """);

            while (rs.next()) {
                System.out.printf("%s | %s | $%.2f | %d%n",
                    rs.getDate("month"),
                    rs.getString("product_category"),
                    rs.getDouble("revenue"),
                    rs.getInt("transactions"));
            }
        }
    }
}

Common Issues

1. JDBC Driver Class Not Found

Ensure the dependency is correctly configured:

<dependency>
    <groupId>org.duckdb</groupId>
    <artifactId>duckdb_jdbc</artifactId>
    <version>1.2.0</version>
</dependency>

2. Thread Safety

DuckDB JDBC connections are thread-safe, but it’s recommended to use separate connections per thread or a connection pool.

3. Memory Limits

// Set maximum memory
props.setProperty("memory_limit", "4GB");

Architecture Overview


📘 Blog: https://duckdblab.org #DuckDB #Java #JDBC #Database #Tutorial

📺 Watch video tutorials → DuckDB Lab YouTube

Subscribe for more DuckDB & AI automation tutorials

Built with Hugo
Theme Stack designed by Jimmy