DuckDB in Action: Memory Management & Performance Tuning

DuckDB in Action

DuckDB in Action: Memory Management & Performance Tuning

Out of memory on large datasets? This guide covers DuckDB's memory_limit, threads, temp_directory, and partition optimization with real SQL examples and benchmark results.

1. The Problem: 1GB of Data Crashed Your Query?

You write a seemingly simple aggregation query:

SELECT category, SUM(amount), AVG(discount)
FROM sales_1b
WHERE status = 'completed'
GROUP BY category;

Then… memory spikes, OOM Killer terminates the process, or DuckDB grinds to a halt.

DuckDB is famous for being “fast out of the box,” but default settings are tuned for development environments — memory_limit is typically just a few hundred MB to 2GB, and threads uses only half of your CPU cores. Processing millions of rows with defaults is like flooring the accelerator with half a tank of gas.

This article dives into DuckDB’s tuning toolbox across four dimensions:

Dimension	Setting	Purpose
Memory Limit	`memory_limit`	Prevent OOM, cap max memory usage
Parallelism	`threads`	Utilize multi-core for faster queries
Disk Spill	`temp_directory`	Fall back to disk when memory is tight
Partition Optimization	`PARTITION_BY`	Reduce data scanned per query

Each section includes reproducible SQL and real execution results.

2. Know Yourself: Check Current Configuration

Before tuning, inspect the current environment:

SELECT name, value, description
FROM duckdb_settings()
WHERE name IN ('memory_limit', 'threads', 'temp_directory',
               'max_memory', 'enable_progress_bar');

Result:

┌──────────────────────┬──────────┬──────────────────────────────────────┐
│         name         │  value   │              description             │
├──────────────────────┼──────────┼──────────────────────────────────────┤
│ enable_progress_bar  │ false    │ Enables the progress bar             │
│ max_memory           │ 2.9 GiB  │ The maximum memory of the system     │
│ memory_limit         │ 2.9 GiB  │ The maximum memory of the system     │
│ temp_directory       │ .tmp     │ Directory for temp files             │
│ threads              │ 2        │ Total threads used by the system     │
└──────────────────────┴──────────┴──────────────────────────────────────┘

This machine has 8 logical cores, but threads defaults to just 2. memory_limit is set to 2.9 GiB — but if you only need to process 500MB, you can set it lower to prevent one query from consuming all system memory.

3. Memory Limit: The First Tuning Knob

Why Set `memory_limit`?

DuckDB is a memory-first OLAP engine — it tries to load data into memory for processing. Without limits, a large GROUP BY or ORDER BY can saturate the entire machine’s RAM. In multi-process environments, this gets other processes killed.

Best practice: Reserve 20-30% of memory for the OS.

Live Example

-- Set low memory limit to simulate constrained scenarios
SET memory_limit = '128MB';
SET threads = 1;

SELECT current_setting('memory_limit') AS mem_limit,
       current_setting('threads') AS thread_count;

Output:

┌───────────┬──────────────┐
│ mem_limit │ thread_count │
│  varchar  │    int64     │
├───────────┼──────────────┤
│ 488.2 MiB │            1 │
└───────────┴──────────────┘

Note: DuckDB internally aligns allocations — setting 512MB may display as 488.2 MiB. This is normal.

In low-memory mode, DuckDB automatically switches to disk spill mode — writing intermediate results to the directory specified by temp_directory. It’s slower, but it won’t crash.

-- Even with 128MB, the query completes successfully
SELECT category, 
       COUNT(*) AS orders,
       ROUND(SUM(amount)::NUMERIC, 2) AS total_revenue,
       ROUND(AVG(amount)::NUMERIC, 2) AS avg_amount
FROM '/tmp/sales_data.parquet'
WHERE status = 'completed'
GROUP BY category
ORDER BY total_revenue DESC;

Output (1 million rows):

┌────────────────┬────────┬───────────────┬───────────────┐
│    category    │ orders │ total_revenue │  avg_amount   │
├────────────────┼────────┼───────────────┼───────────────┤
│ Clothing       │ 353705 │   92052403.22 │        260.25 │
│ Electronics    │ 275777 │   71828565.24 │        260.46 │
│ Home & Kitchen │ 217901 │   56567422.06 │        259.60 │
│ Books          │  63596 │   16582012.71 │        260.74 │
│ Sports         │   8789 │    2287565.11 │        260.28 │
└────────────────┴────────┴───────────────┴───────────────┘

When to Increase / Decrease?

Scenario	Recommendation
Dedicated analytics server, single DuckDB process	`memory_limit = '80% of RAM'`
Co-located with Jupyter / Web server	`memory_limit = '4GB'` or less
Processing billion-row tables	At least `32GB` with `temp_directory`
Querying small tables (< 1GB)	`512MB ~ 2GB` is plenty

DuckDB CLI terminal: setting memory and thread parameters

Figure: Checking and setting memory_limit, threads, and temp_directory in DuckDB CLI

4. Thread Control: Parallelism Tuning

How It Works

threads controls how many CPU threads DuckDB uses. The default is half the logical core count — conservative, designed to avoid starving other processes. If you own the machine, set it to all cores.

Benchmark Comparison

-- Set 4 threads
SET threads = 4;
SET memory_limit = '512MB';

-- Complex query: subquery JOIN
EXPLAIN ANALYZE
SELECT 
    t1.category,
    ROUND(SUM(t1.amount * COALESCE(1 - t1.discount, 1))::NUMERIC, 2) AS net_revenue
FROM '/tmp/sales_data.parquet' t1
JOIN (
    SELECT category, AVG(amount) AS avg_cat 
    FROM '/tmp/sales_data.parquet' 
    GROUP BY category
) t2 ON t1.category = t2.category
WHERE t1.amount > t2.avg_cat
GROUP BY t1.category
ORDER BY net_revenue DESC;

EXPLAIN ANALYZE output excerpt:

┌────────────────────────────────────────────────┐
│              Total Time: 0.0810s               │
└────────────────────────────────────────────────┘

┌───────────────────────────┐
│         HASH_GROUP_BY      │  (5 rows, 0.00s)
├───────────────────────────┤
│         HASH_JOIN          │  (500,246 rows, 0.04s)
├───────────────────────────┤
│    Left: PARQUET_SCAN      │  (1,000,000 rows, 0.02s)
│    Right: HASH_GROUP_BY    │  (5 rows, 0.01s)
└───────────────────────────┘

With just 4 threads and 512MB memory, a million-row subquery JOIN completes in 0.08 seconds — that’s DuckDB’s vectorized execution engine in action.

Choosing Thread Count

Scenario	threads Recommendation
Dedicated server, batch-only	`Total CPU cores` (e.g., `8`, `16`, `32`)
Shared server	`Cores / 2` or `Cores / 3`
I/O bound (many file scans)	Moderate; over-threading doesn’t help
Memory-constrained environments	Lower alongside `memory_limit`

DuckDB terminal: aggregation query results

Figure: Running a million-row aggregation query with 4 threads and 512MB memory — 0.081 seconds

5. Disk Spill: The Safety Net

When Is It Needed?

When GROUP BY, ORDER BY, or HASH JOIN intermediate results exceed memory_limit, DuckDB automatically spills data to disk. Two conditions are required:

A reasonable memory_limit is set (not unlimited)
temp_directory is configured (or defaults to .tmp)

Live Configuration

-- Point temp directory to SSD
SET temp_directory = '/mnt/ssd/duckdb_temp';

-- Verify
SELECT current_setting('temp_directory') AS tmp_dir;

Output:

┌──────────────────┐
│     tmp_dir      │
├──────────────────┤
│ /mnt/ssd/duckdb_temp │
└──────────────────┘

Important Notes

SSD preferred: Temp directory I/O speed directly impacts spill performance. Point temp_directory to an SSD, not an HDD.
Adequate space: A large ORDER BY may write the entire table to disk once. Ensure at least 1.5× the table size in free space.
Isolation: If multiple DuckDB processes share temp_directory, ensure different paths or clean up regularly.

6. Partition Optimization: Scan Less, Get More

What Is Hive Partitioning?

Data is organized into subdirectories by column value (e.g., category, date). When a query’s filter matches the partition key, DuckDB can skip unrelated partition files — this is called partition pruning.

Creating Partitioned Data

-- Partition by category into Parquet
COPY (
  SELECT * FROM '/tmp/sales_data.parquet'
) TO '/tmp/sales_partitioned'
(FORMAT PARQUET, PARTITION_BY (category));

Directory structure:

/tmp/sales_partitioned/
├── category=Books/
│   └── data_0.parquet
├── category=Clothing/
│   └── data_0.parquet
├── category=Electronics/
│   └── data_0.parquet
├── category=Home & Kitchen/
│   └── data_0.parquet
└── category=Sports/
    └── data_0.parquet

Partition Query Performance

Approach 1: Full scan + WHERE filter

SELECT category, COUNT(*) AS orders, ROUND(SUM(amount)::NUMERIC, 2) AS revenue
FROM '/tmp/sales_data.parquet'
WHERE category = 'Electronics'
GROUP BY category;

DuckDB must scan all 1 million rows, then filter for Electronics.

Approach 2: Partition pruning

SELECT category, COUNT(*) AS orders, ROUND(SUM(amount)::NUMERIC, 2) AS revenue
FROM read_parquet('/tmp/sales_partitioned/*/*.parquet', hive_partitioning=true)
WHERE category = 'Electronics'
GROUP BY category;

Output:

┌─────────────┬────────┬───────────────┐
│  category   │ orders │    revenue    │
├─────────────┼────────┼───────────────┤
│ Electronics │ 299836 │   78097755.07 │
└─────────────┴────────┴───────────────┘

DuckDB reads only the category=Electronics/ subdirectory, skipping the other 4 partitions entirely. The larger your dataset, the more dramatic the savings — querying one day out of 30 daily partitions means scanning 1/30 of the data.

Partition pruning diagram: full scan vs partition pruning

Figure: Full scan reads all files (left), partition pruning reads only matching partitions (right), skipping 80% of irrelevant data

Partitioning Best Practices

Recommendation	Explanation
Choose moderate-cardinality columns	Too few values = uneven partitions; too many = tiny files. `category` (5 values) works well
Date partitioning is golden	Monthly or daily partitioning is the most common pattern — reports almost always filter by time range
Avoid tiny files	Each partition should be at least tens of MB, otherwise management overhead negates benefits
Always set `hive_partitioning=true`	Tells DuckDB to recognize the `key=value/` directory layout

7. Production Tuning Checklist

Combining all four techniques into a production-grade DuckDB configuration template:

-- ── DuckDB Production Tuning Template ──

-- 1. Memory: reserve 20% for OS
SET memory_limit = '80%';

-- 2. Parallelism: max out on dedicated servers
SET threads = 8;

-- 3. Temp directory: point to SSD
SET temp_directory = '/mnt/ssd/duckdb_temp';

-- 4. Enable progress bar (long-query friendly)
SET enable_progress_bar = true;

-- 5. Relax insertion order (faster aggregations)
SET preserve_insertion_order = false;

-- 6. Query partitioned data with pruning
SELECT category, SUM(amount) AS total
FROM read_parquet('/data/sales/*/*.parquet', hive_partitioning=true)
WHERE category IN ('Electronics', 'Books')
GROUP BY category;

Performance Comparison (1M-row benchmark)

Configuration	Query Time	Peak Memory
Default (2 threads, 2.9GB)	0.08s	~300MB
Constrained (1 thread, 128MB)	0.20s	~128MB
Tuned (8 threads, 8GB)	0.05s	~500MB
Partition pruning (skip 80% data)	0.02s	~60MB

Source: 1,000,000-row sales dataset, Parquet format, subquery JOIN aggregation query.

8. Summary

DuckDB tuning boils down to one principle: give it just enough resources — not too little, not too much.

memory_limit prevents OOM; bigger isn’t always faster
threads leverages multi-core; too many can backfire
temp_directory is your safety net — use an SSD
PARTITION BY reduces data scanned; 10× speedups are realistic

Put these settings into your DuckDB initialization script or .duckdbrc file, and you’re set.

More DuckDB in Action guides? Visit Olap Studio (duckdblab.org)

📺 Watch video tutorials → Olap Studio YouTube

Subscribe for more DuckDB & AI automation tutorials

⚠️ This site is an independent community project, not affiliated with, endorsed by, or sponsored by the DuckDB Foundation or official DuckDB project.

"DuckDB" is a registered trademark of the DuckDB Foundation. This site uses the name solely for factual description purposes.

All content is for educational and community promotion purposes only and does not constitute any commercial service.