Early filtering

Reducing the amount of data processed early in a query can significantly cut execution time and resource use.

Filter as Early as Possible

Avoid applying filters after multiple JOINs or in outer queries. This can confuse the optimizer and prevent index usage.

Bad:

SELECT * 
FROM (
  SELECT * FROM customers JOIN orders ON ...
) 
WHERE country = 'DE';

Better:

SELECT * 
FROM customers 
JOIN orders ON ...
WHERE country = 'DE';

Avoid SELECT *

CrateDB is a columnar database. Selecting only the columns you need reduces disk I/O and memory usage.

Bad:

Better:

Avoid Large Result Sets

CrateDB excels at processing large datasets but is not optimized for returning large volumes of raw results. Use LIMIT or pagination when working with hundreds of thousands of rows.

See Fetching Large Result Sets from CrateDBarrow-up-right for batching examples.

Propagate LIMIT Clauses When Possible

Apply LIMIT early in subqueries when possible to reduce intermediate processing load.

Instead of:

Use:

Filter Before UNNEST

When working with arrays of objects, add a filter condition before UNNEST to minimize unnecessary data expansion.

Bad:

Better:

Last updated