CTEs vs JOINS

In some scenarios, especially with large or unevenly distributed tables, rewriting complex JOINs as Common Table Expressions (CTEs) can lead to more consistent and faster execution times.

This technique works by breaking down the query into smaller, focused parts—making it easier for the query planner to optimize execution. However, the effectiveness of this approach depends on the structure and distribution of your data.

For a deeper explanation, see the article on speeding up queries with CTEs.

Original Version (with JOINs)

SELECT SUM(quantity)
FROM invoices
JOIN invoice_items USING (invoice_number)
JOIN products USING (product_id)
WHERE product_description = 'super cool product'
  AND invoices.issue_date BETWEEN '2024-01-01' AND '2024-02-01';

This query joins three tables directly. On large datasets, this can trigger expensive join reordering or inefficient execution plans.

Improved Version (using CTEs)

WITH relevant_product_ids AS (
    SELECT product_id
    FROM products
    WHERE product_description = 'super cool product'
),
relevant_invoice_lines AS (
    SELECT invoice_number, quantity
    FROM invoice_items
    WHERE product_id IN (
        SELECT product_id FROM relevant_product_ids
    )
),
relevant_invoices AS (
    SELECT invoice_number, issue_date
    FROM invoices
    WHERE invoice_number IN (
        SELECT invoice_number FROM relevant_invoice_lines
    )
)
SELECT SUM(quantity)
FROM relevant_invoices
JOIN relevant_invoice_lines USING (invoice_number)
WHERE issue_date BETWEEN '2024-01-01' AND '2024-02-01';

This version isolates each filtering step:

  • First, find the matching product IDs.

  • Then, filter invoice items by those products.

  • Finally, restrict invoices to the relevant subset.

By doing this:

  • You reduce the amount of data involved in each step.

  • The optimizer has fewer paths to evaluate.

  • The overall query becomes easier to reason about and debug.

When to Use This Pattern

Use CTE-based rewrites when:

  • You're joining large tables with selective filters.

  • The planner appears to pick inefficient join orders.

  • You notice high variance in execution times for similar queries.

Avoid if:

  • The JOIN strategy is already optimal.

  • Your tables are small and well-indexed.

  • You need the absolute shortest execution time and the overhead of subqueries outweighs benefits.

Last updated