Common Table Expressions (CTEs)
Use this lesson to understand Common Table Expressions (CTEs) with practical syntax and examples.
Concept Overview
A Common Table Expression (CTE) is a named, reusable subquery defined with WITH that exists for the duration of a single SQL statement.
CTEs help you write SQL as a sequence of readable steps:
- extract a subset (paid orders)
- transform it (monthly revenue)
- filter and present it (top months)
Why is it important?
- Readability: breaks complex queries into named, reviewable chunks
- Correctness: encourages you to validate each step and keep logic explicit
- Reusability within a statement: reference the same intermediate set multiple times
- Planner control (PostgreSQL):
MATERIALIZED/NOT MATERIALIZEDcan influence inlining behavior
Where does it fit?
CTEs are part of advanced query structuring. They appear in analytics queries, ETL jobs, data audits, and migrations. Recursive CTEs (WITH RECURSIVE) build on this idea for hierarchical traversal.
Syntax & Rules
Core Syntax
WITH cte_name AS (
SELECT ...
)
SELECT ...
FROM cte_name;
Multiple CTEs:
WITH a AS (
SELECT ...
),
b AS (
SELECT ... FROM a
)
SELECT ...
FROM b;
Available Options / Parameters
| Feature | Syntax | Why you use it |
|---|---|---|
| CTE definition | WITH name AS (...) | name an intermediate result |
| Multiple CTEs | WITH a AS (...), b AS (...) | multi-step transformations |
| Recursive CTE | WITH RECURSIVE ... | hierarchies and iterative expansion |
| Materialization hint | WITH MATERIALIZED name AS (...) | force evaluation once, then reuse |
| Inline hint | WITH NOT MATERIALIZED name AS (...) | encourage planner to inline |
Key Rules and Considerations
- A CTE is scoped to a single statement.
- CTEs are not persisted objects (unlike tables or views).
- In modern PostgreSQL, a CTE may be inlined by the planner when safe.
- Use
MATERIALIZEDwhen you intentionally want to compute the CTE once and reuse it (or to create a planning “fence”). - Use
NOT MATERIALIZEDwhen the CTE is simple and you want the planner to optimize across it.
Step-by-Step Examples
Example 1: Break a Query into Steps (Beginner)
CREATE TABLE orders (
order_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
customer_id bigint NOT NULL,
status text NOT NULL,
amount numeric(12,2) NOT NULL CHECK (amount >= 0),
created_at date NOT NULL
);
INSERT INTO orders (customer_id, status, amount, created_at) VALUES
(10, 'paid', 25.00, '2026-01-01'),
(10, 'paid', 10.00, '2026-01-02'),
(11, 'paid', 50.00, '2026-01-03'),
(12, 'failed', 20.00, '2026-01-03'),
(13, 'paid', 40.00, '2026-02-01');
WITH paid_orders AS (
SELECT customer_id, amount, created_at
FROM orders
WHERE status = 'paid'
)
SELECT customer_id, SUM(amount) AS revenue
FROM paid_orders
GROUP BY customer_id
ORDER BY customer_id;
Expected output:
customer_id | revenue
-------------+---------
10 | 35.00
11 | 50.00
13 | 40.00
(3 rows)
Explanation:
- The CTE isolates “paid” logic so the main query reads like a report.
Example 2: Multiple CTEs as a Pipeline (Intermediate)
WITH paid_orders AS (
SELECT customer_id, amount, created_at
FROM orders
WHERE status = 'paid'
),
monthly AS (
SELECT
date_trunc('month', created_at::timestamptz) AS month,
SUM(amount) AS revenue
FROM paid_orders
GROUP BY 1
)
SELECT month, revenue
FROM monthly
ORDER BY month;
Expected output:
month | revenue
------------------------+---------
2026-01-01 00:00:00+00 | 85.00
2026-02-01 00:00:00+00 | 40.00
(2 rows)
Explanation:
- First step filters to paid orders.
- Second step aggregates by month.
Example 3: Reuse the Same CTE More Than Once (Advanced)
If you need to reference the same intermediate set multiple times, a CTE keeps it defined in one place.
WITH paid_orders AS (
SELECT customer_id, amount
FROM orders
WHERE status = 'paid'
)
SELECT
(SELECT COUNT(*) FROM paid_orders) AS paid_order_count,
(SELECT SUM(amount) FROM paid_orders) AS paid_revenue;
Expected output:
paid_order_count | paid_revenue
------------------+-------------
4 | 125.00
(1 row)
Explanation:
- This pattern can also be written with aggregates in one SELECT, but it demonstrates multi-reference reuse.
Example 4: CTE + DML with RETURNING (Advanced)
CTEs can stage a target set for UPDATE/DELETE and return affected rows.
CREATE TABLE jobs (
job_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
status text NOT NULL,
run_after timestamptz,
locked_at timestamptz
);
INSERT INTO jobs (status, run_after, locked_at) VALUES
('queued', now() - interval '5 minutes', NULL),
('queued', now() + interval '5 minutes', NULL),
('queued', now() - interval '1 minute', NULL);
WITH ready AS (
SELECT job_id
FROM jobs
WHERE status = 'queued'
AND (run_after IS NULL OR run_after <= now())
AND locked_at IS NULL
ORDER BY job_id
LIMIT 2
)
UPDATE jobs j
SET locked_at = now()
FROM ready r
WHERE j.job_id = r.job_id
RETURNING j.job_id, j.status, j.locked_at;
Expected output (shape):
job_id | status | locked_at
--------+--------+------------------------
1 | queued | <timestamp>
3 | queued | <timestamp>
(2 rows)
Explanation:
- The CTE selects a batch.
- The
UPDATE ... FROMapplies the change to that batch. RETURNINGverifies which rows were updated.
Practical Use Cases
1) Multi-step reporting queries
WITH base AS (
SELECT * FROM orders WHERE created_at >= current_date - 30
)
SELECT customer_id, COUNT(*)
FROM base
GROUP BY customer_id;
2) Debugging intermediate states
WITH step AS (
SELECT * FROM events WHERE payload @> '{"type":"purchase"}'
)
SELECT COUNT(*) FROM step;
3) Batch updates/deletes
WITH batch AS (
SELECT id FROM audit_events ORDER BY id LIMIT 10000
)
DELETE FROM audit_events e
USING batch b
WHERE e.id = b.id;
4) Dedup pipelines
WITH ranked AS (
SELECT id, row_number() OVER (PARTITION BY email ORDER BY created_at DESC) AS rn
FROM users
)
DELETE FROM users u
USING ranked r
WHERE u.id = r.id AND r.rn > 1;
5) Control planner behavior (rare, but useful)
WITH MATERIALIZED expensive AS (
SELECT ...
)
SELECT ... FROM expensive;
Common Mistakes & Troubleshooting
1) Thinking a CTE persists
Wrong mental model:
- expecting
WITH x AS (...)to create a reusable object.
Bad outcome:
- Query fails if you reference
xin a later statement.
Fix:
- Use a view (
CREATE VIEW ...) if you need persistence.
2) Using CTEs everywhere without measuring
Wrong approach:
- rewriting every subquery into a CTE.
Bad outcome:
- Harder to reason about performance; may block some optimizations depending on version and hints.
Fix:
- Use CTEs when they improve clarity or reuse, then validate performance with
EXPLAIN.
3) Repeating logic in multiple CTEs
Wrong approach:
- copying the same filter into multiple steps.
Bad outcome:
- drift and bugs when one copy changes.
Fix:
- Put the logic in one CTE and reference it.
4) Misusing MATERIALIZED/NOT MATERIALIZED
Bad outcome:
- Forcing materialization can increase memory and I/O.
- Forcing inlining can duplicate work if referenced multiple times.
Fix:
- Use hints only when you have evidence (plan inspection) that they help.
Debugging checklist:
- Run each CTE body as a standalone
SELECTto validate intermediate results. - Add
LIMITto check shapes and edge cases. - Use
EXPLAIN (ANALYZE, BUFFERS)for performance-critical queries. - If the same CTE is referenced multiple times, consider
MATERIALIZED. - If a simple CTE prevents optimization, consider
NOT MATERIALIZEDor inline it.
Best Practices
- ✅ Name CTEs after what they represent (
paid_orders,monthly_revenue). ❌ Avoid generic names (tmp,t1) in complex queries. - ✅ Keep each CTE focused on one transformation. ❌ Avoid 10-layer CTE chains without clear purpose.
- ✅ Prefer
FILTERfor conditional aggregates inside CTE pipelines. ❌ Avoid duplicating almost-identical CTEs for each condition. - ✅ Use
RETURNINGwith CTE-driven DML to verify changes. ❌ Avoid bulk updates without visibility into affected rows. - ✅ Measure performance when it matters.
❌ Avoid using
MATERIALIZED/NOT MATERIALIZEDas a default style rule.
Hands-On Practice
Use this setup for the exercises:
CREATE TABLE practice_orders (
order_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
customer_id bigint NOT NULL,
status text NOT NULL,
amount numeric(12,2) NOT NULL,
created_at date NOT NULL
);
INSERT INTO practice_orders (customer_id, status, amount, created_at) VALUES
(1, 'paid', 10.00, current_date),
(1, 'paid', 20.00, current_date),
(2, 'failed', 5.00, current_date);
Exercise 1 (Easy): One CTE
Task: Create a CTE of paid orders and return revenue per customer.
-- Your SQL here
Solution:
WITH paid AS (
SELECT customer_id, amount
FROM practice_orders
WHERE status = 'paid'
)
SELECT customer_id, SUM(amount) AS revenue
FROM paid
GROUP BY customer_id
ORDER BY customer_id;
Exercise 2 (Medium): Two-step pipeline
Task: Use two CTEs: one for paid orders, one that aggregates per customer.
-- Your SQL here
Solution:
WITH paid AS (
SELECT customer_id, amount
FROM practice_orders
WHERE status = 'paid'
),
per_customer AS (
SELECT customer_id, SUM(amount) AS revenue
FROM paid
GROUP BY customer_id
)
SELECT *
FROM per_customer
ORDER BY revenue DESC;
Exercise 3 (Advanced): Reuse a CTE twice
Task: Compute both the count and total amount of paid orders by referencing the same CTE twice.
-- Your SQL here
Solution:
WITH paid AS (
SELECT amount
FROM practice_orders
WHERE status = 'paid'
)
SELECT
(SELECT COUNT(*) FROM paid) AS paid_count,
(SELECT SUM(amount) FROM paid) AS paid_total;
Connection to Other Concepts
| Concept | Why it matters |
|---|---|
| Recursive CTEs | extend WITH to iterative/hierarchical queries |
| Temporary tables | alternative for multi-statement staging |
| Window functions | often used inside CTE pipelines for ranking/dedup |
| Transactions | wrap multi-step DML for safety |
EXPLAIN | validate performance of complex CTE queries |
Visual Learning Diagram
flowchart LR
A[Base Tables] --> B[CTE Step 1]
B --> C[CTE Step 2]
C --> D[Final SELECT]
B --> E[MATERIALIZED?]
E --> F[Reuse]
classDef allNodes fill:#3e3e3e,stroke:#ffffff,stroke-width:2px,color:#f5f5f5
classDef highlight fill:#3e3e3e,stroke:#ffffff,stroke-width:4px,color:#f5f5f5
class A,B,C,D,E,F allNodes
class B highlight
Common Pitfalls
| Pitfall | Consequence | Prevention |
|---|---|---|
| Treating CTEs as persisted objects | missing table errors | remember scope is one statement |
| Over-stacking CTE layers | hard-to-maintain queries | keep each step meaningful |
| Copy/paste logic across steps | drift and bugs | reuse one CTE instead |
| Incorrect hint usage | worse performance | add hints only with evidence |
| Not verifying intermediate results | silent errors | run CTE bodies standalone while debugging |
Quick Reference
WITH cte AS (SELECT ...) SELECT * FROM cte;
WITH a AS (...), b AS (...) SELECT * FROM b;
WITH MATERIALIZED x AS (SELECT ...) SELECT ... FROM x;
WITH NOT MATERIALIZED x AS (SELECT ...) SELECT ... FROM x;
WITH x AS (SELECT ...) UPDATE t SET ... FROM x WHERE ... RETURNING ...;
What's Next
- Next: Recursive CTE Patterns - Continue to the next concept with incremental complexity.
- Module Overview - Return to this module index and choose another related lesson.