Recursive CTE Patterns
Use this lesson to understand Recursive CTE Patterns with practical syntax and examples.
Concept Overview
Recursive CTEs (WITH RECURSIVE) let you iterate in SQL: you start from an anchor set of rows and repeatedly expand the result by joining back to a base table until no new rows are produced.
Common use cases:
- org charts (manager -> reportees)
- category trees (parent -> children)
- dependency chains (job -> downstream jobs)
- graph walks (limited depth)
Why is it important?
- Hierarchies are common: many schemas use adjacency lists (
parent_id) inside a single table - Correctness: recursion makes the termination condition explicit (depth, visited nodes)
- Performance: recursive queries can be efficient when the relationship key is indexed
Where does it fit?
Recursive CTEs build on normal CTEs. They are part of advanced SQL and connect to self joins, indexing, and window functions (for ordering and ranking results).
Syntax & Rules
Core Syntax
WITH RECURSIVE cte_name AS (
-- 1) Anchor term (starting rows)
SELECT ...
UNION ALL
-- 2) Recursive term (references cte_name)
SELECT ...
FROM base_table
JOIN cte_name ON ...
WHERE ...
)
SELECT *
FROM cte_name;
Available Options / Parameters
| Component | Purpose | Typical pattern |
|---|---|---|
| Anchor term | starting rows | WHERE parent_id IS NULL |
| Recursive term | expands from previous level | join base table to the CTE |
| UNION ALL | appends new rows | duplicates can be controlled via visited set |
| Termination | stops expansion | depth guard, visited guard, or no matches |
| Path / depth | diagnostics | depth + 1, `path |
Key Rules and Considerations
- Anchor and recursive terms must return the same columns with compatible types.
- Recursion stops when the recursive term produces zero rows.
- Always think about termination: depth limits and cycle prevention.
- Index the relationship column (for example,
manager_id,parent_id).
Step-by-Step Examples
Example 1: Traverse a Manager Hierarchy (Beginner)
CREATE TABLE staff (
staff_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
full_name text NOT NULL,
manager_id bigint REFERENCES staff(staff_id)
);
INSERT INTO staff (full_name, manager_id) VALUES
('CEO', NULL),
('VP Eng', 1),
('Eng Manager', 2),
('Engineer 1', 3),
('Engineer 2', 3),
('VP Sales', 1);
-- All descendants under VP Eng (staff_id = 2)
WITH RECURSIVE reports AS (
SELECT staff_id, full_name, manager_id, 0 AS depth
FROM staff
WHERE staff_id = 2
UNION ALL
SELECT s.staff_id, s.full_name, s.manager_id, r.depth + 1
FROM staff s
JOIN reports r ON s.manager_id = r.staff_id
)
SELECT staff_id, full_name, depth
FROM reports
ORDER BY depth, staff_id;
Expected output:
staff_id | full_name | depth
----------+---------------+------
2 | VP Eng | 0
3 | Eng Manager | 1
4 | Engineer 1 | 2
5 | Engineer 2 | 2
(4 rows)
Explanation:
- The anchor selects the starting node (VP Eng).
- The recursive term finds direct reportees of the previous level.
Example 2: Add a Path for Debugging and Display (Intermediate)
WITH RECURSIVE reports AS (
SELECT
staff_id,
full_name,
manager_id,
0 AS depth,
ARRAY[staff_id] AS path
FROM staff
WHERE staff_id = 2
UNION ALL
SELECT
s.staff_id,
s.full_name,
s.manager_id,
r.depth + 1,
r.path || s.staff_id
FROM staff s
JOIN reports r ON s.manager_id = r.staff_id
)
SELECT staff_id, full_name, depth, path
FROM reports
ORDER BY depth, staff_id;
Expected output:
staff_id | full_name | depth | path
----------+---------------+-------+-----------
2 | VP Eng | 0 | {2}
3 | Eng Manager | 1 | {2,3}
4 | Engineer 1 | 2 | {2,3,4}
5 | Engineer 2 | 2 | {2,3,5}
(4 rows)
Explanation:
- The
patharray is useful for debugging and for cycle prevention.
Example 3: Cycle Prevention (Advanced)
Cycles can exist in bad data (A manages B, B manages A). You can prevent infinite-like recursion by tracking visited IDs.
-- Introduce a cycle (DO NOT do this in real schemas)
UPDATE staff SET manager_id = 5 WHERE staff_id = 2;
WITH RECURSIVE walk AS (
SELECT
staff_id,
manager_id,
ARRAY[staff_id] AS path
FROM staff
WHERE staff_id = 2
UNION ALL
SELECT
s.staff_id,
s.manager_id,
w.path || s.staff_id
FROM staff s
JOIN walk w ON s.manager_id = w.staff_id
WHERE NOT s.staff_id = ANY(w.path)
)
SELECT staff_id, manager_id, path
FROM walk;
Expected output (shape):
staff_id | manager_id | path
----------+------------+---------
2 | 5 | {2}
3 | 2 | {2,3}
4 | 3 | {2,3,4}
5 | 3 | {2,3,5}
(4 rows)
Explanation:
- The
WHERE NOT s.staff_id = ANY(w.path)check blocks revisiting nodes.
Example 4: Depth Guard (Advanced)
Depth guards prevent unexpected huge expansions.
WITH RECURSIVE reports AS (
SELECT staff_id, full_name, manager_id, 0 AS depth
FROM staff
WHERE staff_id = 1
UNION ALL
SELECT s.staff_id, s.full_name, s.manager_id, r.depth + 1
FROM staff s
JOIN reports r ON s.manager_id = r.staff_id
WHERE r.depth < 5
)
SELECT staff_id, full_name, depth
FROM reports
ORDER BY depth, staff_id;
Expected outcome:
- Stops expanding after 5 levels even if the tree is deeper.
Practical Use Cases
1) Category trees
WITH RECURSIVE subtree AS (
SELECT category_id, parent_category_id
FROM categories
WHERE category_id = 42
UNION ALL
SELECT c.category_id, c.parent_category_id
FROM categories c
JOIN subtree s ON c.parent_category_id = s.category_id
)
SELECT * FROM subtree;
2) Dependency chains
WITH RECURSIVE deps AS (
SELECT job_id, depends_on_job_id
FROM job_dependencies
WHERE job_id = 100
UNION ALL
SELECT d.job_id, d.depends_on_job_id
FROM job_dependencies d
JOIN deps x ON d.job_id = x.depends_on_job_id
)
SELECT * FROM deps;
3) Generate sequences iteratively (usually better with generate_series)
WITH RECURSIVE nums(n) AS (
SELECT 1
UNION ALL
SELECT n + 1 FROM nums WHERE n < 10
)
SELECT n FROM nums;
4) Find broken parent references
SELECT c.category_id
FROM categories c
LEFT JOIN categories p ON p.category_id = c.parent_category_id
WHERE c.parent_category_id IS NOT NULL
AND p.category_id IS NULL;
5) Flatten nested structures for reporting
WITH RECURSIVE org AS (
SELECT staff_id, full_name, manager_id, 0 AS depth
FROM staff
WHERE manager_id IS NULL
UNION ALL
SELECT s.staff_id, s.full_name, s.manager_id, o.depth + 1
FROM staff s
JOIN org o ON s.manager_id = o.staff_id
)
SELECT * FROM org;
Common Mistakes & Troubleshooting
1) Anchor and recursive term return different columns
Wrong SQL:
WITH RECURSIVE x AS (
SELECT staff_id FROM staff
UNION ALL
SELECT staff_id, manager_id FROM staff
)
SELECT * FROM x;
Typical error:
ERROR: each UNION query must have the same number of columns
Fix:
- Ensure both terms return the same number of columns with compatible types.
2) No termination condition
Bad outcome:
- Very large outputs or long runtime.
Fix:
- Ensure the recursive join naturally stops producing new rows.
- Add depth guards or cycle checks if needed.
3) Cycles in data
Bad outcome:
- Recursion repeats paths.
Fix:
- Track visited IDs (path arrays) and block revisits.
4) Slow recursion on large tables
Bad outcome:
- Each recursive step becomes expensive.
Fix:
- Index the relationship column (for example,
manager_id). - Keep the recursive step narrow (select only needed columns).
Debugging checklist:
- Run the anchor query alone and verify the starting rows.
- Run the recursive term joined to a small sample of previous rows.
- Add a
depthcolumn and verify it increments. - Add
pathand cycle prevention if data quality is uncertain. - Use
EXPLAINand ensure the relationship key is indexed.
Best Practices
- ✅ Index parent/relationship columns used in recursion. ❌ Avoid recursive traversal on unindexed adjacency lists.
- ✅ Keep the recursive term lightweight. ❌ Avoid selecting wide payloads at every level.
- ✅ Add cycle prevention when data is not guaranteed acyclic. ❌ Avoid trusting data integrity without constraints.
- ✅ Add a depth guard in operational queries. ❌ Avoid unbounded recursion in production paths.
- ✅ Prefer
generate_series()for numeric/date sequences. ❌ Avoid recursive CTEs for simple sequence generation.
Hands-On Practice
Use this setup for the exercises:
CREATE TABLE practice_categories (
category_id bigint PRIMARY KEY,
parent_category_id bigint,
name text NOT NULL
);
INSERT INTO practice_categories (category_id, parent_category_id, name) VALUES
(1, NULL, 'Root'),
(2, 1, 'A'),
(3, 1, 'B'),
(4, 2, 'A1');
Exercise 1 (Easy): Traverse from root
Task: List all categories starting from category_id = 1 with depth.
-- Your SQL here
Solution:
WITH RECURSIVE tree AS (
SELECT category_id, parent_category_id, name, 0 AS depth
FROM practice_categories
WHERE category_id = 1
UNION ALL
SELECT c.category_id, c.parent_category_id, c.name, t.depth + 1
FROM practice_categories c
JOIN tree t ON c.parent_category_id = t.category_id
)
SELECT category_id, name, depth
FROM tree
ORDER BY depth, category_id;
Exercise 2 (Medium): Add a path
Task: Add a path array to the recursion.
-- Your SQL here
Solution:
WITH RECURSIVE tree AS (
SELECT category_id, parent_category_id, name, 0 AS depth, ARRAY[category_id] AS path
FROM practice_categories
WHERE category_id = 1
UNION ALL
SELECT c.category_id, c.parent_category_id, c.name, t.depth + 1, t.path || c.category_id
FROM practice_categories c
JOIN tree t ON c.parent_category_id = t.category_id
)
SELECT category_id, name, depth, path
FROM tree
ORDER BY depth, category_id;
Exercise 3 (Advanced): Prevent cycles
Task: Add a cycle prevention predicate that blocks revisiting category IDs.
-- Your SQL here
Solution:
WITH RECURSIVE tree AS (
SELECT category_id, parent_category_id, name, ARRAY[category_id] AS path
FROM practice_categories
WHERE category_id = 1
UNION ALL
SELECT c.category_id, c.parent_category_id, c.name, t.path || c.category_id
FROM practice_categories c
JOIN tree t ON c.parent_category_id = t.category_id
WHERE NOT c.category_id = ANY(t.path)
)
SELECT category_id, name, path
FROM tree;
Connection to Other Concepts
| Concept | Why it matters |
|---|---|
| Common Table Expressions (CTEs) | recursive CTEs extend the WITH idea |
| SELF JOIN | recursion is repeated self-joins across levels |
| Indexes | relationship key indexes determine performance |
| Window functions | useful for ordering/ranking flattened hierarchies |
| Constraints | enforcing FK relationships helps prevent broken trees |
Visual Learning Diagram
flowchart TD
A[Anchor Rows] --> B[Recursive Step]
B --> C[Append Rows]
C --> B
C --> D[No New Rows]
D --> E[Final Result]
B --> F[Depth Guard]
B --> G[Cycle Prevention]
classDef allNodes fill:#3e3e3e,stroke:#ffffff,stroke-width:2px,color:#f5f5f5
classDef highlight fill:#3e3e3e,stroke:#ffffff,stroke-width:4px,color:#f5f5f5
class A,B,C,D,E,F,G allNodes
class B highlight
Common Pitfalls
| Pitfall | Consequence | Prevention |
|---|---|---|
| No termination logic | runaway query | add depth/path guards |
| Cycles in data | repeated expansions | track visited IDs |
| Unindexed relationship key | slow recursion | add index on parent_id/manager_id |
| Returning different column sets | query errors | keep anchor and step aligned |
| Overusing recursion for sequences | unnecessary complexity | use generate_series() |
Quick Reference
WITH RECURSIVE x AS (SELECT ... UNION ALL SELECT ... FROM t JOIN x ON ...) SELECT * FROM x;
WITH RECURSIVE x AS (...) SELECT * FROM x WHERE depth < 10;
WITH RECURSIVE x AS (...) SELECT * FROM x WHERE NOT id = ANY(path);
WITH RECURSIVE nums(n) AS (SELECT 1 UNION ALL SELECT n+1 FROM nums WHERE n < 10) SELECT * FROM nums;
WITH RECURSIVE tree AS (...) SELECT * FROM tree ORDER BY depth;
What's Next
- Previous: Common Table Expressions (CTEs) - Review the previous lesson to reinforce context.
- Next: Temporary Tables - Continue to the next concept with incremental complexity.
- Module Overview - Return to this module index and choose another related lesson.