Skip to main content

Recursive CTE Patterns

Learning Focus

Use this lesson to understand Recursive CTE Patterns with practical syntax and examples.

Concept Overview

Recursive CTEs (WITH RECURSIVE) let you iterate in SQL: you start from an anchor set of rows and repeatedly expand the result by joining back to a base table until no new rows are produced.

Common use cases:

  • org charts (manager -> reportees)
  • category trees (parent -> children)
  • dependency chains (job -> downstream jobs)
  • graph walks (limited depth)

Why is it important?

  • Hierarchies are common: many schemas use adjacency lists (parent_id) inside a single table
  • Correctness: recursion makes the termination condition explicit (depth, visited nodes)
  • Performance: recursive queries can be efficient when the relationship key is indexed

Where does it fit?

Recursive CTEs build on normal CTEs. They are part of advanced SQL and connect to self joins, indexing, and window functions (for ordering and ranking results).


Syntax & Rules

Core Syntax

WITH RECURSIVE cte_name AS (
-- 1) Anchor term (starting rows)
SELECT ...

UNION ALL

-- 2) Recursive term (references cte_name)
SELECT ...
FROM base_table
JOIN cte_name ON ...
WHERE ...
)
SELECT *
FROM cte_name;

Available Options / Parameters

ComponentPurposeTypical pattern
Anchor termstarting rowsWHERE parent_id IS NULL
Recursive termexpands from previous leveljoin base table to the CTE
UNION ALLappends new rowsduplicates can be controlled via visited set
Terminationstops expansiondepth guard, visited guard, or no matches
Path / depthdiagnosticsdepth + 1, `path

Key Rules and Considerations

  • Anchor and recursive terms must return the same columns with compatible types.
  • Recursion stops when the recursive term produces zero rows.
  • Always think about termination: depth limits and cycle prevention.
  • Index the relationship column (for example, manager_id, parent_id).

Step-by-Step Examples

Example 1: Traverse a Manager Hierarchy (Beginner)

CREATE TABLE staff (
staff_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
full_name text NOT NULL,
manager_id bigint REFERENCES staff(staff_id)
);

INSERT INTO staff (full_name, manager_id) VALUES
('CEO', NULL),
('VP Eng', 1),
('Eng Manager', 2),
('Engineer 1', 3),
('Engineer 2', 3),
('VP Sales', 1);

-- All descendants under VP Eng (staff_id = 2)
WITH RECURSIVE reports AS (
SELECT staff_id, full_name, manager_id, 0 AS depth
FROM staff
WHERE staff_id = 2

UNION ALL

SELECT s.staff_id, s.full_name, s.manager_id, r.depth + 1
FROM staff s
JOIN reports r ON s.manager_id = r.staff_id
)
SELECT staff_id, full_name, depth
FROM reports
ORDER BY depth, staff_id;

Expected output:

 staff_id | full_name     | depth
----------+---------------+------
2 | VP Eng | 0
3 | Eng Manager | 1
4 | Engineer 1 | 2
5 | Engineer 2 | 2
(4 rows)

Explanation:

  • The anchor selects the starting node (VP Eng).
  • The recursive term finds direct reportees of the previous level.

Example 2: Add a Path for Debugging and Display (Intermediate)

WITH RECURSIVE reports AS (
SELECT
staff_id,
full_name,
manager_id,
0 AS depth,
ARRAY[staff_id] AS path
FROM staff
WHERE staff_id = 2

UNION ALL

SELECT
s.staff_id,
s.full_name,
s.manager_id,
r.depth + 1,
r.path || s.staff_id
FROM staff s
JOIN reports r ON s.manager_id = r.staff_id
)
SELECT staff_id, full_name, depth, path
FROM reports
ORDER BY depth, staff_id;

Expected output:

 staff_id | full_name     | depth | path
----------+---------------+-------+-----------
2 | VP Eng | 0 | {2}
3 | Eng Manager | 1 | {2,3}
4 | Engineer 1 | 2 | {2,3,4}
5 | Engineer 2 | 2 | {2,3,5}
(4 rows)

Explanation:

  • The path array is useful for debugging and for cycle prevention.

Example 3: Cycle Prevention (Advanced)

Cycles can exist in bad data (A manages B, B manages A). You can prevent infinite-like recursion by tracking visited IDs.

-- Introduce a cycle (DO NOT do this in real schemas)
UPDATE staff SET manager_id = 5 WHERE staff_id = 2;

WITH RECURSIVE walk AS (
SELECT
staff_id,
manager_id,
ARRAY[staff_id] AS path
FROM staff
WHERE staff_id = 2

UNION ALL

SELECT
s.staff_id,
s.manager_id,
w.path || s.staff_id
FROM staff s
JOIN walk w ON s.manager_id = w.staff_id
WHERE NOT s.staff_id = ANY(w.path)
)
SELECT staff_id, manager_id, path
FROM walk;

Expected output (shape):

 staff_id | manager_id | path
----------+------------+---------
2 | 5 | {2}
3 | 2 | {2,3}
4 | 3 | {2,3,4}
5 | 3 | {2,3,5}
(4 rows)

Explanation:

  • The WHERE NOT s.staff_id = ANY(w.path) check blocks revisiting nodes.

Example 4: Depth Guard (Advanced)

Depth guards prevent unexpected huge expansions.

WITH RECURSIVE reports AS (
SELECT staff_id, full_name, manager_id, 0 AS depth
FROM staff
WHERE staff_id = 1

UNION ALL

SELECT s.staff_id, s.full_name, s.manager_id, r.depth + 1
FROM staff s
JOIN reports r ON s.manager_id = r.staff_id
WHERE r.depth < 5
)
SELECT staff_id, full_name, depth
FROM reports
ORDER BY depth, staff_id;

Expected outcome:

  • Stops expanding after 5 levels even if the tree is deeper.

Practical Use Cases

1) Category trees

WITH RECURSIVE subtree AS (
SELECT category_id, parent_category_id
FROM categories
WHERE category_id = 42
UNION ALL
SELECT c.category_id, c.parent_category_id
FROM categories c
JOIN subtree s ON c.parent_category_id = s.category_id
)
SELECT * FROM subtree;

2) Dependency chains

WITH RECURSIVE deps AS (
SELECT job_id, depends_on_job_id
FROM job_dependencies
WHERE job_id = 100
UNION ALL
SELECT d.job_id, d.depends_on_job_id
FROM job_dependencies d
JOIN deps x ON d.job_id = x.depends_on_job_id
)
SELECT * FROM deps;

3) Generate sequences iteratively (usually better with generate_series)

WITH RECURSIVE nums(n) AS (
SELECT 1
UNION ALL
SELECT n + 1 FROM nums WHERE n < 10
)
SELECT n FROM nums;

4) Find broken parent references

SELECT c.category_id
FROM categories c
LEFT JOIN categories p ON p.category_id = c.parent_category_id
WHERE c.parent_category_id IS NOT NULL
AND p.category_id IS NULL;

5) Flatten nested structures for reporting

WITH RECURSIVE org AS (
SELECT staff_id, full_name, manager_id, 0 AS depth
FROM staff
WHERE manager_id IS NULL
UNION ALL
SELECT s.staff_id, s.full_name, s.manager_id, o.depth + 1
FROM staff s
JOIN org o ON s.manager_id = o.staff_id
)
SELECT * FROM org;

Common Mistakes & Troubleshooting

1) Anchor and recursive term return different columns

Wrong SQL:

WITH RECURSIVE x AS (
SELECT staff_id FROM staff
UNION ALL
SELECT staff_id, manager_id FROM staff
)
SELECT * FROM x;

Typical error:

ERROR:  each UNION query must have the same number of columns

Fix:

  • Ensure both terms return the same number of columns with compatible types.

2) No termination condition

Bad outcome:

  • Very large outputs or long runtime.

Fix:

  • Ensure the recursive join naturally stops producing new rows.
  • Add depth guards or cycle checks if needed.

3) Cycles in data

Bad outcome:

  • Recursion repeats paths.

Fix:

  • Track visited IDs (path arrays) and block revisits.

4) Slow recursion on large tables

Bad outcome:

  • Each recursive step becomes expensive.

Fix:

  • Index the relationship column (for example, manager_id).
  • Keep the recursive step narrow (select only needed columns).

Debugging checklist:

  1. Run the anchor query alone and verify the starting rows.
  2. Run the recursive term joined to a small sample of previous rows.
  3. Add a depth column and verify it increments.
  4. Add path and cycle prevention if data quality is uncertain.
  5. Use EXPLAIN and ensure the relationship key is indexed.

Best Practices

  • ✅ Index parent/relationship columns used in recursion. ❌ Avoid recursive traversal on unindexed adjacency lists.
  • ✅ Keep the recursive term lightweight. ❌ Avoid selecting wide payloads at every level.
  • ✅ Add cycle prevention when data is not guaranteed acyclic. ❌ Avoid trusting data integrity without constraints.
  • ✅ Add a depth guard in operational queries. ❌ Avoid unbounded recursion in production paths.
  • ✅ Prefer generate_series() for numeric/date sequences. ❌ Avoid recursive CTEs for simple sequence generation.

Hands-On Practice

Use this setup for the exercises:

CREATE TABLE practice_categories (
category_id bigint PRIMARY KEY,
parent_category_id bigint,
name text NOT NULL
);

INSERT INTO practice_categories (category_id, parent_category_id, name) VALUES
(1, NULL, 'Root'),
(2, 1, 'A'),
(3, 1, 'B'),
(4, 2, 'A1');

Exercise 1 (Easy): Traverse from root

Task: List all categories starting from category_id = 1 with depth.

-- Your SQL here

Solution:

WITH RECURSIVE tree AS (
SELECT category_id, parent_category_id, name, 0 AS depth
FROM practice_categories
WHERE category_id = 1
UNION ALL
SELECT c.category_id, c.parent_category_id, c.name, t.depth + 1
FROM practice_categories c
JOIN tree t ON c.parent_category_id = t.category_id
)
SELECT category_id, name, depth
FROM tree
ORDER BY depth, category_id;

Exercise 2 (Medium): Add a path

Task: Add a path array to the recursion.

-- Your SQL here

Solution:

WITH RECURSIVE tree AS (
SELECT category_id, parent_category_id, name, 0 AS depth, ARRAY[category_id] AS path
FROM practice_categories
WHERE category_id = 1
UNION ALL
SELECT c.category_id, c.parent_category_id, c.name, t.depth + 1, t.path || c.category_id
FROM practice_categories c
JOIN tree t ON c.parent_category_id = t.category_id
)
SELECT category_id, name, depth, path
FROM tree
ORDER BY depth, category_id;

Exercise 3 (Advanced): Prevent cycles

Task: Add a cycle prevention predicate that blocks revisiting category IDs.

-- Your SQL here

Solution:

WITH RECURSIVE tree AS (
SELECT category_id, parent_category_id, name, ARRAY[category_id] AS path
FROM practice_categories
WHERE category_id = 1
UNION ALL
SELECT c.category_id, c.parent_category_id, c.name, t.path || c.category_id
FROM practice_categories c
JOIN tree t ON c.parent_category_id = t.category_id
WHERE NOT c.category_id = ANY(t.path)
)
SELECT category_id, name, path
FROM tree;

Connection to Other Concepts

ConceptWhy it matters
Common Table Expressions (CTEs)recursive CTEs extend the WITH idea
SELF JOINrecursion is repeated self-joins across levels
Indexesrelationship key indexes determine performance
Window functionsuseful for ordering/ranking flattened hierarchies
Constraintsenforcing FK relationships helps prevent broken trees

Visual Learning Diagram

flowchart TD
A[Anchor Rows] --> B[Recursive Step]
B --> C[Append Rows]
C --> B
C --> D[No New Rows]
D --> E[Final Result]
B --> F[Depth Guard]
B --> G[Cycle Prevention]

classDef allNodes fill:#3e3e3e,stroke:#ffffff,stroke-width:2px,color:#f5f5f5
classDef highlight fill:#3e3e3e,stroke:#ffffff,stroke-width:4px,color:#f5f5f5
class A,B,C,D,E,F,G allNodes
class B highlight

Common Pitfalls

PitfallConsequencePrevention
No termination logicrunaway queryadd depth/path guards
Cycles in datarepeated expansionstrack visited IDs
Unindexed relationship keyslow recursionadd index on parent_id/manager_id
Returning different column setsquery errorskeep anchor and step aligned
Overusing recursion for sequencesunnecessary complexityuse generate_series()

Quick Reference

WITH RECURSIVE x AS (SELECT ... UNION ALL SELECT ... FROM t JOIN x ON ...) SELECT * FROM x;
WITH RECURSIVE x AS (...) SELECT * FROM x WHERE depth < 10;
WITH RECURSIVE x AS (...) SELECT * FROM x WHERE NOT id = ANY(path);
WITH RECURSIVE nums(n) AS (SELECT 1 UNION ALL SELECT n+1 FROM nums WHERE n < 10) SELECT * FROM nums;
WITH RECURSIVE tree AS (...) SELECT * FROM tree ORDER BY depth;

What's Next