Skip to main content

GROUP BY with ROLLUP / GROUPING SETS

Learning Focus

Use this lesson to understand GROUP BY with ROLLUP / GROUPING SETS with practical syntax and examples.

Concept Overview

When you build reporting queries, you often want subtotals and grand totals alongside the “leaf” groups. PostgreSQL supports advanced grouping constructs that let you generate these totals in a single query:

  • ROLLUP (...) - hierarchical subtotals in a specific order
  • GROUPING SETS (...) - explicit control over which grouping levels to produce
  • CUBE (...) - all combinations of grouping columns (use carefully)

Why is it important?

  • Reporting output: subtotal rows reduce the need for multiple queries
  • Correctness: totals are computed from the same filtered dataset as the leaf rows
  • Maintainability: grouping sets keep “levels” explicit and avoid copy/paste unions

Where does it fit?

These constructs extend GROUP BY. They are usually paired with aggregate functions, and often with GROUPING() to label subtotal/total rows without confusing them with real NULL data.


Syntax & Rules

Core Syntax

-- ROLLUP: leaf -> subtotal -> grand total
SELECT a, b, SUM(v)
FROM t
GROUP BY ROLLUP (a, b);

-- GROUPING SETS: explicit levels
SELECT a, b, SUM(v)
FROM t
GROUP BY GROUPING SETS ((a, b), (a), ());

-- CUBE: all combinations
SELECT a, b, SUM(v)
FROM t
GROUP BY CUBE (a, b);

Available Options / Parameters

ConstructWhat it producesNotes
ROLLUP(a, b)(a,b), (a), ()hierarchical totals in order
GROUPING SETS(...)exactly the sets you listbest for custom report levels
CUBE(a, b)(a,b), (a), (b), ()grows quickly as you add columns
GROUPING(col)1 if col is “rolled up”use to label subtotal rows

Key Rules and Considerations

  • Subtotal rows often contain NULL in grouping columns. That NULL can mean “subtotal”, not “real NULL data”.
  • Use GROUPING(col) to tell the difference.
  • Sort order is important for report readability.
  • Filter with WHERE before grouping so every total uses the same population.

Step-by-Step Examples

Example 1: Subtotals and Grand Total with ROLLUP (Beginner)

CREATE TABLE sales (
sale_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
region text NOT NULL,
product_line text NOT NULL,
revenue numeric(12,2) NOT NULL CHECK (revenue >= 0)
);

INSERT INTO sales (region, product_line, revenue) VALUES
('EU', 'Hardware', 45000.00),
('EU', 'Software', 61000.00),
('US', 'Hardware', 72000.00),
('US', 'Software', 34000.00);

SELECT
region,
product_line,
SUM(revenue) AS revenue
FROM sales
GROUP BY ROLLUP (region, product_line)
ORDER BY region NULLS LAST, product_line NULLS LAST;

Expected output:

 region | product_line | revenue
--------+--------------+----------
EU | Hardware | 45000.00
EU | Software | 61000.00
EU | | 106000.00
US | Hardware | 72000.00
US | Software | 34000.00
US | | 106000.00
| | 212000.00
(7 rows)

Explanation:

  • (region, product_line) rows are leaf groups.
  • (region, NULL) rows are subtotals per region.
  • (NULL, NULL) is the grand total.

Example 2: Label Totals Safely with GROUPING() (Intermediate)

If region or product_line could be NULL in real data, you must not treat NULL as automatically meaning “subtotal”. Use GROUPING() to detect rollup rows.

SELECT
CASE
WHEN GROUPING(region) = 1 THEN 'ALL_REGIONS'
ELSE region
END AS region,
CASE
WHEN GROUPING(product_line) = 1 AND GROUPING(region) = 0 THEN 'REGION_SUBTOTAL'
WHEN GROUPING(product_line) = 1 AND GROUPING(region) = 1 THEN 'GRAND_TOTAL'
ELSE product_line
END AS product_line,
SUM(revenue) AS revenue
FROM sales
GROUP BY ROLLUP (region, product_line)
ORDER BY
GROUPING(region), region,
GROUPING(product_line), product_line;

Expected output:

 region       | product_line     | revenue
--------------+------------------+----------
EU | Hardware | 45000.00
EU | Software | 61000.00
EU | REGION_SUBTOTAL | 106000.00
US | Hardware | 72000.00
US | Software | 34000.00
US | REGION_SUBTOTAL | 106000.00
ALL_REGIONS | GRAND_TOTAL | 212000.00
(7 rows)

Explanation:

  • GROUPING(col) = 1 means the column value is introduced by rollup (subtotal/total), not original data.

Example 3: Custom Levels with GROUPING SETS (Advanced)

Suppose you want:

  • revenue per (region, product_line)
  • revenue per region
  • revenue per product_line
  • grand total

GROUPING SETS makes this explicit.

SELECT
region,
product_line,
SUM(revenue) AS revenue
FROM sales
GROUP BY GROUPING SETS (
(region, product_line),
(region),
(product_line),
()
)
ORDER BY region NULLS LAST, product_line NULLS LAST;

Expected output (conceptual shape):

 region | product_line | revenue
--------+--------------+----------
EU | Hardware | 45000.00
EU | Software | 61000.00
EU | | 106000.00
US | Hardware | 72000.00
US | Software | 34000.00
US | | 106000.00
| Hardware | 117000.00
| Software | 95000.00
| | 212000.00
(9 rows)

Explanation:

  • (product_line) totals appear as (NULL, product_line).
  • The grand total appears as (NULL, NULL).

Example 4: CUBE Produces All Combinations (Advanced)

SELECT
region,
product_line,
SUM(revenue) AS revenue
FROM sales
GROUP BY CUBE (region, product_line)
ORDER BY region NULLS LAST, product_line NULLS LAST;

Expected outcome:

  • For two columns, CUBE produces the same combinations as Example 3.
  • For three or more columns, the number of rows grows quickly.

Practical Use Cases

1) Financial reports with subtotals

SELECT department, expense_type, SUM(amount)
FROM expenses
WHERE occurred_at >= date_trunc('month', now())
GROUP BY ROLLUP (department, expense_type);

2) Multi-level sales summaries

SELECT region, product_line, SUM(revenue)
FROM sales
GROUP BY GROUPING SETS ((region, product_line), (region), ());

3) Dashboard totals in one query

SELECT
COUNT(*) AS total_orders,
COUNT(*) FILTER (WHERE status = 'paid') AS paid_orders
FROM orders;

4) Explicit labeling for BI tools

SELECT
CASE WHEN GROUPING(region) = 1 THEN 'TOTAL' ELSE region END AS region,
SUM(revenue) AS revenue
FROM sales
GROUP BY ROLLUP (region);

5) Reusable report view

CREATE VIEW sales_rollup AS
SELECT region, product_line, SUM(revenue) AS revenue
FROM sales
GROUP BY ROLLUP (region, product_line);

Common Mistakes & Troubleshooting

1) Treating subtotal NULLs as real NULL data

Wrong approach:

  • Assuming region IS NULL always means “grand total”.

Bad outcome:

  • If your real data can contain NULL region, reports become ambiguous.

Fix:

SELECT region, GROUPING(region) AS is_total, SUM(revenue)
FROM sales
GROUP BY ROLLUP (region);

2) Forgetting to control sort order

Wrong approach:

SELECT region, product_line, SUM(revenue)
FROM sales
GROUP BY ROLLUP (region, product_line);

Bad outcome:

  • Totals can appear in unexpected places in the output.

Fix:

ORDER BY region NULLS LAST, product_line NULLS LAST;

3) Using CUBE without realizing the explosion

Wrong approach:

GROUP BY CUBE (a, b, c, d)

Bad outcome:

  • Many grouping combinations and a very large result.

Fix:

  • Prefer GROUPING SETS with explicit levels.

4) Filtering totals incorrectly

Wrong approach:

  • Filtering subtotal rows in application code only.

Fix:

  • Label rows with GROUPING() in SQL so downstream systems can handle them consistently.

Debugging checklist:

  1. Decide which subtotal levels you need.
  2. Use ROLLUP for hierarchical subtotals; use GROUPING SETS for explicit lists.
  3. Add GROUPING() columns while developing to confirm which rows are totals.
  4. Add explicit ordering to keep totals in a predictable place.
  5. Validate totals by comparing with a separate “grand total only” query.

Best Practices

  • ✅ Use GROUPING() to label subtotal/grand total rows. ❌ Avoid assuming NULL means “total”.
  • ✅ Prefer GROUPING SETS when you need custom levels. ❌ Avoid CUBE when the number of combinations is not intentional.
  • ✅ Keep filters in WHERE so totals match leaf rows. ❌ Avoid filtering some branches differently with multiple queries.
  • ✅ Add deterministic ordering for report output. ❌ Avoid letting subtotal rows appear in random positions.
  • ✅ Document the meaning of each subtotal row in the report. ❌ Avoid shipping ambiguous NULL subtotal rows to BI tools.

Hands-On Practice

Use this setup for the exercises:

CREATE TABLE practice_sales (
region text NOT NULL,
product_line text NOT NULL,
revenue numeric(12,2) NOT NULL
);

INSERT INTO practice_sales (region, product_line, revenue) VALUES
('EU', 'Hardware', 10.00),
('EU', 'Software', 20.00),
('US', 'Hardware', 30.00);

Exercise 1 (Easy): ROLLUP totals

Task: Return revenue by (region, product_line) plus subtotals and a grand total.

-- Your SQL here

Solution:

SELECT region, product_line, SUM(revenue) AS revenue
FROM practice_sales
GROUP BY ROLLUP (region, product_line)
ORDER BY region NULLS LAST, product_line NULLS LAST;

Exercise 2 (Medium): Label subtotal rows

Task: Label region subtotals and grand total using GROUPING().

-- Your SQL here

Solution:

SELECT
CASE WHEN GROUPING(region) = 1 THEN 'ALL' ELSE region END AS region,
CASE
WHEN GROUPING(product_line) = 1 AND GROUPING(region) = 0 THEN 'REGION_SUBTOTAL'
WHEN GROUPING(product_line) = 1 AND GROUPING(region) = 1 THEN 'GRAND_TOTAL'
ELSE product_line
END AS product_line,
SUM(revenue) AS revenue
FROM practice_sales
GROUP BY ROLLUP (region, product_line)
ORDER BY GROUPING(region), region, GROUPING(product_line), product_line;

Exercise 3 (Advanced): Custom levels with GROUPING SETS

Task: Produce leaf groups, region totals, product_line totals, and grand total.

-- Your SQL here

Solution:

SELECT region, product_line, SUM(revenue) AS revenue
FROM practice_sales
GROUP BY GROUPING SETS (
(region, product_line),
(region),
(product_line),
()
)
ORDER BY region NULLS LAST, product_line NULLS LAST;

Connection to Other Concepts

ConceptWhy it matters
GROUP BYROLLUP and grouping sets extend grouping
Aggregate functionstotals are computed with SUM, COUNT, etc
HAVINGcan filter groups even when using rollups
NULL Valuessubtotal rows often include NULLs
ORDER BYcontrols report row ordering
Conditional aggregates (FILTER)often used alongside grouped totals

Visual Learning Diagram

flowchart TD
A[Rows] --> B[WHERE]
B --> C[GROUP BY]
C --> D[ROLLUP]
C --> E[GROUPING SETS]
C --> F[CUBE]
D --> G[Subtotals + Grand Total]
E --> H[Explicit Levels]
F --> I[All Combinations]
G --> J[GROUPING() Labels]

classDef allNodes fill:#3e3e3e,stroke:#ffffff,stroke-width:2px,color:#f5f5f5
classDef highlight fill:#3e3e3e,stroke:#ffffff,stroke-width:4px,color:#f5f5f5
class A,B,C,D,E,F,G,H,I,J allNodes
class D highlight

Common Pitfalls

PitfallConsequencePrevention
Treating NULL subtotal rows as real NULL dataconfusing reportsuse GROUPING() to detect totals
Using CUBE with many columnshuge outputsprefer explicit GROUPING SETS
No explicit orderingtotals appear in odd placesadd a report-friendly ORDER BY
Inconsistent filteringtotals disagree with leaf rowskeep filters in WHERE
Shipping unlabeled totals to BI toolsdownstream confusionlabel totals with CASE + GROUPING()

Quick Reference

SELECT a, SUM(v) FROM t GROUP BY ROLLUP (a);
SELECT a, b, SUM(v) FROM t GROUP BY ROLLUP (a, b);
SELECT a, b, SUM(v) FROM t GROUP BY GROUPING SETS ((a,b),(a),());
SELECT GROUPING(a) AS is_total_a, SUM(v) FROM t GROUP BY ROLLUP (a);
SELECT a, b, SUM(v) FROM t GROUP BY CUBE (a, b);

What's Next