UNION ALL
Use this lesson to understand UNION ALL with practical syntax and examples.
Concept Overview
UNION ALL is a set operator that stacks rows vertically by combining the results of multiple SELECT statements.
UNION ALLkeeps duplicates (fast).UNIONremoves duplicates (slower because it must deduplicate).
This is different from joins:
- Joins combine related tables horizontally (more columns per row).
- Unions combine compatible result sets vertically (more rows).
Why is it important?
- Reporting: combine data from multiple partitions/tables (monthly tables, regional tables)
- Log analysis: append multiple sources while keeping all entries
- Performance: when you do not need deduplication,
UNION ALLavoids sorting/hashing
Where does it fit?
UNION ALL is part of SQL set operators (along with UNION, and in PostgreSQL also INTERSECT and EXCEPT). You typically use it with SELECT queries and sometimes inside views.
Syntax & Rules
Core Syntax
SELECT col1, col2
FROM t1
UNION ALL
SELECT col1, col2
FROM t2;
Available Options / Parameters
| Rule | What it means | Notes |
|---|---|---|
| Same column count | each SELECT must return same number of columns | otherwise error |
| Type compatibility | corresponding columns must be castable to a common type | explicit casts often help |
| Column names | output column names come from the first SELECT | aliases in later SELECTs do not rename outputs |
| ORDER BY placement | ORDER BY applies to the full union | only once, at the end |
| LIMIT placement | LIMIT applies to the full union | use parentheses/CTEs if you need per-branch limits |
Key Rules and Considerations
- Use
UNION ALLby default unless you explicitly need deduplication. - If you need to tag which branch a row came from, add a literal column (for example,
'2024' AS source_year). - Always define ordering at the end if you need stable output.
Step-by-Step Examples
Example 1: Combine Two Tables (Duplicates Kept) (Beginner)
CREATE TABLE customers_usa (
customer_id bigint PRIMARY KEY,
full_name text NOT NULL
);
CREATE TABLE customers_canada (
customer_id bigint PRIMARY KEY,
full_name text NOT NULL
);
INSERT INTO customers_usa (customer_id, full_name) VALUES
(1, 'Alice'),
(2, 'Bob');
INSERT INTO customers_canada (customer_id, full_name) VALUES
(3, 'Claire'),
(4, 'Bob');
SELECT full_name
FROM customers_usa
UNION ALL
SELECT full_name
FROM customers_canada
ORDER BY full_name;
Expected output:
full_name
----------
Alice
Bob
Bob
Claire
(4 rows)
Explanation:
- Bob appears twice because duplicates are not removed.
Example 2: Add a Source Column (Intermediate)
SELECT full_name, 'USA' AS country
FROM customers_usa
UNION ALL
SELECT full_name, 'Canada' AS country
FROM customers_canada
ORDER BY full_name, country;
Expected output:
full_name | country
----------+---------
Alice | USA
Bob | Canada
Bob | USA
Claire | Canada
(4 rows)
Explanation:
- Adding a literal column is the simplest way to label provenance.
Example 3: UNION vs UNION ALL (Deduplication) (Intermediate)
SELECT full_name
FROM customers_usa
UNION
SELECT full_name
FROM customers_canada
ORDER BY full_name;
Expected output:
full_name
----------
Alice
Bob
Claire
(3 rows)
Explanation:
UNIONdeduplicates, so Bob appears once.
Example 4: ORDER BY and LIMIT Apply to the Whole Union (Advanced)
SELECT full_name
FROM customers_usa
UNION ALL
SELECT full_name
FROM customers_canada
ORDER BY full_name
LIMIT 2;
Expected output:
full_name
----------
Alice
Bob
(2 rows)
Explanation:
- The
ORDER BYandLIMITapply to the combined output.
Example 5: Type Alignment with CAST (Advanced)
CREATE TABLE events_2025 (
event_id bigint PRIMARY KEY,
occurred_on date NOT NULL
);
CREATE TABLE events_2026 (
event_id bigint PRIMARY KEY,
occurred_at timestamptz NOT NULL
);
INSERT INTO events_2025 (event_id, occurred_on) VALUES
(1, '2025-12-31');
INSERT INTO events_2026 (event_id, occurred_at) VALUES
(2, '2026-01-01 00:00:00+00');
-- Align both branches to timestamptz
SELECT event_id, occurred_on::timestamptz AS occurred_at
FROM events_2025
UNION ALL
SELECT event_id, occurred_at
FROM events_2026
ORDER BY event_id;
Expected output:
event_id | occurred_at
----------+------------------------
1 | 2025-12-31 00:00:00+00
2 | 2026-01-01 00:00:00+00
(2 rows)
Explanation:
- Explicit casting makes type compatibility clear and prevents surprises.
Practical Use Cases
1) Merge partitioned tables for reporting
SELECT * FROM sales_2025
UNION ALL
SELECT * FROM sales_2026;
2) Combine logs from multiple systems
SELECT occurred_at, message, 'api' AS source
FROM api_logs
UNION ALL
SELECT occurred_at, message, 'worker' AS source
FROM worker_logs;
3) Build a multi-region customer view
SELECT customer_id, full_name, 'US' AS region FROM customers_us
UNION ALL
SELECT customer_id, full_name, 'EU' AS region FROM customers_eu;
4) Append staging data for validation
SELECT * FROM canonical_orders
UNION ALL
SELECT * FROM staging_orders;
5) Compare UNION ALL and UNION costs
-- Use UNION only when dedup is required.
SELECT email FROM mailing_list_a
UNION
SELECT email FROM mailing_list_b;
Common Mistakes & Troubleshooting
1) Mismatched column counts
Wrong SQL:
SELECT customer_id, full_name
FROM customers_usa
UNION ALL
SELECT full_name
FROM customers_canada;
Typical error:
ERROR: each UNION query must have the same number of columns
Fix:
SELECT customer_id, full_name
FROM customers_usa
UNION ALL
SELECT customer_id, full_name
FROM customers_canada;
2) Type mismatch across branches
Wrong SQL:
SELECT occurred_on
FROM events_2025
UNION ALL
SELECT occurred_at
FROM events_2026;
Bad outcome:
- PostgreSQL may error or pick an unexpected common type.
Fix:
SELECT occurred_on::timestamptz AS occurred_at FROM events_2025
UNION ALL
SELECT occurred_at FROM events_2026;
3) Putting ORDER BY in the middle
Wrong SQL:
SELECT full_name FROM customers_usa ORDER BY full_name
UNION ALL
SELECT full_name FROM customers_canada;
Typical error:
ERROR: syntax error at or near "UNION"
Fix:
SELECT full_name FROM customers_usa
UNION ALL
SELECT full_name FROM customers_canada
ORDER BY full_name;
4) Expecting UNION ALL to remove duplicates
Wrong expectation:
- assuming duplicates are removed automatically.
Fix:
- Use
UNIONwhen you need deduplication.
Debugging checklist:
- Verify every SELECT has the same number of columns.
- Check corresponding column types and add explicit casts.
- Ensure
ORDER BYappears only once at the end. - Decide explicitly between
UNION ALL(keep duplicates) andUNION(dedupe). - If performance is slow, check whether
UNIONis forcing a large dedupe operation.
Best Practices
- ✅ Use
UNION ALLwhen duplicates are acceptable. ❌ AvoidUNIONby default; dedupe has a cost. - ✅ Add provenance columns (
'source' AS origin) when merging data. ❌ Avoid ambiguous combined outputs with no source context. - ✅ Align types explicitly with casts. ❌ Avoid relying on implicit coercions across branches.
- ✅ Apply
ORDER BYandLIMITonce at the end. ❌ Avoid trying to order individual branches without parentheses/CTEs. - ✅ Consider a view for repeated union logic. ❌ Avoid copy-pasting large unions across many code paths.
Hands-On Practice
Use this setup for the exercises:
CREATE TABLE employees_2023 (employee_id integer, full_name text);
CREATE TABLE employees_2024 (employee_id integer, full_name text);
INSERT INTO employees_2023 VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO employees_2024 VALUES (3, 'Charlie'), (2, 'Bob');
Exercise 1 (Easy): Combine all names
Task: Combine employee names from both years using UNION ALL.
-- Your SQL here
Solution:
SELECT full_name FROM employees_2023
UNION ALL
SELECT full_name FROM employees_2024;
Exercise 2 (Medium): Add a source year column
Task: Return full_name and year for both tables.
-- Your SQL here
Solution:
SELECT full_name, '2023' AS year
FROM employees_2023
UNION ALL
SELECT full_name, '2024' AS year
FROM employees_2024
ORDER BY full_name, year;
Exercise 3 (Advanced): Deduplicate with UNION
Task: Return distinct names across both years.
-- Your SQL here
Solution:
SELECT full_name FROM employees_2023
UNION
SELECT full_name FROM employees_2024
ORDER BY full_name;
Connection to Other Concepts
| Concept | Why it matters |
|---|---|
UNION | same shape as UNION ALL but deduplicates |
JOIN | combines columns horizontally (different from union) |
ORDER BY | applies to the full union result set |
| Data types | unions require compatible types across branches |
| Views | often used to encapsulate repeated union logic |
Visual Learning Diagram
flowchart TD
A[SELECT 1] --> B[UNION ALL]
C[SELECT 2] --> B
B --> D[Rows Stacked]
B --> E[Duplicates Kept]
B --> F[ORDER BY at End]
B --> G[Type Alignment]
E --> H[Fast Path]
B --> I[UNION (Dedup)]
classDef allNodes fill:#3e3e3e,stroke:#ffffff,stroke-width:2px,color:#f5f5f5
classDef highlight fill:#3e3e3e,stroke:#ffffff,stroke-width:4px,color:#f5f5f5
class A,B,C,D,E,F,G,H,I allNodes
class B highlight
Common Pitfalls
| Pitfall | Consequence | Prevention |
|---|---|---|
| Different column counts | query error | ensure each branch returns same number of columns |
| Incompatible types | errors or unexpected casts | cast explicitly to a common type |
| ORDER BY not at the end | syntax error | place ORDER BY after the final SELECT |
| Using UNION when you don't need dedupe | slower queries | use UNION ALL by default |
| Losing provenance | hard to debug results | add a source column |
Quick Reference
SELECT a FROM t1 UNION ALL SELECT a FROM t2;
SELECT a FROM t1 UNION SELECT a FROM t2; -- dedupe
SELECT a, 'src1' AS src FROM t1 UNION ALL SELECT a, 'src2' FROM t2;
SELECT a::text FROM t1 UNION ALL SELECT b::text FROM t2; -- align types
SELECT * FROM (SELECT ... UNION ALL SELECT ...) u ORDER BY ... LIMIT ...;
What's Next
- Previous: SELF JOIN - Review the previous lesson to reinforce context.
- Module Overview - Return to this module index and choose another related lesson.