Primary and Foreign Keys
Use this lesson to design primary keys and foreign keys in PostgreSQL, choose safe referential actions, and avoid performance traps (especially missing FK indexes).
Concept Overview
Keys define identity and relationships:
- A primary key (PK) uniquely identifies each row in a table.
- A foreign key (FK) enforces that a column (or set of columns) references an existing row in another table.
In PostgreSQL, a primary key is implemented as a UNIQUE + NOT NULL constraint (and it creates a unique B-tree index).
Important PostgreSQL behavior: creating a foreign key does not automatically create an index on the referencing column(s). If you join on the FK, or frequently delete/update parent rows, you typically want an index on the FK column.
Why is it important?
- Correctness: stop orphaned rows and duplicated identities
- Data modeling: represent one-to-many and many-to-many relationships
- Performance: good keys + supporting indexes make joins and deletes predictable
- Operational safety: enforce rules in the database, not only in app code
Where does it fit?
Keys are foundational schema tools used in:
CREATE TABLEdesigns (define PK/FKs up front)- migrations (
ALTER TABLE ... ADD CONSTRAINT) - indexing strategy (PK index + often FK indexes)
- data cleanup and audits (find duplicates/orphans)
Syntax & Rules
Core Syntax
Primary key (inline):
CREATE TABLE customers (
customer_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
email text NOT NULL UNIQUE
);
Primary key (table constraint):
CREATE TABLE customers (
customer_id bigint GENERATED ALWAYS AS IDENTITY,
email text NOT NULL,
CONSTRAINT customers_pkey PRIMARY KEY (customer_id)
);
Foreign key (table constraint with a name):
CREATE TABLE orders (
order_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
customer_id bigint NOT NULL,
amount numeric(12,2) NOT NULL,
CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id)
REFERENCES customers(customer_id)
);
Available Options / Parameters
| Feature | Example | Meaning / Notes |
|---|---|---|
| PK columns | PRIMARY KEY (a, b) | composite key across multiple columns |
| FK target | REFERENCES customers(customer_id) | referenced table + columns must be PK/unique |
| Named constraint | CONSTRAINT fk_orders_customer ... | makes errors and migrations easier to manage |
ON DELETE | ON DELETE RESTRICT / CASCADE / SET NULL | action when parent row is deleted |
ON UPDATE | ON UPDATE CASCADE | action when referenced key changes (rare if PK stable) |
| Deferrable | DEFERRABLE INITIALLY DEFERRED | check at commit time (bulk loads/complex transactions) |
| Validation control | NOT VALID + VALIDATE CONSTRAINT | add constraints without immediate full validation |
Key Rules and Considerations
- FK columns and referenced columns must have matching types.
- The referenced columns must have a
PRIMARY KEYorUNIQUEconstraint. - PostgreSQL creates indexes automatically for
PRIMARY KEYandUNIQUEconstraints. - PostgreSQL does not automatically index FK columns; add indexes intentionally.
- Prefer stable PKs (surrogate identity or immutable natural key) to avoid cascading updates.
Step-by-Step Examples
Use this setup for the examples:
DROP TABLE IF EXISTS order_items;
DROP TABLE IF EXISTS orders;
DROP TABLE IF EXISTS customers;
CREATE TABLE customers (
customer_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
email text NOT NULL UNIQUE
);
CREATE TABLE orders (
order_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
customer_id bigint NOT NULL,
status text NOT NULL,
amount numeric(12,2) NOT NULL CHECK (amount >= 0),
CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id)
REFERENCES customers(customer_id)
ON DELETE RESTRICT
);
INSERT INTO customers (email) VALUES
('a@example.com'),
('b@example.com');
INSERT INTO orders (customer_id, status, amount) VALUES
(1, 'paid', 10.00),
(1, 'paid', 25.00),
(2, 'failed', 5.00);
Example 1: Primary Key Uniqueness (Beginner)
SELECT customer_id, email
FROM customers
ORDER BY customer_id;
Expected output:
customer_id | email
-------------+---------------
1 | a@example.com
2 | b@example.com
(2 rows)
Explanation:
- Each row gets a unique
customer_idvia the identity column.
Example 2: Foreign Key Prevents Orphans (Beginner)
Try inserting an order for a non-existent customer:
INSERT INTO orders (customer_id, status, amount)
VALUES (999, 'paid', 20.00);
Expected outcome:
ERROR: insert or update on table "orders" violates foreign key constraint "fk_orders_customer"
DETAIL: Key (customer_id)=(999) is not present in table "customers".
Fix: insert the parent row first (or use a valid id).
Example 3: FK Indexing Matters (Intermediate)
Querying by FK is common:
EXPLAIN
SELECT order_id, amount
FROM orders
WHERE customer_id = 1;
Expected output (example):
Seq Scan on orders (cost=0.00..1.05 rows=1 width=16)
Filter: (customer_id = 1)
Add an index on the FK column:
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
EXPLAIN
SELECT order_id, amount
FROM orders
WHERE customer_id = 1;
Expected output (example):
Index Scan using idx_orders_customer_id on orders (cost=0.13..8.15 rows=1 width=16)
Index Cond: (customer_id = 1)
Explanation:
- Without the index, PostgreSQL may scan the whole table.
- With the index, it can use an index scan for selective lookups.
Example 4: Referential Actions and Safety (Advanced)
With ON DELETE RESTRICT, deleting a parent with child rows fails:
DELETE FROM customers WHERE customer_id = 1;
Expected outcome:
ERROR: update or delete on table "customers" violates foreign key constraint "fk_orders_customer" on table "orders"
DETAIL: Key (customer_id)=(1) is still referenced from table "orders".
If your business rule is “delete customer deletes orders”, you can choose CASCADE:
ALTER TABLE orders
DROP CONSTRAINT fk_orders_customer;
ALTER TABLE orders
ADD CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id)
REFERENCES customers(customer_id)
ON DELETE CASCADE;
Expected output:
ALTER TABLE
ALTER TABLE
Explanation:
CASCADEis powerful and risky. Use it only when you truly want cascading deletes.
Practical Use Cases
1) Many-to-Many With a Join Table
Context: users can have many roles; roles can belong to many users.
CREATE TABLE roles (
role_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name text NOT NULL UNIQUE
);
CREATE TABLE user_roles (
user_id bigint NOT NULL REFERENCES users(user_id),
role_id bigint NOT NULL REFERENCES roles(role_id),
PRIMARY KEY (user_id, role_id)
);
2) Prevent Deletes That Would Orphan Critical Records
Context: keep invoices immutable; don't allow deleting customers with invoices.
FOREIGN KEY (customer_id)
REFERENCES customers(customer_id)
ON DELETE RESTRICT
3) Optional Relationships With SET NULL
Context: a ticket may optionally reference an assignee.
CREATE TABLE tickets (
ticket_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
assignee_id bigint REFERENCES users(user_id) ON DELETE SET NULL,
title text NOT NULL
);
4) Add Constraints Safely on Large Existing Tables
Context: introduce a foreign key without blocking for a full scan immediately.
ALTER TABLE orders
ADD CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id)
REFERENCES customers(customer_id)
NOT VALID;
ALTER TABLE orders
VALIDATE CONSTRAINT fk_orders_customer;
Common Mistakes & Troubleshooting
Mistake 1: Forgetting to Index the FK Column
Wrong assumption: the FK constraint creates an index on orders.customer_id.
What happens: slow joins and slow deletes/updates on parent rows.
Fix:
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
Mistake 2: Type Mismatch Between PK and FK
Wrong:
-- customers.customer_id is bigint
-- orders.customer_id is integer
What happens: the FK may fail to create or force casts that hurt performance.
Fix: match types exactly.
Mistake 3: Using CASCADE Casually
What happens: deletes can wipe large parts of your data graph.
Fix: default to RESTRICT/NO ACTION and document any cascades.
Debugging tips:
- When an FK insert fails, read the
DETAILline to find the missing parent key. - For delete failures, identify child rows:
SELECT * FROM orders WHERE customer_id = 1;. - Use
\d+ table_nameinpsqlto inspect constraints. - For performance issues,
EXPLAINthe join/filter and add indexes to match predicates.
Best Practices
- ✅ Use stable primary keys; ❌ choose a PK that changes frequently.
- ✅ Name constraints (
fk_orders_customer); ❌ rely on auto-generated names. - ✅ Index foreign key columns used in joins/filters; ❌ assume PostgreSQL creates FK indexes automatically.
- ✅ Prefer
RESTRICT/NO ACTIONunless you truly need cascade; ❌ useCASCADEas a shortcut. - ✅ Validate large-table constraints safely with
NOT VALID+VALIDATE; ❌ add heavyweight constraints without a rollout plan.
Hands-On Practice
Use this setup for the exercises:
DROP TABLE IF EXISTS practice_orders;
DROP TABLE IF EXISTS practice_customers;
CREATE TABLE practice_customers (
customer_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
email text NOT NULL UNIQUE
);
INSERT INTO practice_customers (email) VALUES ('p1@example.com'), ('p2@example.com');
Exercise 1 (Easy): Create a table with a primary key
Task: Create practice_orders with order_id as an identity primary key.
-- Your SQL here
Solution:
CREATE TABLE practice_orders (
order_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY
);
Exercise 2 (Medium): Add a foreign key
Task: Add customer_id bigint NOT NULL to practice_orders and create a foreign key to practice_customers(customer_id).
-- Your SQL here
Solution:
ALTER TABLE practice_orders
ADD COLUMN customer_id bigint NOT NULL;
ALTER TABLE practice_orders
ADD CONSTRAINT fk_practice_orders_customer
FOREIGN KEY (customer_id)
REFERENCES practice_customers(customer_id);
Exercise 3 (Advanced): Add an index for the FK
Task: Create an index that speeds up joins and deletes for the FK relationship.
-- Your SQL here
Solution:
CREATE INDEX idx_practice_orders_customer_id
ON practice_orders (customer_id);
Connection to Other Concepts
| Concept | Why it matters |
|---|---|
CREATE TABLE | keys are usually defined at table creation |
ALTER TABLE | add/validate constraints during migrations |
| Index strategy | FK indexes are a common performance requirement |
| Transactions | deferrable constraints can change when checks happen |
| Joins | PK/FK relationships drive common join patterns |
Visual Learning Diagram
flowchart LR
A[customers\nPK: customer_id] -->|FK orders.customer_id| B[orders\nFK: customer_id]
B --> C[Index on orders.customer_id\n(recommended)]
A --> D[ON DELETE action\nRESTRICT/CASCADE/SET NULL]
classDef allNodes fill:#3e3e3e,stroke:#ffffff,stroke-width:2px,color:#f5f5f5
classDef highlight fill:#3e3e3e,stroke:#ffffff,stroke-width:4px,color:#f5f5f5
class A,B,C,D allNodes
class B highlight
Common Pitfalls
| Pitfall | Consequence | Prevention |
|---|---|---|
| No index on FK columns | slow joins, slow parent deletes/updates | create INDEX ON child(fk) |
| Cascade deletes everywhere | accidental mass deletions | default to restrict/no action |
| Mismatched types | failed constraints, poor plans | match PK/FK types exactly |
| Unstable PK values | cascading updates, complexity | keep PK immutable |
| Anonymous constraint names | hard debugging | name constraints consistently |
Quick Reference
PRIMARY KEY (id)
FOREIGN KEY (parent_id) REFERENCES parent(id)
CREATE INDEX ON child(parent_id)
ALTER TABLE t ADD CONSTRAINT fk_name FOREIGN KEY (c) REFERENCES p(id)
ALTER TABLE t VALIDATE CONSTRAINT fk_name
What's Next
- Next: Column and Table Constraints - Add
NOT NULL,UNIQUE,CHECK, and validation patterns. - Module Overview - Return to the Constraints and Indexes index.