Recurring Maintenance Jobs
Use this lesson to design recurring PostgreSQL maintenance jobs that are safe to rerun, observable, and low-impact: keep work bounded, prefer constraints/partition drops over massive deletes, and schedule appropriately.
Concept Overview
Recurring maintenance jobs keep PostgreSQL healthy and predictable over time:
- Statistics freshness: ANALYZE keeps the planner accurate.
- Table/index health: VACUUM helps reuse space and supports index-only scans.
- Precomputed reporting: refresh materialized views.
- Retention: removal/archival of old rows (often better via partitioning).
Why is it important?
- Performance stability: fewer plan regressions as data changes.
- Predictable storage: avoid unbounded table growth.
- Reduced incident risk: proactive cleanup beats emergency cleanup.
- Operational clarity: a scheduled set of known jobs is easier to review than ad hoc scripts.
Where does it fit?
Maintenance jobs are part of day-2 operations:
- use pg_cron for SQL-only jobs inside PostgreSQL
- use external schedulers for orchestration (backups, exports, notifications)
Syntax & Rules
Core Maintenance Commands
ANALYZE public.orders;
VACUUM (ANALYZE) public.orders;
REFRESH MATERIALIZED VIEW daily_revenue;
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_revenue;
REINDEX INDEX CONCURRENTLY idx_orders_customer_id;
Scheduling via pg_cron
SELECT cron.schedule('job-name', '0 2 * * *', $$ANALYZE public.orders$$);
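Once scheduled, jobs and their runs can be inspected through pg_cron's metadata tables. A quick sketch (the job name 'job-name' matches the example above; unscheduling by name requires a reasonably recent pg_cron version):

```sql
-- List registered jobs (cron.job is pg_cron's metadata table)
SELECT jobid, jobname, schedule, command FROM cron.job;

-- Review recent runs and their status
SELECT jobid, status, return_message, start_time, end_time
FROM cron.job_run_details
ORDER BY start_time DESC
LIMIT 10;

-- Remove a job by name once it is no longer needed
SELECT cron.unschedule('job-name');
```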
Observability Pattern (Recommended)
Create a log table you can query:
CREATE TABLE maintenance_log (
log_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
job_name text NOT NULL,
ran_at timestamptz NOT NULL DEFAULT now(),
rows_affected bigint,
notes text
);
Available Options / Parameters
| Job type | Common SQL | Notes |
|---|---|---|
| Stats refresh | ANALYZE t; | cheap and high value for changing distributions |
| Vacuum | VACUUM (ANALYZE) t; | cannot run inside a transaction block |
| MV refresh | REFRESH MATERIALIZED VIEW ... | CONCURRENTLY requires a unique index |
| Retention | DELETE ... WHERE ... | batch deletes; consider partitioning |
| Index rebuild | REINDEX CONCURRENTLY ... | cannot run inside a transaction block |
Key Rules and Considerations
- Keep jobs bounded (limit work per run) to avoid long locks.
- Make jobs idempotent (safe to rerun after retries/restarts).
- Use off-peak windows for heavier tasks.
- Prefer partition dropping to massive deletes when retention is the goal.
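As a sketch of the first rule, a retention delete can be bounded by selecting victim keys first (table and column names here come from the audit_log example later in this lesson):

```sql
-- Delete at most 1000 expired rows per run; rerun until 0 rows are affected
DELETE FROM audit_log
WHERE audit_id IN (
    SELECT audit_id
    FROM audit_log
    WHERE changed_at < now() - interval '90 days'
    ORDER BY audit_id
    LIMIT 1000
);
```

Each run takes short locks on a small slice of the table, so autovacuum can keep up between runs.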
Step-by-Step Examples
Example 1: A Bounded Retention Cleanup with Logging (Intermediate)
Setup:
DROP TABLE IF EXISTS audit_log;
DROP TABLE IF EXISTS maintenance_log;
CREATE TABLE audit_log (
audit_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
changed_at timestamptz NOT NULL,
payload text NOT NULL
);
CREATE TABLE maintenance_log (
log_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
job_name text NOT NULL,
ran_at timestamptz NOT NULL DEFAULT now(),
rows_affected bigint,
notes text
);
INSERT INTO audit_log (changed_at, payload) VALUES
(now() - interval '120 days', 'old'),
(now() - interval '10 days', 'new');
Run a bounded delete (at most 1000 rows per run) and log the result:
WITH batch AS (
SELECT audit_id
FROM audit_log
WHERE changed_at < now() - interval '90 days'
ORDER BY audit_id
LIMIT 1000
),
deleted AS (
DELETE FROM audit_log
WHERE audit_id IN (SELECT audit_id FROM batch)
RETURNING 1
)
INSERT INTO maintenance_log (job_name, rows_affected, notes)
SELECT 'cleanup-audit-log', COUNT(*), 'bounded delete'
FROM deleted;
SELECT job_name, rows_affected
FROM maintenance_log
ORDER BY log_id;
Expected output (example):
DROP TABLE
DROP TABLE
CREATE TABLE
CREATE TABLE
INSERT 0 2
INSERT 0 1
job_name | rows_affected
-----------------+--------------
cleanup-audit-log| 1
(1 row)
Explanation:
- DELETE ... RETURNING lets you count what you removed.
- Logging makes the job observable without external tooling.
Example 2: Idempotent Daily Rollup with ON CONFLICT (Advanced)
Setup:
DROP TABLE IF EXISTS orders;
DROP TABLE IF EXISTS daily_revenue_summary;
CREATE TABLE orders (
order_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
status text NOT NULL,
created_at timestamptz NOT NULL,
amount numeric(12,2) NOT NULL CHECK (amount >= 0)
);
CREATE TABLE daily_revenue_summary (
day date PRIMARY KEY,
paid_orders bigint NOT NULL,
paid_revenue numeric(12,2) NOT NULL,
computed_at timestamptz NOT NULL DEFAULT now()
);
INSERT INTO orders (status, created_at, amount) VALUES
('paid', TIMESTAMPTZ '2026-03-04 10:00+00', 10.00),
('paid', TIMESTAMPTZ '2026-03-04 11:00+00', 25.00),
('failed', TIMESTAMPTZ '2026-03-04 12:00+00', 5.00);
Compute “yesterday” deterministically (here we use a fixed date for the example):
INSERT INTO daily_revenue_summary (day, paid_orders, paid_revenue)
SELECT
DATE '2026-03-04' AS day,
COUNT(*) FILTER (WHERE status = 'paid') AS paid_orders,
COALESCE(SUM(amount) FILTER (WHERE status = 'paid'), 0) AS paid_revenue
FROM orders
WHERE created_at >= TIMESTAMPTZ '2026-03-04 00:00+00'
AND created_at < TIMESTAMPTZ '2026-03-05 00:00+00'
ON CONFLICT (day)
DO UPDATE SET
paid_orders = EXCLUDED.paid_orders,
paid_revenue = EXCLUDED.paid_revenue,
computed_at = now();
SELECT day, paid_orders, paid_revenue
FROM daily_revenue_summary;
Expected output:
INSERT 0 1
day | paid_orders | paid_revenue
------------+------------+-------------
2026-03-04 | 2 | 35.00
(1 row)
Explanation:
- This is idempotent: rerunning produces the same row values, not duplicates.
- In a scheduled job, replace the fixed dates with current_date - 1 and a UTC-safe range.
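A sketch of that scheduled variant, assuming the server timezone is set to UTC (otherwise convert the boundaries explicitly):

```sql
-- Roll up "yesterday"; assumes timezone = 'UTC' so date casts align with UTC days
INSERT INTO daily_revenue_summary (day, paid_orders, paid_revenue)
SELECT
    current_date - 1,
    COUNT(*) FILTER (WHERE status = 'paid'),
    COALESCE(SUM(amount) FILTER (WHERE status = 'paid'), 0)
FROM orders
WHERE created_at >= (current_date - 1)::timestamptz
  AND created_at <  current_date::timestamptz
ON CONFLICT (day) DO UPDATE SET
    paid_orders  = EXCLUDED.paid_orders,
    paid_revenue = EXCLUDED.paid_revenue,
    computed_at  = now();
```

Because the aggregate query always returns exactly one row, the upsert writes a zero-valued row even on days with no orders, which keeps reruns deterministic.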
Example 3: Refresh a Materialized View (and Common CONCURRENTLY Requirement) (Intermediate)
Setup:
DROP MATERIALIZED VIEW IF EXISTS mv_paid_orders;
CREATE MATERIALIZED VIEW mv_paid_orders AS
SELECT order_id, created_at, amount
FROM orders
WHERE status = 'paid';
Refresh:
REFRESH MATERIALIZED VIEW mv_paid_orders;
Expected output:
DROP MATERIALIZED VIEW
SELECT 2
REFRESH MATERIALIZED VIEW
Explanation:
REFRESH MATERIALIZED VIEWrecomputes the view.- If you need
CONCURRENTLY, you must create a unique index on the materialized view first.
Practical Use Cases
1) Daily ANALYZE for High-Churn Tables
SELECT cron.schedule('nightly-analyze-orders', '0 1 * * *', $$ANALYZE public.orders$$);
2) Weekly Vacuum for Heavily Updated Tables
SELECT cron.schedule('weekly-vacuum-orders', '0 2 * * 0', $$VACUUM (ANALYZE) public.orders$$);
3) Materialized View Refresh for Dashboards
SELECT cron.schedule('refresh-paid-orders-mv', '*/15 * * * *', $$REFRESH MATERIALIZED VIEW mv_paid_orders$$);
4) Retention Cleanup for Audit Tables
SELECT cron.schedule('cleanup-audit-log', '0 4 * * *', $$DELETE FROM audit_log WHERE changed_at < now() - interval '90 days'$$);
For large tables, bound this delete per run (see the batching pattern in Example 1); an unbounded scheduled delete repeats Mistake 2 below.
5) Partition Drop Instead of Delete (At Scale)
-- If audit_log is partitioned by day/month, dropping old partitions is fast and avoids bloat.
-- ALTER TABLE audit_log DETACH PARTITION audit_log_2025_12;
-- DROP TABLE audit_log_2025_12;
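A minimal sketch of what that partitioned layout could look like (the names here are hypothetical; the audit_log table used elsewhere in this lesson is not partitioned):

```sql
-- Hypothetical range-partitioned audit table, partitioned by month
CREATE TABLE audit_log_p (
    audit_id   bigint GENERATED ALWAYS AS IDENTITY,
    changed_at timestamptz NOT NULL,
    payload    text NOT NULL
) PARTITION BY RANGE (changed_at);

CREATE TABLE audit_log_p_2025_12
    PARTITION OF audit_log_p
    FOR VALUES FROM ('2025-12-01') TO ('2026-01-01');

-- Retention becomes a metadata operation: detach, then drop the expired partition
ALTER TABLE audit_log_p DETACH PARTITION audit_log_p_2025_12;
DROP TABLE audit_log_p_2025_12;
```

Dropping a partition is near-instant regardless of row count and leaves no dead tuples behind.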
Common Mistakes & Troubleshooting
Mistake 1: Running VACUUM in a Transaction Block
Wrong:
BEGIN;
VACUUM (ANALYZE) public.orders;
COMMIT;
Typical error:
ERROR: VACUUM cannot run inside a transaction block
Fix: run VACUUM as a standalone statement (which pg_cron naturally does for a single command).
Mistake 2: Massive Deletes in One Run
Wrong:
DELETE FROM audit_log WHERE changed_at < now() - interval '90 days';
What happens: long locks, bloat, slow autovacuum.
Fix options:
-- Option A: batch deletes (bounded per run)
WITH batch AS (
SELECT audit_id
FROM audit_log
WHERE changed_at < now() - interval '90 days'
ORDER BY audit_id
LIMIT 1000
),
deleted AS (
DELETE FROM audit_log
WHERE audit_id IN (SELECT audit_id FROM batch)
RETURNING 1
)
SELECT COUNT(*) FROM deleted;
-- Option B: partition and drop partitions
Mistake 3: REFRESH MATERIALIZED VIEW CONCURRENTLY Without a Unique Index
Wrong:
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_paid_orders;
Typical error:
ERROR: cannot refresh materialized view "mv_paid_orders" concurrently
HINT: Create a unique index with no WHERE clause on one or more columns of the materialized view.
Fix:
CREATE UNIQUE INDEX mv_paid_orders_order_id_uniq
ON mv_paid_orders (order_id);
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_paid_orders;
Debugging tips:
- Run the maintenance SQL manually first; then schedule it.
- If a job is slow, measure with EXPLAIN (ANALYZE, BUFFERS) (where applicable) on staging.
- Check indexes needed for validation/refresh patterns (unique index for concurrent refresh).
- Keep work bounded and schedule off-peak.
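For example, on a staging copy you might profile the retention delete before scheduling it. EXPLAIN ANALYZE actually executes the statement, so wrap it in a transaction and roll back:

```sql
-- Profile the delete on staging, then undo it so the data is untouched
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
DELETE FROM audit_log
WHERE changed_at < now() - interval '90 days';
ROLLBACK;
```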
Best Practices
- ✅ Log what each job did (rows affected, runtime); ❌ run silent jobs you can’t verify.
- ✅ Keep work bounded per run; ❌ delete/refresh everything in one giant job.
- ✅ Prefer partition-drop for retention at scale; ❌ rely on huge deletes.
- ✅ Use idempotent rollups (ON CONFLICT DO UPDATE); ❌ insert-only rollups.
- ✅ Schedule heavy work off-peak; ❌ run maintenance during peak traffic.
Hands-On Practice
Exercise 1 (Easy): Write an idempotent rollup
Task: Write an upsert into daily_revenue_summary for one day.
-- Your SQL here
Solution:
INSERT INTO daily_revenue_summary (day, paid_orders, paid_revenue)
VALUES (current_date - 1, 0, 0)
ON CONFLICT (day)
DO UPDATE SET
paid_orders = EXCLUDED.paid_orders,
paid_revenue = EXCLUDED.paid_revenue,
computed_at = now();
Exercise 2 (Medium): Log a bounded delete
Task: Delete old rows (conceptually bounded) and log how many were removed.
-- Your SQL here
Solution:
WITH deleted AS (
DELETE FROM audit_log
WHERE changed_at < now() - interval '90 days'
RETURNING 1
)
INSERT INTO maintenance_log (job_name, rows_affected)
SELECT 'practice-retention', COUNT(*)
FROM deleted;
Exercise 3 (Advanced): Prepare a concurrent MV refresh
Task: Create the unique index required for REFRESH MATERIALIZED VIEW CONCURRENTLY.
-- Your SQL here
Solution:
CREATE UNIQUE INDEX mv_paid_orders_order_id_uniq
ON mv_paid_orders (order_id);
Connection to Other Concepts
| Concept | Why it matters |
|---|---|
| pg_cron basics | this lesson applies the scheduling mechanics to real maintenance |
| Performance optimization | stats/vacuum/index maintenance directly affects performance |
| Transactions and concurrency | maintenance jobs take locks; bounded work reduces contention |
| Views/materialized views | refresh strategy impacts dashboard freshness |
| Backup and recovery | retention and maintenance should align with recovery goals |
Visual Learning Diagram
flowchart TD
A[Schedule job] --> B[Run SQL]
B --> C["Bound work\n(batches)"]
B --> D["Idempotent writes\n(ON CONFLICT)"]
B --> E["Maintenance ops\nANALYZE/VACUUM/REFRESH"]
C --> F[Log outcome]
D --> F
E --> F
F --> G[Monitor + alert]
classDef allNodes fill:#3e3e3e,stroke:#ffffff,stroke-width:2px,color:#f5f5f5
classDef highlight fill:#3e3e3e,stroke:#ffffff,stroke-width:4px,color:#f5f5f5
class A,B,C,D,E,F,G allNodes
class F highlight
Common Pitfalls
| Pitfall | Consequence | Prevention |
|---|---|---|
| Unbounded retention deletes | bloat and long locks | batch deletes or partition drop |
| No unique index for concurrent MV refresh | refresh fails | create a unique index |
| Heavy maintenance at peak | user latency spikes | schedule off-peak |
| No logging | you can’t prove success | log + inspect run details |
| Treating calendar months as 30 days | wrong retention windows | choose intervals carefully |
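The last pitfall is easy to demonstrate: subtracting '1 month' follows the calendar, while '30 days' is a fixed span, so the two land on different days:

```sql
SELECT DATE '2026-03-31' - interval '1 month' AS minus_one_month,   -- 2026-02-28 00:00:00
       DATE '2026-03-31' - interval '30 days' AS minus_thirty_days; -- 2026-03-01 00:00:00
```

Pick the semantics your retention policy actually means and state it in the job's notes.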
Quick Reference
ANALYZE public.t;
VACUUM (ANALYZE) public.t;
REFRESH MATERIALIZED VIEW mv;
CREATE UNIQUE INDEX ON mv (col);
DELETE FROM t WHERE ts < now() - interval '90 days';
What's Next
- Previous: pg_cron Basics - Scheduling mechanics and job/run inspection.
- Next: Transactions and Concurrency - Understand locks, isolation, and how jobs affect concurrency.
- Module Overview - Return to Scheduling and Automation overview.