SELECT Statement
Use this lesson to understand SELECT Statement with practical syntax and examples.
Concept Overview
What is the SELECT statement?
SELECT is the fundamental read operation in PostgreSQL. It retrieves rows from one or more tables (or other query sources) and returns them as a result set.
In practice, SELECT is how you power application screens, build reports, validate migrations, and debug production issues by inspecting data.
Why is it important?
- Correctness: the right filters, joins, and projections prevent incorrect reports and wrong application behavior
- Performance: selecting only needed columns and using index-friendly predicates reduces I/O and improves latency
- Maintainability: readable column lists and aliases make queries safer to evolve as schemas change
Where does it fit?
SELECT is part of DQL (Data Query Language). It is read-only (it does not modify rows) and serves as the base for more advanced concepts:
- filtering (
WHERE), sorting (ORDER BY), pagination (LIMIT/OFFSET) - grouping and aggregation (
GROUP BY,HAVING) - joins, views, subqueries, and window functions
Syntax & Rules
Core Syntax Structure
SELECT [DISTINCT]
select_list
FROM from_item
[WHERE condition]
[GROUP BY group_list]
[HAVING condition]
[ORDER BY sort_list]
[LIMIT count [OFFSET offset]];
Available Clauses / Parameters
| Clause | Purpose | Typical use |
|---|---|---|
SELECT | choose columns and expressions | projection, computed columns |
FROM | choose source table(s) | tables, views, subqueries |
WHERE | filter rows | only active rows, time windows |
ORDER BY | sort results | stable pagination, reports |
LIMIT / OFFSET | return a subset | list pages and previews |
DISTINCT | remove duplicates | unique categories, IDs |
Key Rules and Considerations
- Row order is undefined unless you specify
ORDER BY. - Prefer explicit column lists for stability and performance; reserve
SELECT *for exploration. - PostgreSQL folds unquoted identifiers to lowercase. Quoted identifiers become case-sensitive.
WHEREcannot reference a SELECT alias (the alias is computed later). Use a subquery or CTE.- For large tables, design indexes that match your filter and sort patterns.
Step-by-Step Examples
Example 1: Basic SELECT (Beginner)
Create a table, insert sample rows, and query the result.
-- Create a sample table
CREATE TABLE employees (
employee_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
full_name text NOT NULL,
department text NOT NULL,
salary numeric(12,2) NOT NULL CHECK (salary >= 0),
hire_date date NOT NULL,
active boolean NOT NULL DEFAULT true
);
-- Insert sample data
INSERT INTO employees (full_name, department, salary, hire_date, active) VALUES
('John Smith', 'IT', 75000.00, '2020-01-15', true),
('Sarah Johnson','Marketing', 65000.00, '2019-03-22', true),
('Mike Davis', 'IT', 80000.00, '2018-07-10', true),
('Emily Brown', 'HR', 60000.00, '2021-05-18', false),
('David Wilson', 'Finance', 70000.00, '2020-11-30', true);
-- Query: select all columns
SELECT *
FROM employees
ORDER BY employee_id;
Expected output:
+-------------+--------------+------------+----------+------------+--------+
| employee_id | full_name | department | salary | hire_date | active |
+-------------+--------------+------------+----------+------------+--------+
| 1 | John Smith | IT | 75000.00 | 2020-01-15 | t |
| 2 | Sarah Johnson | Marketing | 65000.00 | 2019-03-22 | t |
| 3 | Mike Davis | IT | 80000.00 | 2018-07-10 | t |
| 4 | Emily Brown | HR | 60000.00 | 2021-05-18 | f |
| 5 | David Wilson | Finance | 70000.00 | 2020-11-30 | t |
+-------------+--------------+------------+----------+------------+--------+
Explanation:
SELECT *returns every column.ORDER BY employee_idproduces deterministic output.
Example 2: Selecting Specific Columns (Beginner)
Select only the columns required by the caller.
-- Query: select only the columns you need
SELECT employee_id, full_name, department
FROM employees
ORDER BY employee_id;
Expected output:
+-------------+--------------+------------+
| employee_id | full_name | department |
+-------------+--------------+------------+
| 1 | John Smith | IT |
| 2 | Sarah Johnson | Marketing |
| 3 | Mike Davis | IT |
| 4 | Emily Brown | HR |
| 5 | David Wilson | Finance |
+-------------+--------------+------------+
Explanation:
- Narrow projection reduces memory, I/O, and network transfer.
- Explicit columns are resilient against schema changes.
Example 3: Aliases and Computed Columns (Intermediate)
Rename output columns and compute derived values.
-- Query: readable output with derived values
SELECT
full_name AS employee_name,
department,
salary AS annual_salary,
round(salary / 12, 2) AS monthly_salary,
round(salary * 1.10, 2) AS salary_with_10_percent_raise
FROM employees
ORDER BY annual_salary DESC;
Expected output:
+---------------+------------+--------------+----------------+----------------------------+
| employee_name | department | annual_salary | monthly_salary | salary_with_10_percent_raise |
+---------------+------------+--------------+----------------+----------------------------+
| Mike Davis | IT | 80000.00 | 6666.67 | 88000.00 |
| John Smith | IT | 75000.00 | 6250.00 | 82500.00 |
| David Wilson | Finance | 70000.00 | 5833.33 | 77000.00 |
| Sarah Johnson | Marketing | 65000.00 | 5416.67 | 71500.00 |
| Emily Brown | HR | 60000.00 | 5000.00 | 66000.00 |
+---------------+------------+--------------+----------------+----------------------------+
Explanation:
AS employee_nameimproves output readability.round(...)produces stable, report-friendly decimals.
Example 4: Filtering, Sorting, and Limiting (Intermediate)
Filter to a subset, sort deterministically, and return only the top rows.
-- Query: active employees in IT or Finance, highest salary first
SELECT employee_id, full_name, department, salary
FROM employees
WHERE active = true
AND department IN ('IT', 'Finance')
ORDER BY salary DESC, employee_id ASC
LIMIT 3;
Expected output:
+-------------+-------------+------------+----------+
| employee_id | full_name | department | salary |
+-------------+-------------+------------+----------+
| 3 | Mike Davis | IT | 80000.00 |
| 1 | John Smith | IT | 75000.00 |
| 5 | David Wilson | Finance | 70000.00 |
+-------------+-------------+------------+----------+
Explanation:
WHEREreduces rows before sorting.- Adding
employee_idas a tie-breaker stabilizes ordering. LIMIT 3prevents returning unnecessary rows.
Example 5: Production-Ready SELECT Pattern (Advanced)
Design a query shape that works well in production: stable ordering, index-friendly filters, and realistic constraints.
-- Production-style table
CREATE TABLE orders (
order_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
customer_id bigint NOT NULL,
status text NOT NULL CHECK (status IN ('pending', 'paid', 'cancelled')),
amount numeric(12,2) NOT NULL CHECK (amount >= 0),
created_at timestamptz NOT NULL DEFAULT now()
);
-- Index to support dashboard pattern: newest orders per customer
CREATE INDEX idx_orders_customer_created_at
ON orders (customer_id, created_at DESC);
-- Seed sample rows (for demonstration)
INSERT INTO orders (customer_id, status, amount, created_at) VALUES
(42, 'paid', 199.99, '2026-03-05 10:15:00+00'),
(42, 'paid', 49.00, '2026-03-04 18:20:00+00'),
(42, 'paid', 15.50, '2026-03-02 09:01:00+00'),
(42, 'paid', 120.00, '2026-02-28 12:33:00+00'),
(42, 'paid', 9.99, '2026-02-27 02:10:00+00');
-- Query: newest paid orders for a customer
SELECT order_id, amount, created_at
FROM orders
WHERE customer_id = 42
AND status = 'paid'
ORDER BY created_at DESC
LIMIT 5;
Expected output:
+----------+--------+---------------------------+
| order_id | amount | created_at |
+----------+--------+---------------------------+
| 1 | 199.99 | 2026-03-05 10:15:00+00 |
| 2 | 49.00 | 2026-03-04 18:20:00+00 |
| 3 | 15.50 | 2026-03-02 09:01:00+00 |
| 4 | 120.00 | 2026-02-28 12:33:00+00 |
| 5 | 9.99 | 2026-02-27 02:10:00+00 |
+----------+--------+---------------------------+
Explanation:
- The index matches the filter and sort pattern.
- The projection is narrow and stable.
- CHECK constraints protect data quality.
Practical Use Cases
Use Case 1: Employee Directory (HR)
Business problem: Render a directory page with stable ordering and minimal columns.
SELECT employee_id, full_name, department
FROM employees
WHERE active = true
ORDER BY department, full_name;
Use Case 2: Debug One Record (Support)
Business problem: Quickly inspect a specific order for a customer.
SELECT order_id, customer_id, status, amount, created_at
FROM orders
WHERE order_id = 12345;
Use Case 3: Daily KPI Extract (Analytics)
Business problem: Produce daily order counts for the last 30 days.
SELECT date_trunc('day', created_at) AS day_key, COUNT(*) AS orders
FROM orders
WHERE created_at >= now() - interval '30 days'
GROUP BY 1
ORDER BY day_key;
Use Case 4: Data Quality Check (Pre-Constraint)
Business problem: Find rows missing a required value before enforcing NOT NULL.
SELECT customer_id
FROM customers
WHERE email IS NULL
LIMIT 20;
Use Case 5: Safe Preview Before a Write
Business problem: Preview the exact rows you intend to update.
SELECT order_id, status
FROM orders
WHERE status = 'pending'
ORDER BY created_at ASC
LIMIT 10;
Common Mistakes & Troubleshooting
Mistake 1: Missing FROM
Wrong SQL:
SELECT full_name, department;
Error message: ERROR: SELECT requires a FROM clause (or a syntax error depending on context)
Fix:
SELECT full_name, department
FROM employees;
Mistake 2: Relation does not exist
Wrong SQL:
SELECT * FROM employee;
Error message: ERROR: relation "employee" does not exist
Fix:
SELECT * FROM employees;
Mistake 3: SELECT alias referenced in WHERE
Wrong SQL:
SELECT salary / 12 AS monthly_salary
FROM employees
WHERE monthly_salary > 6000;
Bad outcome: ERROR: column "monthly_salary" does not exist
Fix (subquery):
SELECT *
FROM (
SELECT employee_id, full_name, salary / 12 AS monthly_salary
FROM employees
) s
WHERE s.monthly_salary > 6000;
Mistake 4: Assuming results are ordered
Wrong assumption:
SELECT employee_id, full_name
FROM employees;
Bad outcome: unstable pagination and inconsistent UI results.
Fix:
SELECT employee_id, full_name
FROM employees
ORDER BY employee_id;
Debugging Tips
- Start with
SELECT ... LIMIT 10to validate row shape and data. - Verify tables with
\dtand columns with\d table_nameinpsql. - Add one clause at a time:
WHERE->ORDER BY->LIMIT. - If performance is slow, inspect the plan with
EXPLAIN(module 18).
Best Practices
✅ Always select only the columns you need; avoid SELECT * in production.
✅ Always add ORDER BY when order matters (especially for pagination).
✅ Always include a stable tie-breaker in ORDER BY (primary key).
✅ Always keep filters index-friendly; avoid wrapping columns in functions unless you have expression indexes.
✅ Always use meaningful aliases for computed columns and report outputs.
❌ Avoid quoted identifiers unless required; they create case-sensitive object names.
❌ Avoid large OFFSET pagination on big tables; prefer keyset pagination.
❌ Avoid mixing AND/OR without parentheses; make boolean intent explicit.
Hands-On Practice
Exercise 1: Basic Selection (Easy)
Task: Select full_name and department for all employees.
-- Write your query here
Solution:
SELECT full_name, department
FROM employees
ORDER BY full_name;
Exercise 2: Computed Columns (Medium)
Task: Return full_name, salary, and round(salary / 12, 2) AS monthly_salary.
-- Write your query here
Solution:
SELECT
full_name,
salary,
round(salary / 12, 2) AS monthly_salary
FROM employees
ORDER BY monthly_salary DESC;
Exercise 3: Stable Ordering (Advanced)
Task: Return the first 3 active employees ordered by newest hire date, with a stable tie-breaker.
-- Write your query here
Solution:
SELECT employee_id, full_name, hire_date
FROM employees
WHERE active = true
ORDER BY hire_date DESC, employee_id DESC
LIMIT 3;
Connection to Other Concepts
| Related concept | Why it matters |
|---|---|
| PostgreSQL Data Types | Correct types affect comparisons, ordering, and index usage |
| SELECT DISTINCT | Removes duplicates without grouping |
| WHERE Clause | Filters rows before returning results |
| ORDER BY | Defines deterministic ordering and stable pagination |
| Joins | Most real queries select across multiple tables |
Visual Learning Diagram
graph TB
A[Data Types] --> B[Table Structure]
B --> C[SELECT Statement]
C --> D[WHERE Clause]
C --> E[ORDER BY]
C --> F[LIMIT]
D --> G[Advanced Filtering]
E --> H[Stable Pagination]
C --> I[JOINs]
classDef allNodes fill:#3e3e3e,stroke:#ffffff,stroke-width:2px,color:#f5f5f5
classDef highlight fill:#3e3e3e,stroke:#ffffff,stroke-width:4px,color:#f5f5f5
class A,B,D,E,F,G,H,I allNodes
class C highlight
Common Pitfalls
| Pitfall | Consequence | Prevention |
|---|---|---|
Using SELECT * for hot application queries | Wider rows and extra I/O | Select only needed columns |
| Missing deterministic ordering | Duplicate/missing rows in pagination | Use ORDER BY + unique tie-breaker |
| Filtering on function-wrapped columns | Index not used; slow scans | Rewrite predicate or add expression index |
| Quoted identifiers introduced accidentally | Case-sensitive object names | Prefer lowercase names; avoid quotes |
| Over-fetching rows | Slow endpoints and large transfers | Use WHERE + LIMIT patterns |
Quick Reference
SELECT col1, col2 FROM t;
SELECT col1 AS alias FROM t;
SELECT col, round(col / 12, 2) AS monthly FROM t;
SELECT * FROM t WHERE condition ORDER BY id DESC LIMIT 10;
SELECT col1, col2 FROM t ORDER BY col1 DESC, id DESC LIMIT 10;
What's Next
- Next: SELECT DISTINCT - Continue to the next concept with incremental complexity.
- Module Overview - Return to this module index and choose another related lesson.