Skip to main content

SELECT Statement

Learning Focus

Use this lesson to understand SELECT Statement with practical syntax and examples.

Concept Overview

What is the SELECT statement?

SELECT is the fundamental read operation in PostgreSQL. It retrieves rows from one or more tables (or other query sources) and returns them as a result set.

In practice, SELECT is how you power application screens, build reports, validate migrations, and debug production issues by inspecting data.

Why is it important?

  • Correctness: the right filters, joins, and projections prevent incorrect reports and wrong application behavior
  • Performance: selecting only needed columns and using index-friendly predicates reduces I/O and improves latency
  • Maintainability: readable column lists and aliases make queries safer to evolve as schemas change

Where does it fit?

SELECT is part of DQL (Data Query Language). It is read-only (it does not modify rows) and serves as the base for more advanced concepts:

  • filtering (WHERE), sorting (ORDER BY), pagination (LIMIT/OFFSET)
  • grouping and aggregation (GROUP BY, HAVING)
  • joins, views, subqueries, and window functions

Syntax & Rules

Core Syntax Structure

SELECT [DISTINCT]
select_list
FROM from_item
[WHERE condition]
[GROUP BY group_list]
[HAVING condition]
[ORDER BY sort_list]
[LIMIT count [OFFSET offset]];

Available Clauses / Parameters

ClausePurposeTypical use
SELECTchoose columns and expressionsprojection, computed columns
FROMchoose source table(s)tables, views, subqueries
WHEREfilter rowsonly active rows, time windows
ORDER BYsort resultsstable pagination, reports
LIMIT / OFFSETreturn a subsetlist pages and previews
DISTINCTremove duplicatesunique categories, IDs

Key Rules and Considerations

  • Row order is undefined unless you specify ORDER BY.
  • Prefer explicit column lists for stability and performance; reserve SELECT * for exploration.
  • PostgreSQL folds unquoted identifiers to lowercase. Quoted identifiers become case-sensitive.
  • WHERE cannot reference a SELECT alias (the alias is computed later). Use a subquery or CTE.
  • For large tables, design indexes that match your filter and sort patterns.

Step-by-Step Examples

Example 1: Basic SELECT (Beginner)

Create a table, insert sample rows, and query the result.

-- Create a sample table
CREATE TABLE employees (
employee_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
full_name text NOT NULL,
department text NOT NULL,
salary numeric(12,2) NOT NULL CHECK (salary >= 0),
hire_date date NOT NULL,
active boolean NOT NULL DEFAULT true
);

-- Insert sample data
INSERT INTO employees (full_name, department, salary, hire_date, active) VALUES
('John Smith', 'IT', 75000.00, '2020-01-15', true),
('Sarah Johnson','Marketing', 65000.00, '2019-03-22', true),
('Mike Davis', 'IT', 80000.00, '2018-07-10', true),
('Emily Brown', 'HR', 60000.00, '2021-05-18', false),
('David Wilson', 'Finance', 70000.00, '2020-11-30', true);

-- Query: select all columns
SELECT *
FROM employees
ORDER BY employee_id;

Expected output:

+-------------+--------------+------------+----------+------------+--------+
| employee_id | full_name | department | salary | hire_date | active |
+-------------+--------------+------------+----------+------------+--------+
| 1 | John Smith | IT | 75000.00 | 2020-01-15 | t |
| 2 | Sarah Johnson | Marketing | 65000.00 | 2019-03-22 | t |
| 3 | Mike Davis | IT | 80000.00 | 2018-07-10 | t |
| 4 | Emily Brown | HR | 60000.00 | 2021-05-18 | f |
| 5 | David Wilson | Finance | 70000.00 | 2020-11-30 | t |
+-------------+--------------+------------+----------+------------+--------+

Explanation:

  • SELECT * returns every column.
  • ORDER BY employee_id produces deterministic output.

Example 2: Selecting Specific Columns (Beginner)

Select only the columns required by the caller.

-- Query: select only the columns you need
SELECT employee_id, full_name, department
FROM employees
ORDER BY employee_id;

Expected output:

+-------------+--------------+------------+
| employee_id | full_name | department |
+-------------+--------------+------------+
| 1 | John Smith | IT |
| 2 | Sarah Johnson | Marketing |
| 3 | Mike Davis | IT |
| 4 | Emily Brown | HR |
| 5 | David Wilson | Finance |
+-------------+--------------+------------+

Explanation:

  • Narrow projection reduces memory, I/O, and network transfer.
  • Explicit columns are resilient against schema changes.

Example 3: Aliases and Computed Columns (Intermediate)

Rename output columns and compute derived values.

-- Query: readable output with derived values
SELECT
full_name AS employee_name,
department,
salary AS annual_salary,
round(salary / 12, 2) AS monthly_salary,
round(salary * 1.10, 2) AS salary_with_10_percent_raise
FROM employees
ORDER BY annual_salary DESC;

Expected output:

+---------------+------------+--------------+----------------+----------------------------+
| employee_name | department | annual_salary | monthly_salary | salary_with_10_percent_raise |
+---------------+------------+--------------+----------------+----------------------------+
| Mike Davis | IT | 80000.00 | 6666.67 | 88000.00 |
| John Smith | IT | 75000.00 | 6250.00 | 82500.00 |
| David Wilson | Finance | 70000.00 | 5833.33 | 77000.00 |
| Sarah Johnson | Marketing | 65000.00 | 5416.67 | 71500.00 |
| Emily Brown | HR | 60000.00 | 5000.00 | 66000.00 |
+---------------+------------+--------------+----------------+----------------------------+

Explanation:

  • AS employee_name improves output readability.
  • round(...) produces stable, report-friendly decimals.

Example 4: Filtering, Sorting, and Limiting (Intermediate)

Filter to a subset, sort deterministically, and return only the top rows.

-- Query: active employees in IT or Finance, highest salary first
SELECT employee_id, full_name, department, salary
FROM employees
WHERE active = true
AND department IN ('IT', 'Finance')
ORDER BY salary DESC, employee_id ASC
LIMIT 3;

Expected output:

+-------------+-------------+------------+----------+
| employee_id | full_name | department | salary |
+-------------+-------------+------------+----------+
| 3 | Mike Davis | IT | 80000.00 |
| 1 | John Smith | IT | 75000.00 |
| 5 | David Wilson | Finance | 70000.00 |
+-------------+-------------+------------+----------+

Explanation:

  • WHERE reduces rows before sorting.
  • Adding employee_id as a tie-breaker stabilizes ordering.
  • LIMIT 3 prevents returning unnecessary rows.

Example 5: Production-Ready SELECT Pattern (Advanced)

Design a query shape that works well in production: stable ordering, index-friendly filters, and realistic constraints.

-- Production-style table
CREATE TABLE orders (
order_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
customer_id bigint NOT NULL,
status text NOT NULL CHECK (status IN ('pending', 'paid', 'cancelled')),
amount numeric(12,2) NOT NULL CHECK (amount >= 0),
created_at timestamptz NOT NULL DEFAULT now()
);

-- Index to support dashboard pattern: newest orders per customer
CREATE INDEX idx_orders_customer_created_at
ON orders (customer_id, created_at DESC);

-- Seed sample rows (for demonstration)
INSERT INTO orders (customer_id, status, amount, created_at) VALUES
(42, 'paid', 199.99, '2026-03-05 10:15:00+00'),
(42, 'paid', 49.00, '2026-03-04 18:20:00+00'),
(42, 'paid', 15.50, '2026-03-02 09:01:00+00'),
(42, 'paid', 120.00, '2026-02-28 12:33:00+00'),
(42, 'paid', 9.99, '2026-02-27 02:10:00+00');

-- Query: newest paid orders for a customer
SELECT order_id, amount, created_at
FROM orders
WHERE customer_id = 42
AND status = 'paid'
ORDER BY created_at DESC
LIMIT 5;

Expected output:

+----------+--------+---------------------------+
| order_id | amount | created_at |
+----------+--------+---------------------------+
| 1 | 199.99 | 2026-03-05 10:15:00+00 |
| 2 | 49.00 | 2026-03-04 18:20:00+00 |
| 3 | 15.50 | 2026-03-02 09:01:00+00 |
| 4 | 120.00 | 2026-02-28 12:33:00+00 |
| 5 | 9.99 | 2026-02-27 02:10:00+00 |
+----------+--------+---------------------------+

Explanation:

  • The index matches the filter and sort pattern.
  • The projection is narrow and stable.
  • CHECK constraints protect data quality.

Practical Use Cases

Use Case 1: Employee Directory (HR)

Business problem: Render a directory page with stable ordering and minimal columns.

SELECT employee_id, full_name, department
FROM employees
WHERE active = true
ORDER BY department, full_name;

Use Case 2: Debug One Record (Support)

Business problem: Quickly inspect a specific order for a customer.

SELECT order_id, customer_id, status, amount, created_at
FROM orders
WHERE order_id = 12345;

Use Case 3: Daily KPI Extract (Analytics)

Business problem: Produce daily order counts for the last 30 days.

SELECT date_trunc('day', created_at) AS day_key, COUNT(*) AS orders
FROM orders
WHERE created_at >= now() - interval '30 days'
GROUP BY 1
ORDER BY day_key;

Use Case 4: Data Quality Check (Pre-Constraint)

Business problem: Find rows missing a required value before enforcing NOT NULL.

SELECT customer_id
FROM customers
WHERE email IS NULL
LIMIT 20;

Use Case 5: Safe Preview Before a Write

Business problem: Preview the exact rows you intend to update.

SELECT order_id, status
FROM orders
WHERE status = 'pending'
ORDER BY created_at ASC
LIMIT 10;

Common Mistakes & Troubleshooting

Mistake 1: Missing FROM

Wrong SQL:

SELECT full_name, department;

Error message: ERROR: SELECT requires a FROM clause (or a syntax error depending on context)

Fix:

SELECT full_name, department
FROM employees;

Mistake 2: Relation does not exist

Wrong SQL:

SELECT * FROM employee;

Error message: ERROR: relation "employee" does not exist

Fix:

SELECT * FROM employees;

Mistake 3: SELECT alias referenced in WHERE

Wrong SQL:

SELECT salary / 12 AS monthly_salary
FROM employees
WHERE monthly_salary > 6000;

Bad outcome: ERROR: column "monthly_salary" does not exist

Fix (subquery):

SELECT *
FROM (
SELECT employee_id, full_name, salary / 12 AS monthly_salary
FROM employees
) s
WHERE s.monthly_salary > 6000;

Mistake 4: Assuming results are ordered

Wrong assumption:

SELECT employee_id, full_name
FROM employees;

Bad outcome: unstable pagination and inconsistent UI results.

Fix:

SELECT employee_id, full_name
FROM employees
ORDER BY employee_id;

Debugging Tips

  1. Start with SELECT ... LIMIT 10 to validate row shape and data.
  2. Verify tables with \dt and columns with \d table_name in psql.
  3. Add one clause at a time: WHERE -> ORDER BY -> LIMIT.
  4. If performance is slow, inspect the plan with EXPLAIN (module 18).

Best Practices

✅ Always select only the columns you need; avoid SELECT * in production.
✅ Always add ORDER BY when order matters (especially for pagination).
✅ Always include a stable tie-breaker in ORDER BY (primary key).
✅ Always keep filters index-friendly; avoid wrapping columns in functions unless you have expression indexes.
✅ Always use meaningful aliases for computed columns and report outputs.
❌ Avoid quoted identifiers unless required; they create case-sensitive object names.
❌ Avoid large OFFSET pagination on big tables; prefer keyset pagination.
❌ Avoid mixing AND/OR without parentheses; make boolean intent explicit.

Hands-On Practice

Exercise 1: Basic Selection (Easy)

Task: Select full_name and department for all employees.

-- Write your query here

Solution:

SELECT full_name, department
FROM employees
ORDER BY full_name;

Exercise 2: Computed Columns (Medium)

Task: Return full_name, salary, and round(salary / 12, 2) AS monthly_salary.

-- Write your query here

Solution:

SELECT
full_name,
salary,
round(salary / 12, 2) AS monthly_salary
FROM employees
ORDER BY monthly_salary DESC;

Exercise 3: Stable Ordering (Advanced)

Task: Return the first 3 active employees ordered by newest hire date, with a stable tie-breaker.

-- Write your query here

Solution:

SELECT employee_id, full_name, hire_date
FROM employees
WHERE active = true
ORDER BY hire_date DESC, employee_id DESC
LIMIT 3;

Connection to Other Concepts

Related conceptWhy it matters
PostgreSQL Data TypesCorrect types affect comparisons, ordering, and index usage
SELECT DISTINCTRemoves duplicates without grouping
WHERE ClauseFilters rows before returning results
ORDER BYDefines deterministic ordering and stable pagination
JoinsMost real queries select across multiple tables

Visual Learning Diagram

graph TB
A[Data Types] --> B[Table Structure]
B --> C[SELECT Statement]
C --> D[WHERE Clause]
C --> E[ORDER BY]
C --> F[LIMIT]
D --> G[Advanced Filtering]
E --> H[Stable Pagination]
C --> I[JOINs]

classDef allNodes fill:#3e3e3e,stroke:#ffffff,stroke-width:2px,color:#f5f5f5
classDef highlight fill:#3e3e3e,stroke:#ffffff,stroke-width:4px,color:#f5f5f5

class A,B,D,E,F,G,H,I allNodes
class C highlight

Common Pitfalls

PitfallConsequencePrevention
Using SELECT * for hot application queriesWider rows and extra I/OSelect only needed columns
Missing deterministic orderingDuplicate/missing rows in paginationUse ORDER BY + unique tie-breaker
Filtering on function-wrapped columnsIndex not used; slow scansRewrite predicate or add expression index
Quoted identifiers introduced accidentallyCase-sensitive object namesPrefer lowercase names; avoid quotes
Over-fetching rowsSlow endpoints and large transfersUse WHERE + LIMIT patterns

Quick Reference

SELECT col1, col2 FROM t;
SELECT col1 AS alias FROM t;
SELECT col, round(col / 12, 2) AS monthly FROM t;
SELECT * FROM t WHERE condition ORDER BY id DESC LIMIT 10;
SELECT col1, col2 FROM t ORDER BY col1 DESC, id DESC LIMIT 10;

What's Next