What it is
A visual reference for understanding and using SQL JOIN operations to combine rows from two or more tables based on a related column.
Installation
This is a conceptual cheatsheet and does not require installation. You use SQL JOINs within your database management system (e.g., PostgreSQL, MySQL, SQL Server, SQLite).
Core Concepts
- Tables: The fundamental data structures in a relational database.
- Rows (Records): A single entry within a table.
- Columns (Fields): Attributes that define the type of data stored in a table.
- Primary Key: A column (or set of columns) that uniquely identifies each row in a table.
- Foreign Key: A column (or set of columns) in one table that refers to the primary key in another table, establishing a link between them.
- Join Condition: The
ONclause in a JOIN statement that specifies how rows from different tables should be matched, typically by comparing values in related columns (often primary and foreign keys).
Commands / Usage
The core of SQL JOINs is the SELECT statement with a JOIN clause. The syntax generally follows:
SELECT column_list
FROM table1
JOIN_TYPE table2 ON table1.column_name = table2.column_name;
INNER JOIN
Returns only the rows where the join condition is met in both tables. This is the most common type of join.
Visual:
Table A Table B
+---------+ +---------+
| 1 | | 1 |
| 2 | --> | 2 |
| 3 | | 4 |
+---------+ +---------+
Result will contain rows for 1 and 2.
Example: Get customers and their orders, only showing customers who have placed orders.
SELECT
c.customer_name,
o.order_date
FROM
customers c
INNER JOIN
orders o ON c.customer_id = o.customer_id;
LEFT JOIN (or LEFT OUTER JOIN)
Returns all rows from the left table, and the matched rows from the right table. If there is no match, the result is NULL on the right side.
Visual:
Table A Table B
+---------+ +---------+
| 1 | --> | 1 |
| 2 | | 2 |
| 3 | | 4 |
+---------+ +---------+
Result will contain rows for 1, 2, and 3. For row 3, the columns from Table B will be NULL.
Example: Get all customers and any orders they might have.
SELECT
c.customer_name,
o.order_date
FROM
customers c
LEFT JOIN
orders o ON c.customer_id = o.customer_id;
RIGHT JOIN (or RIGHT OUTER JOIN)
Returns all rows from the right table, and the matched rows from the left table. If there is no match, the result is NULL on the left side.
Visual:
Table A Table B
+---------+ +---------+
| 1 | <-- | 1 |
| 2 | | 2 |
| 3 | | 4 |
+---------+ +---------+
Result will contain rows for 1, 2, and 4. For row 4, the columns from Table A will be NULL.
Example: Get all orders and the customer who placed them, including orders that might not have a valid customer (e.g., data integrity issue).
SELECT
c.customer_name,
o.order_date
FROM
customers c
RIGHT JOIN
orders o ON c.customer_id = o.customer_id;
FULL JOIN (or FULL OUTER JOIN)
Returns all rows when there is a match in either the left or the right table. If there is no match, the missing side will have NULL values.
Visual:
Table A Table B
+---------+ +---------+
| 1 | <-> | 1 |
| 2 | | 2 |
| 3 | | 4 |
+---------+ +---------+
Result will contain rows for 1, 2, 3, and 4. Rows 1 and 2 will have data from both tables. Row 3 will have data only from Table A. Row 4 will have data only from Table B.
Example: Get all customers and all orders, showing matches where they exist and listing unmatched customers or orders separately.
SELECT
c.customer_name,
o.order_date
FROM
customers c
FULL JOIN
orders o ON c.customer_id = o.customer_id;
CROSS JOIN
Returns the Cartesian product of the two tables. This means every row from the first table is combined with every row from the second table. It does not use an ON clause. Use with caution, as it can produce very large result sets.
Visual:
Table A Table B
+---------+ +---------+
| 1 | --> | A |
| 2 | | B |
+---------+ +---------+
Result will contain: (1, A), (1, B), (2, A), (2, B).
Example: Combine every product with every available color.
SELECT
p.product_name,
clr.color_name
FROM
products p
CROSS JOIN
colors clr;
SELF JOIN
A table can be joined with itself. This is useful for querying hierarchical data or comparing rows within the same table.
Example: Find employees and their managers from an employees table where manager_id refers to another employee_id.
SELECT
e.employee_name,
m.employee_name AS manager_name
FROM
employees e
LEFT JOIN
employees m ON e.manager_id = m.employee_id;
Common Patterns
-
Finding records in one table that do not exist in another: Use a
LEFT JOINand filter forNULLvalues in the right table’s columns.-- Find customers who have NOT placed any orders SELECT c.customer_name FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id WHERE o.order_id IS NULL; -
Combining multiple tables (chaining joins): Join tables sequentially. The output of one join can be considered the "left" table for the next join.
-- Get customer name, order date, and product name for each item in an order SELECT c.customer_name, o.order_date, oi.quantity, p.product_name FROM customers c INNER JOIN orders o ON c.customer_id = o.customer_id INNER JOIN order_items oi ON o.order_id = oi.order_id INNER JOIN products p ON oi.product_id = p.product_id; -
Using aliases for clarity (essential for complex joins): Always use table aliases (
c,o,p) to make queries shorter and more readable, especially when column names might be ambiguous.SELECT cust.name AS customer_name, addr.street AS customer_street FROM customers cust INNER JOIN addresses addr ON cust.address_id = addr.address_id; -
Joining on multiple columns: When the relationship between tables involves more than one column.
-- Assuming a composite primary key or a link table SELECT t1.data, t2.info FROM table1 t1 INNER JOIN table2 t2 ON t1.col_a = t2.col_a AND t1.col_b = t2.col_b;
Gotchas
INNER JOINis the default: If you just writeJOINwithout specifyingINNER,LEFT,RIGHT, orFULL, most SQL dialects default toINNER JOIN.- Performance with
FULL JOIN:FULL JOINcan be resource-intensive. Sometimes, it’s more performant to achieve the same result using two separateLEFT JOINandRIGHT JOINqueries with aUNIONoperator. CROSS JOINwithoutONclause: Forgetting theONclause withINNER,LEFT,RIGHT, orFULLjoins will often implicitly turn it into aCROSS JOIN(or throw an error depending on the SQL dialect and settings), leading to unexpected and massive result sets.- Ambiguous column names: If tables have columns with the same name (e.g.,
id,name), you must qualify the column name with the table name or alias (e.g.,customers.id,c.id). - Data type mismatches in
ONclause: While some databases may perform implicit type casting, it’s best practice to ensure that the columns used in theONclause have compatible data types. Mismatches can lead to performance issues or incorrect results. NULLvalues in join conditions:NULLvalues are never equal to otherNULLvalues (or anything else, for that matter) using the standard=operator. Therefore, rows where the join columns containNULLwill not be matched by default inINNER JOINor other joins that rely on equality. If you need to matchNULLs, you might need specific syntax likeIS NOT DISTINCT FROM(supported by PostgreSQL) or handle them separately.