Joins
Joins combine rows from two or more tables based on a related column. ParticleDB uses a streaming hash join engine with parallel probe for large inputs.
JOIN Types at a Glance
Section titled “JOIN Types at a Glance”| Join Type | Returns |
|---|---|
INNER JOIN | Only rows that match in both tables |
LEFT JOIN | All rows from the left table, plus matches from the right (NULLs for non-matches) |
RIGHT JOIN | All rows from the right table, plus matches from the left (NULLs for non-matches) |
FULL OUTER JOIN | All rows from both tables (NULLs where no match exists on either side) |
CROSS JOIN | Every combination of rows from both tables (Cartesian product) |
NATURAL JOIN | Automatic INNER JOIN on all columns with the same name |
The examples below use these three tables:
CREATE TABLE customers ( id BIGINT PRIMARY KEY, name TEXT NOT NULL, tier TEXT DEFAULT 'standard');
CREATE TABLE orders ( id BIGINT PRIMARY KEY, customer_id BIGINT NOT NULL, total FLOAT64 NOT NULL, status TEXT DEFAULT 'pending');
CREATE TABLE order_items ( id BIGINT PRIMARY KEY, order_id BIGINT NOT NULL, product TEXT NOT NULL, quantity INT64 NOT NULL, unit_price FLOAT64 NOT NULL);
INSERT INTO customers VALUES (1, 'Alice', 'premium'), (2, 'Bob', 'standard'), (3, 'Carol', 'premium'), (4, 'Dave', 'standard');
INSERT INTO orders VALUES (100, 1, 250.00, 'shipped'), (101, 1, 75.00, 'delivered'), (102, 2, 150.00, 'pending'), (103, 5, 300.00, 'shipped'); -- customer_id 5 does not exist
INSERT INTO order_items VALUES (1, 100, 'Keyboard', 1, 89.99), (2, 100, 'Mouse', 2, 29.99), (3, 101, 'USB Cable', 3, 9.99), (4, 102, 'Monitor', 1, 149.99);INNER JOIN
Section titled “INNER JOIN”Returns only rows where the join condition matches in both tables. Rows without a match on either side are excluded.
SELECT c.name, o.id AS order_id, o.totalFROM customers cINNER JOIN orders o ON c.id = o.customer_id;| name | order_id | total |
|---|---|---|
| Alice | 100 | 250.00 |
| Alice | 101 | 75.00 |
| Bob | 102 | 150.00 |
Carol (no orders) and order 103 (customer_id 5 doesn’t exist) are both excluded.
JOIN without a qualifier defaults to INNER JOIN:
SELECT c.name, o.totalFROM customers cJOIN orders o ON c.id = o.customer_id;LEFT JOIN
Section titled “LEFT JOIN”Returns all rows from the left table. Where no match exists in the right table, right-side columns are filled with NULL.
SELECT c.name, o.id AS order_id, o.totalFROM customers cLEFT JOIN orders o ON c.id = o.customer_id;| name | order_id | total |
|---|---|---|
| Alice | 100 | 250.00 |
| Alice | 101 | 75.00 |
| Bob | 102 | 150.00 |
| Carol | NULL | NULL |
| Dave | NULL | NULL |
Use LEFT JOIN to find rows without matches:
-- Customers who have never placed an orderSELECT c.nameFROM customers cLEFT JOIN orders o ON c.id = o.customer_idWHERE o.id IS NULL;| name |
|---|
| Carol |
| Dave |
RIGHT JOIN
Section titled “RIGHT JOIN”The mirror of LEFT JOIN: all rows from the right table are kept, and left-side columns are NULL for non-matches.
SELECT c.name, o.id AS order_id, o.totalFROM customers cRIGHT JOIN orders o ON c.id = o.customer_id;| name | order_id | total |
|---|---|---|
| Alice | 100 | 250.00 |
| Alice | 101 | 75.00 |
| Bob | 102 | 150.00 |
| NULL | 103 | 300.00 |
Order 103 (customer_id 5) has no matching customer, so name is NULL.
FULL OUTER JOIN
Section titled “FULL OUTER JOIN”Returns all rows from both tables, with NULLs on the side where there is no match.
SELECT c.name, o.id AS order_id, o.totalFROM customers cFULL OUTER JOIN orders o ON c.id = o.customer_id;| name | order_id | total |
|---|---|---|
| Alice | 100 | 250.00 |
| Alice | 101 | 75.00 |
| Bob | 102 | 150.00 |
| Carol | NULL | NULL |
| Dave | NULL | NULL |
| NULL | 103 | 300.00 |
CROSS JOIN
Section titled “CROSS JOIN”Produces the Cartesian product — every row from the left table paired with every row from the right. No join condition is specified.
SELECT c.name, t.labelFROM customers cCROSS JOIN (VALUES ('Q1'), ('Q2'), ('Q3'), ('Q4')) AS t(label);This returns 4 customers x 4 quarters = 16 rows.
An equivalent syntax uses comma-separated FROM:
SELECT c.name, t.labelFROM customers c, (VALUES ('Q1'), ('Q2'), ('Q3'), ('Q4')) AS t(label);NATURAL JOIN
Section titled “NATURAL JOIN”Automatically joins on all columns with matching names in both tables. Eliminates duplicate columns from the result.
-- If both tables have an "id" column, NATURAL JOIN matches on id-- This is rarely used in practice -- explicit ON/USING is preferredSELECT * FROM employees NATURAL JOIN departments;Caution: NATURAL JOIN can produce unexpected results if tables share column names that shouldn’t be join keys. Prefer explicit ON or USING clauses.
USING Clause
Section titled “USING Clause”When the join column has the same name in both tables, USING is a cleaner alternative to ON:
-- These two are equivalent:SELECT * FROM orders JOIN order_items ON orders.id = order_items.order_id;
-- Using USING when the column name matches-- (this requires the column to have the same name in both tables)CREATE TABLE order_items_v2 ( id BIGINT PRIMARY KEY, id BIGINT NOT NULL, -- same name as orders.id product TEXT);
SELECT * FROM orders JOIN order_items_v2 USING (id);In practice, USING is most useful when foreign key columns share names:
SELECT c.name, o.totalFROM customers cJOIN orders o USING (customer_id);-- Equivalent to: ON c.customer_id = o.customer_idMulti-Table Joins
Section titled “Multi-Table Joins”Chain joins to combine three or more tables:
-- Full order details: customer + order + itemsSELECT c.name AS customer, o.id AS order_id, o.status, oi.product, oi.quantity, oi.unit_price, oi.quantity * oi.unit_price AS line_totalFROM customers cJOIN orders o ON c.id = o.customer_idJOIN order_items oi ON o.id = oi.order_idORDER BY c.name, o.id, oi.product;| customer | order_id | status | product | quantity | unit_price | line_total |
|---|---|---|---|---|---|---|
| Alice | 100 | shipped | Keyboard | 1 | 89.99 | 89.99 |
| Alice | 100 | shipped | Mouse | 2 | 29.99 | 59.98 |
| Alice | 101 | delivered | USB Cable | 3 | 9.99 | 29.97 |
| Bob | 102 | pending | Monitor | 1 | 149.99 | 149.99 |
Mixing Join Types
Section titled “Mixing Join Types”-- All customers, their orders (if any), and order items (if any)SELECT c.name, o.id AS order_id, oi.productFROM customers cLEFT JOIN orders o ON c.id = o.customer_idLEFT JOIN order_items oi ON o.id = oi.order_idORDER BY c.name;Self Join
Section titled “Self Join”Join a table to itself to compare rows within the same table:
-- Find employees who earn more than their managerSELECT e.name AS employee, e.salary AS employee_salary, m.name AS manager, m.salary AS manager_salaryFROM employees eJOIN employees m ON e.manager_id = m.idWHERE e.salary > m.salary;Join with Aggregation
Section titled “Join with Aggregation”Combine joins with GROUP BY to compute per-group metrics across tables:
-- Revenue per customerSELECT c.name, c.tier, COUNT(o.id) AS order_count, COALESCE(SUM(o.total), 0) AS total_revenueFROM customers cLEFT JOIN orders o ON c.id = o.customer_idGROUP BY c.name, c.tierORDER BY total_revenue DESC;| name | tier | order_count | total_revenue |
|---|---|---|---|
| Alice | premium | 2 | 325.00 |
| Bob | standard | 1 | 150.00 |
| Carol | premium | 0 | 0 |
| Dave | standard | 0 | 0 |
Performance Notes
Section titled “Performance Notes”- ParticleDB automatically picks the smaller table as the build side of the hash join
- For inputs larger than 100K rows, the probe phase runs in parallel across CPU cores
- The optimizer reorders joins for multi-way joins, placing the smallest tables first
- Int64 key specialization: joins on integer keys use a fast path that avoids hashing overhead
- For vector similarity joins, see Vector Search