Window Functions
Window functions compute values across a set of rows related to the current row without collapsing rows like GROUP BY. They are evaluated after WHERE, GROUP BY, and HAVING, but before ORDER BY and LIMIT.
Syntax
Section titled “Syntax”function_name(args) OVER ( [PARTITION BY partition_expr [, ...]] [ORDER BY sort_expr [ASC | DESC] [NULLS { FIRST | LAST }] [, ...]] [frame_clause])- PARTITION BY divides rows into independent groups (like
GROUP BYbut without collapsing). - ORDER BY defines the row ordering within each partition.
- frame_clause controls which rows in the partition are visible to the function.
Ranking Functions
Section titled “Ranking Functions”Ranking functions assign a position to each row within its partition based on the ORDER BY expression.
SELECT department, name, salary, ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num, RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank, DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank, NTILE(4) OVER (PARTITION BY department ORDER BY salary DESC) AS quartile, PERCENT_RANK() OVER (PARTITION BY department ORDER BY salary) AS pct_rank, CUME_DIST() OVER (PARTITION BY department ORDER BY salary) AS cum_distFROM employees;Example Output
Section titled “Example Output”Given this data in the engineering department:
| name | salary |
|---|---|
| Alice | 150000 |
| Bob | 130000 |
| Carol | 130000 |
| Dave | 110000 |
The ranking functions produce:
| name | salary | row_num | rank | dense_rank | quartile |
|---|---|---|---|---|---|
| Alice | 150000 | 1 | 1 | 1 | 1 |
| Bob | 130000 | 2 | 2 | 2 | 1 |
| Carol | 130000 | 3 | 2 | 2 | 2 |
| Dave | 110000 | 4 | 4 | 3 | 3 |
Reference
Section titled “Reference”| Function | Description |
|---|---|
ROW_NUMBER() | Unique sequential integer per partition, no ties |
RANK() | Rank with gaps on ties (1, 2, 2, 4) |
DENSE_RANK() | Rank without gaps on ties (1, 2, 2, 3) |
NTILE(n) | Distribute rows into n roughly equal buckets |
PERCENT_RANK() | Relative rank: (rank - 1) / (total_rows - 1) |
CUME_DIST() | Cumulative distribution: fraction of rows at or below current |
Value Functions
Section titled “Value Functions”Value functions access a specific row within the window frame relative to the current row.
SELECT date, revenue, LAG(revenue, 1, 0) OVER (ORDER BY date) AS prev_day, LEAD(revenue, 1, 0) OVER (ORDER BY date) AS next_day, FIRST_VALUE(revenue) OVER ( ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS first_rev, LAST_VALUE(revenue) OVER ( ORDER BY date ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ) AS last_rev, NTH_VALUE(revenue, 3) OVER (ORDER BY date) AS third_revFROM daily_revenue;Reference
Section titled “Reference”| Function | Description |
|---|---|
LAG(expr [, offset [, default]]) | Value from offset rows before current (default offset: 1) |
LEAD(expr [, offset [, default]]) | Value from offset rows after current (default offset: 1) |
FIRST_VALUE(expr) | First value in the window frame |
LAST_VALUE(expr) | Last value in the window frame |
NTH_VALUE(expr, n) | Value at position n in the window frame |
LAG / LEAD Examples
Section titled “LAG / LEAD Examples”Compare each day’s revenue to the previous and next day:
SELECT date, revenue, LAG(revenue) OVER (ORDER BY date) AS prev_day_revenue, revenue - LAG(revenue) OVER (ORDER BY date) AS day_over_day_changeFROM daily_revenueORDER BY date;Use a default value to avoid NULLs at partition boundaries:
SELECT date, revenue, LAG(revenue, 1, 0) OVER (ORDER BY date) AS prev_or_zeroFROM daily_revenue;FIRST_VALUE / LAST_VALUE Gotcha
Section titled “FIRST_VALUE / LAST_VALUE Gotcha”LAST_VALUE with the default frame (RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) returns the current row’s value — not the last row in the partition. To get the true last value, extend the frame:
SELECT date, revenue, LAST_VALUE(revenue) OVER ( ORDER BY date ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ) AS partition_lastFROM daily_revenue;Aggregate Window Functions
Section titled “Aggregate Window Functions”Any aggregate function can be used as a window function by adding an OVER clause. The aggregate is computed across the window frame instead of collapsing rows.
SELECT date, revenue, SUM(revenue) OVER (ORDER BY date) AS running_total, AVG(revenue) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS week_avg, COUNT(*) OVER (PARTITION BY region) AS region_count, MIN(revenue) OVER (PARTITION BY region ORDER BY date) AS region_running_min, MAX(revenue) OVER (PARTITION BY region ORDER BY date) AS region_running_maxFROM daily_revenue;Running Total
Section titled “Running Total”SELECT date, amount, SUM(amount) OVER (ORDER BY date) AS cumulative_amountFROM transactions;Moving Average
Section titled “Moving Average”-- 7-day moving averageSELECT date, revenue, AVG(revenue) OVER ( ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW ) AS moving_avg_7dFROM daily_revenue;Per-Group Percentage
Section titled “Per-Group Percentage”SELECT department, name, salary, salary::FLOAT / SUM(salary) OVER (PARTITION BY department) * 100 AS pct_of_deptFROM employees;Window Frame Clause
Section titled “Window Frame Clause”The frame clause controls which rows within the partition are visible to the window function.
Syntax
Section titled “Syntax”{ ROWS | RANGE | GROUPS } BETWEEN frame_start AND frame_endFrame Boundaries
Section titled “Frame Boundaries”| Boundary | Meaning |
|---|---|
UNBOUNDED PRECEDING | First row of the partition |
n PRECEDING | n rows (or range units) before current |
CURRENT ROW | The current row |
n FOLLOWING | n rows (or range units) after current |
UNBOUNDED FOLLOWING | Last row of the partition |
ROWS vs RANGE vs GROUPS
Section titled “ROWS vs RANGE vs GROUPS”| Mode | Unit | Behavior |
|---|---|---|
ROWS | Physical row count | Counts individual rows regardless of value |
RANGE | Logical value range | Includes all rows with the same ORDER BY value as peers |
GROUPS | Peer groups | Counts distinct groups of tied ORDER BY values |
Default Frame
Section titled “Default Frame”When ORDER BY is specified and no frame clause is given, the default is:
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROWWhen no ORDER BY is specified, the entire partition is the frame.
Examples
Section titled “Examples”-- Running total (all rows up to current)SUM(x) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
-- 7-day moving averageAVG(x) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
-- Centered 5-row windowAVG(x) OVER (ORDER BY date ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)
-- Entire partition (explicit)SUM(x) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)Time-Series Window Functions
Section titled “Time-Series Window Functions”ParticleDB includes built-in window functions designed for time-series analysis. These operate on ordered partitions and are typically used with an ORDER BY on a timestamp column.
| Function | Description |
|---|---|
MOVING_AVG(expr [, window_size]) | Moving average over the last window_size rows |
MOVING_SUM(expr [, window_size]) | Moving sum over the last window_size rows |
EXP_MOVING_AVG(expr [, alpha]) | Exponential moving average with smoothing factor alpha (0 to 1) |
DELTA(expr) | Difference between the current row and the previous row |
RATE(expr) | Rate of change from the previous row |
SELECT timestamp, cpu_usage, MOVING_AVG(cpu_usage, 5) OVER (ORDER BY timestamp) AS avg_5, MOVING_SUM(cpu_usage, 5) OVER (ORDER BY timestamp) AS sum_5, EXP_MOVING_AVG(cpu_usage, 0.3) OVER (ORDER BY timestamp) AS ema, DELTA(cpu_usage) OVER (ORDER BY timestamp) AS change, RATE(cpu_usage) OVER (ORDER BY timestamp) AS rateFROM metrics;IoT Monitoring Example
Section titled “IoT Monitoring Example”SELECT sensor_id, timestamp, temperature, MOVING_AVG(temperature, 10) OVER ( PARTITION BY sensor_id ORDER BY timestamp ) AS smoothed_temp, DELTA(temperature) OVER ( PARTITION BY sensor_id ORDER BY timestamp ) AS temp_changeFROM sensor_readingsWHERE timestamp > NOW() - INTERVAL '1 hour';Named Windows
Section titled “Named Windows”Define a window once and reference it in multiple function calls. This avoids repeating the same PARTITION BY and ORDER BY clause.
SELECT department, name, salary, RANK() OVER w AS rank, SUM(salary) OVER w AS running_total, AVG(salary) OVER w AS running_avgFROM employeesWINDOW w AS (PARTITION BY department ORDER BY salary DESC);Multiple named windows can be defined:
SELECT region, date, revenue, SUM(revenue) OVER by_region AS region_total, RANK() OVER by_date AS date_rankFROM salesWINDOW by_region AS (PARTITION BY region), by_date AS (PARTITION BY region ORDER BY date);Combining Window Functions with Other Clauses
Section titled “Combining Window Functions with Other Clauses”Window functions are evaluated after WHERE, GROUP BY, and HAVING. You can use them on aggregated results:
-- Rank departments by total salary spendSELECT department, SUM(salary) AS total_salary, RANK() OVER (ORDER BY SUM(salary) DESC) AS spending_rankFROM employeesGROUP BY department;To filter on window function results, wrap the query in a subquery or CTE:
-- Top 3 earners per departmentWITH ranked AS ( SELECT department, name, salary, ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn FROM employees)SELECT department, name, salaryFROM rankedWHERE rn <= 3;Practical Examples
Section titled “Practical Examples”Year-over-Year Growth
Section titled “Year-over-Year Growth”SELECT year, annual_revenue, LAG(annual_revenue) OVER (ORDER BY year) AS prev_year, (annual_revenue - LAG(annual_revenue) OVER (ORDER BY year)) / LAG(annual_revenue) OVER (ORDER BY year) * 100 AS yoy_growth_pctFROM yearly_summary;Gaps and Islands
Section titled “Gaps and Islands”Find consecutive sequences of events:
WITH numbered AS ( SELECT event_date, status, ROW_NUMBER() OVER (ORDER BY event_date) - ROW_NUMBER() OVER (PARTITION BY status ORDER BY event_date) AS grp FROM events)SELECT status, MIN(event_date) AS start_date, MAX(event_date) AS end_date, COUNT(*) AS consecutive_daysFROM numberedGROUP BY status, grpORDER BY start_date;Sessionization
Section titled “Sessionization”WITH with_gap AS ( SELECT user_id, event_time, event_time - LAG(event_time) OVER ( PARTITION BY user_id ORDER BY event_time ) AS gap_seconds FROM clickstream),with_session AS ( SELECT *, SUM(CASE WHEN gap_seconds > 1800 OR gap_seconds IS NULL THEN 1 ELSE 0 END) OVER (PARTITION BY user_id ORDER BY event_time) AS session_id FROM with_gap)SELECT user_id, session_id, MIN(event_time) AS session_start, MAX(event_time) AS session_end, COUNT(*) AS eventsFROM with_sessionGROUP BY user_id, session_id;Performance Notes
Section titled “Performance Notes”- Window functions in ParticleDB use Rayon-based parallelism: each partition is evaluated independently across CPU cores.
ROW_NUMBER,RANK, andDENSE_RANKare computed in a single pass over the sorted partition.- Aggregate window functions with
ROWS BETWEEN n PRECEDING AND CURRENT ROWuse an incremental sliding-window algorithm (O(N) total, not O(N*K)). - For large result sets, adding a
PARTITION BYclause improves parallelism by allowing independent partitions to run on separate cores.