Introduction
As the name suggests, we define a window (a set of rows) and then perform our operations based on that window.
However, the GROUP BY clause does similar work, so when should we use window functions?
The main difference between window functions and GROUP BY is what we want to display in the output.
When we use GROUP BY, we reduce the number of rows in the output: it returns one row per group, so the number of rows equals the number of groups.
When we use window functions, the number of rows stays the same: every input row appears in the output, and the function is computed over the window we have selected for that row.
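The difference in row counts can be seen with a small sketch. The "orders" table and its sample rows below are hypothetical, made up only for illustration, and the PARTITION BY syntax used in the second query is explained later in this article. The SQL should run on engines that support window functions (for example PostgreSQL, MySQL 8+, or SQLite 3.25+).

```sql
-- Hypothetical sample table and data, for illustration only.
CREATE TABLE orders (
    order_date DATE,
    product_id INTEGER,
    revenue    INTEGER
);

INSERT INTO orders (order_date, product_id, revenue) VALUES
    ('2024-01-01', 1, 100),
    ('2024-01-02', 1, 150),
    ('2024-01-01', 2, 200),
    ('2024-01-02', 2,  80);

-- GROUP BY collapses the 4 input rows into one row per product:
--   product_id | total_revenue
--   1          | 250
--   2          | 280
SELECT product_id, SUM(revenue) AS total_revenue
FROM orders
GROUP BY product_id
ORDER BY product_id;

-- The window function keeps all 4 input rows; each row also
-- carries the total for its own product (250 or 280).
SELECT order_date, product_id, revenue,
       SUM(revenue) OVER (PARTITION BY product_id) AS total_revenue
FROM orders
ORDER BY product_id, order_date;
```

The first query is the better fit when only the summary is needed; the second is useful when the detail rows and the summary value must appear side by side.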
OVER clause
The OVER clause is a crucial component of window functions in SQL. It defines the window, that is, the subset of rows within a result set that a specific window function operates on. The OVER clause typically consists of three main components: partitioning, ordering, and framing.
By default, if we write OVER with empty parentheses, the window is the entire result set of the query.
To restrict the window to a group of related rows, we use the PARTITION BY clause inside the OVER clause.
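As a rough sketch of the difference between an empty OVER clause and one restricted with PARTITION BY, the query below computes both on the same rows. The "orders" table and its sample data are hypothetical and exist only for this illustration.

```sql
-- Hypothetical sample table and data, for illustration only.
CREATE TABLE orders (
    order_date DATE,
    product_id INTEGER,
    revenue    INTEGER
);

INSERT INTO orders (order_date, product_id, revenue) VALUES
    ('2024-01-01', 1, 100),
    ('2024-01-02', 1, 150),
    ('2024-01-01', 2, 200),
    ('2024-01-02', 2,  80);

SELECT
    order_date,
    product_id,
    revenue,
    -- Empty OVER (): the window is the whole result set,
    -- so every row shows the grand total (530).
    SUM(revenue) OVER () AS total_revenue_all,
    -- PARTITION BY: the window is limited to rows with the same
    -- product_id, so rows show 250 (product 1) or 280 (product 2).
    SUM(revenue) OVER (PARTITION BY product_id) AS total_revenue_product
FROM orders
ORDER BY product_id, order_date;
```

Having both columns on each row makes it straightforward to derive values such as a row's share of the grand total or of its product's total.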
Using PARTITION BY with the OVER clause
When you use the PARTITION BY clause in conjunction with the OVER clause and a window function, you're essentially dividing the result set into separate partitions, and the window function is applied independently to each partition. This can be extremely useful when you want to perform calculations on different groups of data within your dataset.
Here's an example to illustrate how to use the PARTITION BY clause with the OVER clause. Suppose you have a table named "orders" with columns "order_date", "product_id", and "revenue". You want to calculate a running total of revenue for each product, accumulated in order of date.
SELECT
    order_date,
    product_id,
    revenue,
    SUM(revenue) OVER (PARTITION BY product_id ORDER BY order_date) AS total_revenue_per_product
FROM
    orders;
In this query:
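The PARTITION BY clause partitions the data by the "product_id" column. The rows within each partition are ordered by the "order_date" column.
The SUM window function calculates the cumulative sum of "revenue" for each product within its own partition, ordered by "order_date". The sum is cumulative because, once ORDER BY is present, the default window frame runs from the first row of the partition up to the current row (together with any rows that share its "order_date").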
As a result, the query returns a result set where each row shows the original "order_date", "product_id", and "revenue", plus the calculated "total_revenue_per_product", which is the cumulative sum of revenue for that product within its respective partition.
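To make the result concrete, here is the same query run against a handful of made-up rows. The sample data is hypothetical; only the table and column names come from the example above. A final ORDER BY is added so the output order is predictable.

```sql
-- Hypothetical sample data for the "orders" table described above.
CREATE TABLE orders (
    order_date DATE,
    product_id INTEGER,
    revenue    INTEGER
);

INSERT INTO orders (order_date, product_id, revenue) VALUES
    ('2024-01-01', 1, 100),
    ('2024-01-02', 1, 150),
    ('2024-01-03', 1,  50),
    ('2024-01-01', 2, 200),
    ('2024-01-02', 2,  80);

SELECT
    order_date,
    product_id,
    revenue,
    SUM(revenue) OVER (PARTITION BY product_id ORDER BY order_date) AS total_revenue_per_product
FROM
    orders
ORDER BY
    product_id, order_date;

-- Result for the sample data:
--   order_date | product_id | revenue | total_revenue_per_product
--   2024-01-01 | 1          | 100     | 100
--   2024-01-02 | 1          | 150     | 250
--   2024-01-03 | 1          |  50     | 300
--   2024-01-01 | 2          | 200     | 200
--   2024-01-02 | 2          |  80     | 280
```

Note how the running total restarts at the first row of product 2: each partition is accumulated on its own.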
This approach allows you to perform calculations on distinct groups of data within your dataset, which is particularly helpful for tasks like calculating running totals, rankings, and more, for each subgroup of data defined by the partitioning.
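As a sketch of two of the other tasks mentioned here, the query below ranks orders within each product and uses an explicit frame (the third component of the OVER clause) to sum only the current and previous order. The sample data and the column aliases "revenue_rank" and "last_two_orders_revenue" are hypothetical, chosen for illustration.

```sql
-- Hypothetical sample data, for illustration only.
CREATE TABLE orders (
    order_date DATE,
    product_id INTEGER,
    revenue    INTEGER
);

INSERT INTO orders (order_date, product_id, revenue) VALUES
    ('2024-01-01', 1, 100),
    ('2024-01-02', 1, 150),
    ('2024-01-03', 1,  50),
    ('2024-01-01', 2, 200),
    ('2024-01-02', 2,  80);

SELECT
    order_date,
    product_id,
    revenue,
    -- Ranking: 1 = highest revenue within the product.
    RANK() OVER (PARTITION BY product_id ORDER BY revenue DESC) AS revenue_rank,
    -- Framing: limit the window to the previous row and the current row.
    SUM(revenue) OVER (
        PARTITION BY product_id
        ORDER BY order_date
        ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
    ) AS last_two_orders_revenue
FROM
    orders
ORDER BY
    product_id, order_date;

-- Result for the sample data:
--   order_date | product_id | revenue | revenue_rank | last_two_orders_revenue
--   2024-01-01 | 1          | 100     | 2            | 100
--   2024-01-02 | 1          | 150     | 1            | 250
--   2024-01-03 | 1          |  50     | 3            | 200
--   2024-01-01 | 2          | 200     | 1            | 200
--   2024-01-02 | 2          |  80     | 2            | 280
```

Each window function has its own OVER clause, so different orderings and frames can be mixed freely in a single SELECT.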