In this article, we’ll walk through solving a SQL problem where we need to calculate the percentage of "immediate" orders from the first orders of all customers. The problem requires us to identify which orders are considered "immediate" based on the customer's preferred delivery date matching the order date and then compute the percentage of such immediate orders among the first orders of all customers.
Problem Breakdown
We are given a table called Delivery with the following columns:
delivery_id: A unique identifier for each delivery.
customer_id: The ID of the customer making the order.
order_date: The date the customer placed the order.
customer_pref_delivery_date: The date the customer prefers to receive the delivery.
The key conditions for the problem:
An order is considered immediate if the order_date is the same as the customer_pref_delivery_date.
An order is considered scheduled if the order_date is different from the customer_pref_delivery_date.
The first order of a customer is defined as the order with the earliest order_date.
We need to calculate the percentage of immediate orders among the first orders placed by all customers, rounded to 2 decimal places.
Step-by-Step Approach
Identify First Orders: The first order for each customer is the one with the earliest order date. This can be identified by using MIN(order_date) for each customer_id.
Determine Immediate Orders: An order is considered immediate if the order_date is the same as the customer_pref_delivery_date. We can compare these two columns to determine whether an order is immediate or scheduled.
Calculate Percentage of Immediate Orders: To calculate the percentage, we'll:
Count the number of immediate first orders.
Count the total number of first orders.
Divide the number of immediate first orders by the total number of first orders and multiply by 100 to get the percentage.
Return the Result: We need to round the result to two decimal places, which can be done using the ROUND() function in SQL.
SQL Schema:
Create table If Not Exists Delivery (delivery_id int, customer_id int, order_date date, customer_pref_delivery_date date)
Truncate table Delivery
insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('1', '1', '2019-08-01', '2019-08-02')
insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('2', '2', '2019-08-02', '2019-08-02')
insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('3', '1', '2019-08-11', '2019-08-12')
insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('4', '3', '2019-08-24', '2019-08-24')
insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('5', '3', '2019-08-21', '2019-08-22')
insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('6', '2', '2019-08-11', '2019-08-13')
insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('7', '4', '2019-08-09', '2019-08-09')
SQL Query Solution
WITH FirstOrders AS (
-- Identify the first order for each customer (earliest order_date)
SELECT customer_id,
MIN(order_date) AS first_order_date
FROM Delivery
GROUP BY customer_id
),
ImmediateOrders AS (
-- Identify immediate orders by comparing order_date with customer_pref_delivery_date
SELECT d.customer_id, d.delivery_id, d.order_date, d.customer_pref_delivery_date
FROM Delivery d
JOIN FirstOrders f ON d.customer_id = f.customer_id
WHERE d.order_date = f.first_order_date
)
SELECT
ROUND(
(SUM(CASE WHEN io.order_date = io.customer_pref_delivery_date THEN 1 ELSE 0 END) /
COUNT(io.delivery_id)) * 100, 2
) AS immediate_percentage
FROM ImmediateOrders io;
Explanation of the Query:
FirstOrders CTE:
- This Common Table Expression (CTE) identifies the first order for each customer by selecting the minimum order_date for each customer. This gives us the earliest order placed by each customer.
SELECT customer_id, MIN(order_date) AS first_order_date
FROM Delivery
GROUP BY customer_id
ImmediateOrders CTE:
- In this CTE, we join the Delivery table with the FirstOrders CTE on
customer_id
to filter only the first orders for each customer. We then check whether the order_date matches the customer_pref_delivery_date to determine if the order is "immediate."
- In this CTE, we join the Delivery table with the FirstOrders CTE on
SELECT d.customer_id, d.delivery_id, d.order_date, d.customer_pref_delivery_date
FROM Delivery d
JOIN FirstOrders f ON d.customer_id = f.customer_id
WHERE d.order_date = f.first_order_date
Main SELECT:
The main query calculates the percentage of immediate orders. We use a
CASE
statement to count the number of immediate orders (whereorder_date = customer_pref_delivery_date
). The total number of immediate orders is divided by the total number of first orders, and the result is multiplied by 100 to get the percentage.The result is rounded to 2 decimal places using the
ROUND()
function.
ROUND(
(SUM(CASE WHEN io.order_date = io.customer_pref_delivery_date THEN 1 ELSE 0 END) /
COUNT(io.delivery_id)) * 100, 2
) AS immediate_percentage
Sample Input and Output
Input:
delivery_id | customer_id | order_date | customer_pref_delivery_date |
1 | 1 | 2019-08-01 | 2019-08-02 |
2 | 2 | 2019-08-02 | 2019-08-02 |
3 | 1 | 2019-08-11 | 2019-08-12 |
4 | 3 | 2019-08-24 | 2019-08-24 |
5 | 3 | 2019-08-21 | 2019-08-22 |
6 | 2 | 2019-08-11 | 2019-08-13 |
7 | 4 | 2019-08-09 | 2019-08-09 |
Output:
immediate_percentage |
50.00 |
Explanation of Output:
Customer 1: The first order (ID 1) is scheduled (order date is different from preferred date).
Customer 2: The first order (ID 2) is immediate (order date equals preferred date).
Customer 3: The first order (ID 5) is scheduled (order date is different from preferred date).
Customer 4: The first order (ID 7) is immediate (order date equals preferred date).
Thus, 2 out of 4 customers (50%) have immediate first orders.
Key SQL Concepts Used:
Common Table Expressions (CTEs): We use CTEs to organize the query logic into manageable parts, making it easier to identify the first orders and immediate orders.
JOINs: A
JOIN
is used to filter only the first order for each customer from theDelivery
table.Conditional Aggregation: The
CASE
statement inside theSUM
function helps in counting only the immediate orders.Mathematical Calculations: We calculate the percentage by dividing the number of immediate orders by the total number of first orders and multiplying by 100.
Rounding: The
ROUND()
function is used to round the result to two decimal places.
I hope this article helps you in understanding how to approach SQL queries involving CTEs, aggregation, JOINS, ROUND and Mathematical calculations.
Happy querying! 🙌🚀