SQL Interview Prep - Day 6

Immediate Food Delivery

In this article, we’ll walk through solving a SQL problem where we need to calculate the percentage of "immediate" orders from the first orders of all customers. The problem requires us to identify which orders are considered "immediate" based on the customer's preferred delivery date matching the order date and then compute the percentage of such immediate orders among the first orders of all customers.

Problem Breakdown

We are given a table called Delivery with the following columns:

  • delivery_id: A unique identifier for each delivery.

  • customer_id: The ID of the customer making the order.

  • order_date: The date the customer placed the order.

  • customer_pref_delivery_date: The date the customer prefers to receive the delivery.

The key conditions for the problem:

  • An order is considered immediate if the order_date is the same as the customer_pref_delivery_date.

  • An order is considered scheduled if the order_date is different from the customer_pref_delivery_date.

  • The first order of a customer is defined as the order with the earliest order_date.

We need to calculate the percentage of immediate orders among the first orders placed by all customers, rounded to 2 decimal places.

Step-by-Step Approach

  1. Identify First Orders: The first order for each customer is the one with the earliest order date. This can be identified by using MIN(order_date) for each customer_id.

  2. Determine Immediate Orders: An order is considered immediate if the order_date is the same as the customer_pref_delivery_date. We can compare these two columns to determine whether an order is immediate or scheduled.

  3. Calculate Percentage of Immediate Orders: To calculate the percentage, we'll:

    • Count the number of immediate first orders.

    • Count the total number of first orders.

    • Divide the number of immediate first orders by the total number of first orders and multiply by 100 to get the percentage.

  4. Return the Result: We need to round the result to two decimal places, which can be done using the ROUND() function in SQL.

SQL Schema:

Create table If Not Exists Delivery (delivery_id int, customer_id int, order_date date, customer_pref_delivery_date date)
Truncate table Delivery
insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('1', '1', '2019-08-01', '2019-08-02')
insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('2', '2', '2019-08-02', '2019-08-02')
insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('3', '1', '2019-08-11', '2019-08-12')
insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('4', '3', '2019-08-24', '2019-08-24')
insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('5', '3', '2019-08-21', '2019-08-22')
insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('6', '2', '2019-08-11', '2019-08-13')
insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('7', '4', '2019-08-09', '2019-08-09')

SQL Query Solution

WITH FirstOrders AS (
    -- Identify the first order for each customer (earliest order_date)
    SELECT customer_id, 
           MIN(order_date) AS first_order_date
    FROM Delivery
    GROUP BY customer_id
),
ImmediateOrders AS (
    -- Identify immediate orders by comparing order_date with customer_pref_delivery_date
    SELECT d.customer_id, d.delivery_id, d.order_date, d.customer_pref_delivery_date
    FROM Delivery d
    JOIN FirstOrders f ON d.customer_id = f.customer_id
    WHERE d.order_date = f.first_order_date
)
SELECT
    ROUND(
        (SUM(CASE WHEN io.order_date = io.customer_pref_delivery_date THEN 1 ELSE 0 END) / 
         COUNT(io.delivery_id)) * 100, 2
    ) AS immediate_percentage
FROM ImmediateOrders io;

Explanation of the Query:

  1. FirstOrders CTE:

    • This Common Table Expression (CTE) identifies the first order for each customer by selecting the minimum order_date for each customer. This gives us the earliest order placed by each customer.
    SELECT customer_id, MIN(order_date) AS first_order_date
    FROM Delivery
    GROUP BY customer_id
  1. ImmediateOrders CTE:

    • In this CTE, we join the Delivery table with the FirstOrders CTE on customer_id to filter only the first orders for each customer. We then check whether the order_date matches the customer_pref_delivery_date to determine if the order is "immediate."
    SELECT d.customer_id, d.delivery_id, d.order_date, d.customer_pref_delivery_date
    FROM Delivery d
    JOIN FirstOrders f ON d.customer_id = f.customer_id
    WHERE d.order_date = f.first_order_date
  1. Main SELECT:

    • The main query calculates the percentage of immediate orders. We use a CASE statement to count the number of immediate orders (where order_date = customer_pref_delivery_date). The total number of immediate orders is divided by the total number of first orders, and the result is multiplied by 100 to get the percentage.

    • The result is rounded to 2 decimal places using the ROUND() function.

    ROUND(
         (SUM(CASE WHEN io.order_date = io.customer_pref_delivery_date THEN 1 ELSE 0 END) / 
          COUNT(io.delivery_id)) * 100, 2
     ) AS immediate_percentage

Sample Input and Output

Input:

delivery_idcustomer_idorder_datecustomer_pref_delivery_date
112019-08-012019-08-02
222019-08-022019-08-02
312019-08-112019-08-12
432019-08-242019-08-24
532019-08-212019-08-22
622019-08-112019-08-13
742019-08-092019-08-09

Output:

immediate_percentage
50.00

Explanation of Output:

  • Customer 1: The first order (ID 1) is scheduled (order date is different from preferred date).

  • Customer 2: The first order (ID 2) is immediate (order date equals preferred date).

  • Customer 3: The first order (ID 5) is scheduled (order date is different from preferred date).

  • Customer 4: The first order (ID 7) is immediate (order date equals preferred date).

Thus, 2 out of 4 customers (50%) have immediate first orders.

Key SQL Concepts Used:

  • Common Table Expressions (CTEs): We use CTEs to organize the query logic into manageable parts, making it easier to identify the first orders and immediate orders.

  • JOINs: A JOIN is used to filter only the first order for each customer from the Delivery table.

  • Conditional Aggregation: The CASE statement inside the SUM function helps in counting only the immediate orders.

  • Mathematical Calculations: We calculate the percentage by dividing the number of immediate orders by the total number of first orders and multiplying by 100.

  • Rounding: The ROUND() function is used to round the result to two decimal places.

I hope this article helps you in understanding how to approach SQL queries involving CTEs, aggregation, JOINS, ROUND and Mathematical calculations.

Happy querying! 🙌🚀