The Problem Statement
Imagine a scenario where you manage customer data for an e-commerce platform. Your goal is to determine:
How many customers are purchasing for the first time (new customers)?
How many customers are returning to make additional purchases (repeat customers)?
Concepts that will be covered while solving this problem:
Approach to complex query step by step.
Usage of CASE WHEN with SUM
Usage of common table expression (CTE)
Building Table
CREATE TABLE Query:
create table customer_orders (
order_id integer,
customer_id integer,
order_date date,
order_amount integer
);
Insert data into the table cutomer_orders:
insert into customer_orders values
(1,100,cast('2022-01-01' as date),2000),
(2,200,cast('2022-01-01' as date),2500),
(3,300,cast('2022-01-01' as date),2100),
(4,100,cast('2022-01-02' as date),2000),
(5,400,cast('2022-01-02' as date),2200),
(6,500,cast('2022-01-02' as date),2700),
(7,100,cast('2022-01-03' as date),3000),
(8,400,cast('2022-01-03' as date),1000),
(9,600,cast('2022-01-03' as date),3000);
Final INPUT Table:
Expected Output:
Solution:
Approach
If we know when a customer placed their first order, we can identify the earliest order date. We can comparing this first order date with subsequent order dates, we can ascertain whether the customer has placed previous orders or not. This can be achieved effectively using the aggregation function
MIN(order_date)
along withGROUP BY
.Now we need to check, if the order_date = first_visit_date then the customer arrived is for the first time. If the order_date ≠ first_visit_date then the cutomer arrived is an old customer.
In order to do that we will join the query used in the first step as a CTE and join back with the original table customer_orders ON customer_id
Use CASE statements in order to compare the order_date with the first_visit_date obtained from step 1
Aggregate the values by counting the new and repeat customers
Query using Two Comman Table Expression(CTE)
WITH
first_visit as
(select customer_id, MIN(order_date) as first_visit_date
from customer_orders
group by customer_id),
visit_flag as(
select c.*,f.first_visit_date,
CASE WHEN c.order_date=f.first_visit_date then 1 else 0 end as first_visit_flag,
CASE WHEN c.order_date!=f.first_visit_date then 1 else 0 end as repeat_visit_flag
from customer_orders c JOIN first_visit f
ON c.customer_id = f.customer_id)
select order_date,
sum(first_visit_flag) as no_of_new_customers,
sum(repeat_visit_flag) as no_of_repeat_customers
from visit_flag
group by order_date;
Query using One Comman Table Expression(CTE)
with
first_visit as
(select customer_id, min(order_date) as first_visit_date from customer_orders
group by customer_id)
select co.order_date,
SUM(CASE WHEN co.order_date=fv.first_visit_date then 1 else 0 end) as no_of_new_customers,
SUM(CASE WHEN co.order_date!=fv.first_visit_date then 1 else 0 end) as no_of_repeat_customers
from customer_orders co JOIN first_visit fv
ON co.customer_id = fv.customer_id
group by co.order_date;
Adding further usecase to the existing problem statement
Also find the amount generated by the new customers and from the repeat customers
Query for the above usecase:
with first_visit as
(select customer_id, min(order_date) as first_visit_date from customer_orders
group by customer_id)
select co.order_date,
SUM(CASE WHEN co.order_date=fv.first_visit_date then 1 else 0 end) as no_of_new_customers,
SUM(CASE WHEN co.order_date!=fv.first_visit_date then 1 else 0 end) as no_of_repeat_customers,
SUM(CASE WHEN co.order_date=fv.first_visit_date then co.order_amount else 0 end) as new_customers_amount,
SUM(CASE WHEN co.order_date!=fv.first_visit_date then co.order_amount else 0 end) as repeat_customers_amount
from customer_orders co JOIN first_visit fv
ON co.customer_id = fv.customer_id
group by co.order_date;
Output:
The key to SQL mastery lies in consistent practice and curiosity. As you dive deeper into SQL, challenge yourself with complex queries and optimize for efficiency. Stay tuned for Day 3 of SQL Interview Prep, where we’ll tackle even more exciting problems.
Happy querying! 🚀
Content reference: https://www.youtube.com/watch?v=MpAMjtvarrc&list=PLBTZqjSKn0IeKBQDjLmzisazhqQy4iGkb&index=2&ab_channel=AnkitBansal