Calculating the Fraction of Players Who Logged In Consecutively Using SQL
As user engagement becomes an increasingly important metric for online platforms, understanding how users interact with your application over time can provide valuable insights. In this blog, we’ll walk you through how to calculate the fraction of players who logged in on consecutive days, starting from their first login, using SQL.
Problem Overview
Let’s imagine you are managing a gaming platform that tracks player activities. You have a table, Activity
, that logs when a player logs into your platform. The structure of the table is as follows:
Column Name | Type |
player_id | int |
device_id | int |
event_date | date |
games_played | int |
The combination of
player_id
andevent_date
forms a unique record for each login.The objective is to calculate the fraction of players who logged in the day after their first login.
The result should be the number of such players divided by the total number of players who logged in, and the final output should be rounded to two decimal places.
SQL Schema:
Create table If Not Exists Activity (player_id int, device_id int, event_date date, games_played int)
Truncate table Activity
insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-01', '5')
insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-02', '6')
insert into Activity (player_id, device_id, event_date, games_played) values ('2', '3', '2017-06-25', '1')
insert into Activity (player_id, device_id, event_date, games_played) values ('3', '1', '2016-03-02', '0')
insert into Activity (player_id, device_id, event_date, games_played) values ('3', '4', '2018-07-03', '5')
Solution Outline in SQL:
We can approach this with a SQL query that performs the following:
Use
MIN
to find the first login date for each player.Use a
JOIN
orLEFT JOIN
to check if the player logged in on the day after their first login date.Use
COUNT
to determine how many players meet the criteria (logged in again the day after).Calculate the fraction and round it to 2 decimal places.
Here is the SQL query that implements this:
WITH FirstLogin AS (
-- Get the first login date for each player
SELECT player_id, MIN(event_date) AS first_login_date
FROM Activity
GROUP BY player_id
),
ConsecutiveLogin AS (
-- Get the players who logged in the day after their first login date
SELECT f.player_id
FROM FirstLogin f
JOIN Activity a ON f.player_id = a.player_id
WHERE a.event_date = DATE_ADD(f.first_login_date, INTERVAL 1 DAY)
)
-- Calculate the fraction
SELECT ROUND(COUNT(DISTINCT c.player_id) / COUNT(DISTINCT a.player_id), 2) AS fraction
FROM Activity a
LEFT JOIN ConsecutiveLogin c ON a.player_id = c.player_id;
SQL Solution Breakdown
Let’s dive into the solution step-by-step.
1. Identify the First Login Date for Each Player
The first step is to find out when each player logged in for the first time. We can do this using SQL's MIN()
function, which will return the earliest login date for each player.
Here’s how we can write this query:
WITH FirstLogin AS (
SELECT player_id, MIN(event_date) AS first_login_date
FROM Activity
GROUP BY player_id
)
Explanation:
MIN(event_date)
retrieves the earliest date on which each player logged in.GROUP BY player_id
groups the data by player so that we get the first login for each one.
For example, if Player 1 first logged in on 2016-03-01
, Player 2 on 2017-06-25
, and Player 3 on 2016-03-02
, the result of the FirstLogin
CTE would look like this:
player_id | first_login_date |
1 | 2016-03-01 |
2 | 2017-06-25 |
3 | 2016-03-02 |
2. Find Players Who Logged In on the Day After Their First Login
Now, for each player, we need to check if they logged in the day after their first login. To do this, we can join the FirstLogin
CTE with the Activity
table and check if the player's event_date
is one day after their first login date.
Here’s how you can do that:
ConsecutiveLogin AS (
SELECT f.player_id
FROM FirstLogin f
JOIN Activity a ON f.player_id = a.player_id
WHERE a.event_date = DATE_ADD(f.first_login_date, INTERVAL 1 DAY)
)
Explanation:
DATE_ADD(f.first_login_date, INTERVAL 1 DAY)
adds one day to the player's first login date.We then check if the player logged in on that day by matching
a.event_date
with the date one day after their first login.
The ConsecutiveLogin
CTE will contain only those players who logged in on the day after their first login.
3. Calculate the Fraction of Players Who Logged In Consecutively
Now that we know which players logged in consecutively, we can calculate the fraction of players who logged in consecutively. To do this:
Count the number of distinct players in
ConsecutiveLogin
(the players who logged in on consecutive days).Count the total number of distinct players in the
Activity
table (the total number of players who logged in at least once).
Here’s how you can calculate this fraction:
SELECT ROUND(COUNT(DISTINCT c.player_id) / COUNT(DISTINCT a.player_id), 2) AS fraction
FROM Activity a
LEFT JOIN ConsecutiveLogin c ON a.player_id = c.player_id;
Explanation:
COUNT(DISTINCT c.player_id)
counts the number of distinct players who logged in consecutively (i.e., players inConsecutiveLogin
).COUNT(DISTINCT a.player_id)
counts all distinct players who logged in at least once.The
LEFT JOIN
ensures we count all players, even those who didn't log in consecutively, which is essential for the denominator.
The ROUND()
function rounds the result to two decimal places as required.
Example Walkthrough
Let’s consider an example to understand the query better.
Given the following Activity
table:
player_id | device_id | event_date | games_played |
1 | 2 | 2016-03-01 | 5 |
1 | 2 | 2016-03-02 | 6 |
2 | 3 | 2017-06-25 | 1 |
3 | 1 | 2016-03-02 | 0 |
3 | 4 | 2018-07-03 | 5 |
FirstLogin CTE:
Player 1's first login is
2016-03-01
.Player 2's first login is
2017-06-25
.Player 3's first login is
2016-03-02
.
ConsecutiveLogin CTE:
Player 1 logged in again on
2016-03-02
, which is the day after their first login on2016-03-01
. So Player 1 is included inConsecutiveLogin
.Player 2 didn’t log in the next day, so they aren’t included.
Player 3 didn’t log in the next day, so they aren’t included either.
Final Calculation:
Only Player 1 logged in consecutively.
There are 3 distinct players in total.
The fraction is:
1 / 3 = 0.3331
So, the result is 0.33
, meaning 33% of the players logged in the day after their first login.
Conclusion
In this article, we’ve walked through how to calculate the fraction of players who logged in the day after their first login. By using SQL's WITH
clauses (CTEs), JOIN
operations, and date functions like DATE_ADD
, we can efficiently solve this problem and extract valuable insights about user behavior.
This method is highly applicable not just for gaming platforms, but for any application that tracks user activity and aims to measure engagement over time.
Happy querying, and feel free to reach out with any questions or feedback!