Let’s start with the query which was suggested by a Sports Analyst. Now and again, their analytics team must create a points table based on the full matches played between different teams.
And that’s why they ask this query in every data analyst job interview.
This can be a classic scenario of converting a table from a long-form (variety of rows > variety of columns ) to a large form (variety of columns > variety of rows). This can be called , which is a vital use-case of CASE WHEN in SQL.
On this scenario, you’ve got a table containing the names of the teams who played the match and the winner. It is advisable create a points table where you get details about what number of matches each team played, what number of matches they won, lost and what number of matches draw.
Let’s see find out how to tackle this type of query —
Here is the input table where each row belongs to a match between two teams and the column winner indicates which team won. The worth NULL within the winner column indicates that the match was a draw i.e. not one of the team won the match.
Let’s divide this query into the next sub-tasks.
- Find the full variety of matches each team won
- Find the full variety of matches each team lost
- Find the full variety of matches where not one of the teams won
- Find the full variety of matches each team played
To grasp what number of matches a team won, it’s essential to understand for every match which team won the match. You’ll be able to do that by comparing the team_1 and team_2 columns with the winner column.
So, for a selected row when the values in columns team_1 and winner are equal then team_1 is the winner.
You’ll be able to translate the exact same logic using CASE..WHEN..THEN
statement in SQL as shown below.
SELECT team_1
, team_2
, winner
, CASE WHEN team_1 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.teams
As shown within the above query, you’ll create a further column win_flag. When a team is a winner, you’ll assign the worth 1 to this column. Similarly, if the winner column is NULL, then you definitely’ll assign the worth 1 to the draw_flag column.
So the above query will create the next output for all of the teams within the column team_1.
Similarly, when the values within the columns team_2 and winner are equal then team_2 is the winner. So you possibly can write precisely the similar query for all of the teams in team_2
SELECT team_1
, team_2
, winner
, CASE WHEN team_2 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.teams
where you’ll get the next output for the values in team_2
Well, the above two queries are only on your understanding. In point of fact, you possibly can create a single for the win_flag and draw_flag of every team within the columns team_1 and team_2 as shown below.
WITH win_draw_flag AS
(
SELECT team_1 as team
, CASE WHEN team_1 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.teams
UNION ALL
SELECT team_2 as team
, CASE WHEN team_2 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.teams
)
It will create a CTE like this — I’ve shown this only on your understanding.
Remember you might be still within the long type of the table and also you now have information on whether or not each team wins the match.
Next, it’s essential to simply aggregate the columns to get the full variety of matches each team played, won, and lost. You’ll be able to do it as simply as the next query.
SELECT team
, COUNT(*) AS matches_played
, SUM(win_flag) AS matches_won
, COUNT(*) - SUM(win_flag) - SUM(draw_flag) AS matches_lost
, SUM(draw_flag) AS matches_draw
FROM win_draw_flag
GROUP BY team
ORDER BY team
Where COUNT(*)
gives you the full variety of times each team occurred within the CTE win_draw_flag and subtracting matches won & draw from it offers you the full variety of matches each team lost.
Without making a CTE individually, you can too write the query just like the following and pass all the CASE..WHEN query as a sub-query.
SELECT team
, COUNT(*) AS matches_played
, SUM(win_flag) AS matches_won
, COUNT(*) - SUM(win_flag) - SUM(draw_flag) AS matches_lost
, SUM(draw_flag) AS matches_draw
FROM
(
SELECT team_1 as team
, CASE WHEN team_1 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.teams
UNION ALL
SELECT team_2 as team
, CASE WHEN team_2 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.teams
) AS win_draw_flag
GROUP BY team
ORDER BY team
It is going to also lead to the exact same output as mentioned above.
Well, there might be multiple ways to resolve this query — I discovered this approach easier. When you find every other solution to this query, be at liberty to say it within the comments.
That is one in all the classic examples of comparing dates after which implementing If..Else
logic using CASE..WHEN
statement. You’ll be able to encounter any such problem in any company that deals with customers.
The scenario is — You may have an eCommerce website where every single day customers visit and buy products. Your task is to discover on every day how many purchasers were recent and the way many purchasers were repeated.
Here is an input table — orders — where you possibly can see the shoppers with customer_id ABC101, BCD201, and ABD101 visited the web site on multiple days and purchased different products.
Let’s break down the query into the next sub-tasks —
- Find the primary time i.e. the primary date when the client visited the web site
- Compare the primary date with the order date to determine whether the client is a repeated or first-time visitor
You’ll be able to easily solve the primary sub-task through the use of GROUP BY to group all of the records by customer_id and find the minimum of the order_date, as shown below.
SELECT customer_id
, MIN(order_date) as first_order_date
FROM analyticswithsuraj.orders
GROUP BY customer_id
This was easy!
Next, to match the first_order_date with each order_date, first, it’s essential to bring each columns in a single table.
You’ll be able to easily do this using JOIN
on customer_id as shown below. Here you possibly can create a CTE using the above query so that you’re going to get a brief table to affix with the input table.
WITH first_orders AS
(
SELECT customer_id
, MIN(order_date) as first_order_date
FROM analyticswithsuraj.orders
GROUP BY customer_id
)SELECT t1.*
, t2.first_order_date
FROM analyticswithsuraj.orders AS t1
INNER JOIN first_orders AS t2 ON t1.customer_id = t2.customer_id
Now, as you bought each the columns in a single table, you possibly can compare order_date with the first_order_date and implement the next If..Else
logic.
- If first_order_date & order_date are the identical then the client is the brand new customer
- If first_order_date & order_date are different then the client is repeated customer
So ideally, it’s essential to create two columns to implement the above two If..Else
statements using CASE WHEN in SQL.
You don’t must create any separate table, relatively you possibly can add two more columns within the above query where you joined two tables. Here is how it could possibly be done.
WITH first_orders AS
(
SELECT customer_id
, MIN(order_date) as first_order_date
FROM analyticswithsuraj.orders
GROUP BY customer_id
)SELECT t1.*
, t2.first_order_date
, CASE WHEN t1.order_date=t2.first_order_date THEN 1 ELSE 0 END AS new_customer_flag
, CASE WHEN t1.order_date!=t2.first_order_date THEN 1 ELSE 0 END AS repeat_customer_flag
FROM analyticswithsuraj.orders AS t1
INNER JOIN first_orders AS t2 ON t1.customer_id = t2.customer_id
In consequence, the column new_customer_flag might be 1 when the columns first_order_date and order_date are equal. Similarly, the column repeat_customer_flag might be 1 when the columns first_order_date and order_date are different.
Now the last step is just to group all of the records by order date and sum up the columns new_customer_flag and repeat_customer_flag.
To do that, you’ll need the above table which you possibly can achieve by creating one other CTE as shown below.
WITH first_orders AS
(
SELECT customer_id
, MIN(order_date) as first_order_date
FROM analyticswithsuraj.orders
GROUP BY customer_id
),customers AS
(
SELECT t1.*
, t2.first_order_date
, CASE WHEN t1.order_date=t2.first_order_date THEN 1 ELSE 0 END AS new_customer_flag
, CASE WHEN t1.order_date!=t2.first_order_date THEN 1 ELSE 0 END AS repeat_customer_flag
FROM analyticswithsuraj.orders AS t1
INNER JOIN first_orders AS t2 ON t1.customer_id = t2.customer_id
)
SELECT order_date
, SUM(new_customer_flag) AS number_of_new_customers
, SUM(repeat_customer_flag) AS number_of_repeat_customers
FROM customers
GROUP BY order_date
ORDER BY order_date
That is the way you’ll get the required output. You’ll be able to cross-check the outcomes by comparing them with the input table.
Again, you possibly can have a distinct approach to solving this query — that is the only approach I discovered. Don’t forget to say your approach within the comments below.
… [Trackback]
[…] Read More on on that Topic: bardai.ai/artificial-intelligence/2-vital-sql-case-when-examples-you-need-to-know-in-2023example-1-create-a-points-table-for-sport-tournamentexample-2-find-recent-and-repeated-customers-for-an-ecommerce-website/ […]
… [Trackback]
[…] Find More on that Topic: bardai.ai/artificial-intelligence/2-vital-sql-case-when-examples-you-need-to-know-in-2023example-1-create-a-points-table-for-sport-tournamentexample-2-find-recent-and-repeated-customers-for-an-ecommerce-website/ […]
… [Trackback]
[…] Read More to that Topic: bardai.ai/artificial-intelligence/2-vital-sql-case-when-examples-you-need-to-know-in-2023example-1-create-a-points-table-for-sport-tournamentexample-2-find-recent-and-repeated-customers-for-an-ecommerce-website/ […]
… [Trackback]
[…] Read More Information here on that Topic: bardai.ai/artificial-intelligence/2-vital-sql-case-when-examples-you-need-to-know-in-2023example-1-create-a-points-table-for-sport-tournamentexample-2-find-recent-and-repeated-customers-for-an-ecommerce-w…
… [Trackback]
[…] Find More Information here to that Topic: bardai.ai/artificial-intelligence/2-vital-sql-case-when-examples-you-need-to-know-in-2023example-1-create-a-points-table-for-sport-tournamentexample-2-find-recent-and-repeated-customers-for-an-ecommerce-w…